Sphinx (search engine)

Sphinx
Developer(s): Sphinx Technologies Inc
Initial release: 2001
Stable release: 2.0.6 (October 2012)
Written in: C++
Operating system: Linux, Windows, Solaris, FreeBSD, NetBSD, Mac OS, AIX
Type: Search engine
License: GPLv2 or proprietary[1]
Website: http://sphinxsearch.com/

Sphinx is a free software search engine designed with indexing database content in mind. By design, Sphinx databases can be gracefully integrated with SQL databases.

Sphinx can be used in any of the following ways:

  • as a stand-alone server (like other DBMSs);
  • in communication with other DBMSs;
  • through SphinxSE, a storage engine for MySQL and its forks. MariaDB is distributed with SphinxSE.

If Sphinx runs as a stand-alone server, an application can connect to it through SphinxAPI. Official implementations of the API are available for PHP, Java, Perl, Ruby and Python. All of them are distributed with Sphinx.

Other data sources can be indexed through a pipe in a custom XML format. Sphinx is distributed under the terms of the GNU General Public License version 2 or a proprietary license.[1]
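As an illustration of the XML pipe, the following is a minimal sketch of a script that writes documents to standard output for the indexer to read. It assumes the xmlpipe2 layout (a `sphinx:docset` root, a `sphinx:schema` block, and one `sphinx:document` element per record); the function name `build_docset` and the sample field and attribute names are hypothetical, chosen only for this example.

```python
from xml.sax.saxutils import escape, quoteattr


def build_docset(fields, attrs, documents):
    """Return an XML document stream in the assumed xmlpipe2 layout.

    fields    -- list of full-text field names
    attrs     -- dict mapping attribute name to attribute type (e.g. "int")
    documents -- iterable of (doc_id, {element_name: value}) pairs;
                 element names are assumed to be valid XML names
    """
    lines = ['<?xml version="1.0" encoding="utf-8"?>', "<sphinx:docset>"]

    # Declare the schema once, before any document.
    lines.append("<sphinx:schema>")
    for name in fields:
        lines.append(f"<sphinx:field name={quoteattr(name)}/>")
    for name, attr_type in attrs.items():
        lines.append(
            f"<sphinx:attr name={quoteattr(name)} type={quoteattr(attr_type)}/>"
        )
    lines.append("</sphinx:schema>")

    # Emit one element per document; text content is XML-escaped.
    for doc_id, values in documents:
        lines.append(f'<sphinx:document id="{int(doc_id)}">')
        for key, value in values.items():
            lines.append(f"<{key}>{escape(str(value))}</{key}>")
        lines.append("</sphinx:document>")

    lines.append("</sphinx:docset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample data, printed to stdout so it can be piped.
    sample = [(1, {"title": "Tom & Jerry", "year": 1940})]
    print(build_docset(["title"], {"year": "int"}, sample), end="")
```

Building the stream from escaped strings keeps the `sphinx:` prefixes exactly as written, which a namespace-aware XML library might otherwise rewrite.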

Starting from version 0.9.9, querying is possible using SphinxQL, a subset of SQL. Starting from version 1.10-beta, both incremental (via the Real-Time backend[2]) and batch indexing are supported.
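Because SphinxQL is served over the MySQL protocol, an ordinary MySQL client library can in principle issue the queries. The sketch below makes several assumptions that are not taken from this article: the third-party `pymysql` package as the client, port 9306 as the SphinxQL listener, and the helper names `build_match_query` and `search`. Whether a given client library negotiates cleanly with a particular server version should be verified locally.

```python
import re

# Index names are interpolated into the query text, so restrict them
# to plain identifiers.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_match_query(index, limit=20):
    """Return a SphinxQL full-text query with a %s placeholder for the search text."""
    if not _IDENTIFIER.fullmatch(index):
        raise ValueError(f"invalid index name: {index!r}")
    limit = int(limit)
    if limit <= 0:
        raise ValueError("limit must be positive")
    return f"SELECT id FROM {index} WHERE MATCH(%s) LIMIT {limit}"


def search(index, text, host="127.0.0.1", port=9306, limit=20):
    """Run the query and return matching document ids.

    Assumes pymysql is installed and a server accepts MySQL-protocol
    connections on the given port (9306 is an assumption, not a given).
    """
    import pymysql  # third-party client, imported lazily

    conn = pymysql.connect(host=host, port=port, user="")
    try:
        with conn.cursor() as cur:
            # The client library escapes the search text before sending it.
            cur.execute(build_match_query(index, limit), (text,))
            return [row[0] for row in cur.fetchall()]
    finally:
        conn.close()
```

A call such as `search("articles", "search engine")` would return the ids of matching documents in a hypothetical index named `articles`.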

More than 400 web sites and services stated that they use Sphinx, including Craigslist.org.[3]

Features

  • Batch and incremental (soft real-time) full-text indexing.
  • Support for non-text attributes (scalars, strings, sets).
  • Direct indexing of SQL databases. Native support for MySQL, MariaDB, PostgreSQL, MSSQL, plus ODBC connectivity.
  • XML document indexing support.
  • Distributed searching support out of the box.
  • Integration via access APIs.
  • SQL-like syntax support via the MySQL protocol (since 0.9.9).
  • Full-text searching syntax.
  • Database-like result set processing.
  • Relevance ranking utilizing additional factors besides standard BM25.
  • Text processing support for SBCS and UTF-8 encodings, stopwords, indexing of selected words without their positions ("hitless" words), stemming, word forms, tokenizing exceptions, and "blended characters" (indexed both as a real character and as a word separator).
  • Support for user-defined functions (UDFs) (since 2.0.1).
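For readers unfamiliar with the BM25 baseline named in the ranking item above, here is a minimal sketch of the textbook Okapi BM25 formula. It is not Sphinx's internal ranker; the parameter defaults `k1=1.2` and `b=0.75` are common textbook choices and are assumptions here, as is the function name `bm25_scores`.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each tokenized document against the query with textbook BM25.

    documents is a list of token lists; the result has one score per document.
    """
    n_docs = len(documents)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in documents) / n_docs

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Rare terms weigh more; the "1 +" keeps the weight positive.
            idf = math.log(
                1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5)
            )
            # Term frequency saturates and is normalized by document length.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        ["sphinx", "search", "engine"],
        ["database", "engine"],
        ["sphinx", "sphinx", "index"],
    ]
    print(bm25_scores(["sphinx"], docs))
```

A ranker that uses "additional factors" would combine a score like this with other signals, such as how closely the matched words follow the query phrase.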

Performance and scalability

  • Indexing speed of up to 10-15 MB/sec per core and HDD.
  • Searching speed of over 500 queries/sec against a 1,000,000-document collection on a 2-core desktop system with 2 GB of RAM.[4]
  • The biggest known installation using Sphinx, Boardreader.com, indexes 16 billion documents.
  • The busiest known installation, Craigslist, is rumored to serve over 200,000,000 queries/day.[5]
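To put the figures above on a common scale, the following sketch converts the daily query count into an average per-second rate and estimates batch indexing time from the quoted throughput. The helper names are hypothetical, and the results are averages only: real traffic peaks well above the daily mean.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_queries_per_second(queries_per_day):
    """Average rate implied by a daily total, ignoring peaks and troughs."""
    return queries_per_day / SECONDS_PER_DAY


def seconds_to_index(size_mb, mb_per_second):
    """Time to index size_mb of text on one core at a steady throughput."""
    return size_mb / mb_per_second


if __name__ == "__main__":
    # 200,000,000 queries/day is about 2,315 queries/sec on average.
    print(round(average_queries_per_second(200_000_000)))
    # 1,024 MB at 10 to 15 MB/sec takes roughly 68 to 102 seconds.
    print(round(seconds_to_index(1024, 15)), round(seconds_to_index(1024, 10)))
```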
