Update:Remora/Search Revamp

Latest revision as of 15:03, 21 May 2008

Problem Definition

The current search is slow and has a way of bringing the database to its knees. It also does not answer queries in the most useful way (see bugs for details).

Search rewrite related bugs

3.4.3

  • bug 378657, Exact Name hits should sort first
  • bug 419057, Add support for querying by os and app version to search and recommended calls

3.4

  • bug 400986, Optimize search performance
  • bug 433741, Search translation in current locale + en-US only

3.x triaged

  • bug 401849, Implement special search intersession for these terms


Plan of attack

Implement bug 378657, bug 419057, and bug 401849 now, in 3.4.3, with some possibly slipping to 3.4.4 if there is a 3.4.4. Possibly implement bug 433741 as well, depending on how easy it is and on the timing of the FTS (full-text search) solution.
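As a rough illustration of what bug 378657 asks for, the sketch below moves exact name hits to the front of a result list while leaving the remaining order untouched. It is written in Python for brevity rather than in the site's own code, and the function name, the `name` field, and the sample data are all hypothetical.

```python
def sort_exact_name_first(results, query):
    """Return results with exact (case-insensitive) name matches first.

    sorted() is stable, so non-matching results keep their original
    relative order. The "name" key is an assumed field, not the real schema.
    """
    needle = query.strip().casefold()
    # False sorts before True, so exact matches move to the front.
    return sorted(results, key=lambda r: r["name"].strip().casefold() != needle)


if __name__ == "__main__":
    addons = [{"name": "Firebug Extras"}, {"name": "firebug"}, {"name": "Fire"}]
    for addon in sort_exact_name_first(addons, "Firebug"):
        print(addon["name"])
```

A real fix would more likely express this ordering in the search query itself; the sketch only pins down the intended result order.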

Implement an FTS engine to solve bug 400986 and, in the long term, make bug 433741 redundant.

FTS Options

Sphinx

http://www.sphinxsearch.com/

Sphinx can build a full-text index (FTI) over existing InnoDB tables, among other data sources, and has a built-in PHP API.

One limitation is that Sphinx only does case folding on English and Russian.

Zend_Search_Lucene

http://framework.zend.com/manual/en/zend.search.lucene.html

I'd recommend this over straight Lucene (http://lucene.apache.org/) because it is a PHP-native implementation with a PHP API, and therefore matches the team's skill set a little better than coding in Java and using the infamously buggy PHP/Java bridge.

The Zend manual page on index creation (http://framework.zend.com/manual/en/zend.search.lucene.index-creation.html) references a significant problem: according to the PHP documentation, "flock() will not work on NFS and many other networked file systems", so the manual warns, "Do not use networked file systems with Zend_Search_Lucene."

This seems to rule out Zend_Search_Lucene (ZSL).
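To make the locking concern concrete, here is a minimal sketch of the kind of advisory file lock that `flock()` provides. It is a Python analogue using the standard `fcntl` module (Unix only), not Zend_Search_Lucene's actual code, and the helper name and lock file are hypothetical. On a local file system a second locker is refused while the first holds the lock; the quoted warning says this behaviour cannot be relied on over NFS.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Try to take a non-blocking exclusive advisory lock on path.

    Returns the open file object (which holds the lock until closed),
    or None if another holder already has the lock.
    """
    handle = open(path, "a+")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # The lock is held elsewhere (or locking is unsupported here).
        handle.close()
        return None
    return handle


if __name__ == "__main__":
    # Hypothetical lock file in a local temp directory.
    lock_path = os.path.join(tempfile.mkdtemp(), "index.lock")
    writer = try_exclusive_lock(lock_path)
    print("first locker got the lock:", writer is not None)
    print("second locker got the lock:", try_exclusive_lock(lock_path) is not None)
    writer.close()  # closing the file releases the lock
```

If the lock silently fails to exclude other writers, as the warning suggests can happen on networked file systems, two processes could modify the same index at once.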

MySQL native FTS

MySQL's native full-text search requires the tables to be MyISAM, which is not ideal but is a possibility.

Recommendations

  • Fix the smaller bugs in 3.4.3 and 3.4.4
  • Implement Sphinx before the final release of Fx3