Update:Remora/Search Revamp
Latest revision as of 15:03, 21 May 2008
Problem Definition
The current search is slow and tends to bring the database to its knees. It also does not answer queries in the most useful way (see the bugs listed below for details).
3.4.3
- bug 378657, Exact Name hits should sort first
- bug 419057, Add support for querying by os and app version to search and recommended calls
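As a rough illustration of what bug 378657 asks for, the sketch below moves exact name matches to the front of a result list while leaving the rest in their original order. The function name, the result dictionaries, and the `name` and `score` fields are hypothetical; this is not the existing search code.

```python
def sort_exact_name_first(results, query):
    """Return results with exact (case-insensitive) name matches first."""
    needle = query.strip().casefold()
    # sorted() is stable, so the original ranking is preserved inside
    # each group (exact matches, then everything else).
    return sorted(results, key=lambda r: r["name"].strip().casefold() != needle)


# Hypothetical results, already ordered by some relevance score.
results = [
    {"name": "Firebug Helper", "score": 0.9},
    {"name": "Firebug", "score": 0.7},
    {"name": "Fire Tools", "score": 0.5},
]
ranked = sort_exact_name_first(results, "firebug")
```

A stable sort is used here so that the fix only promotes exact hits and does not otherwise disturb whatever ordering the search already produces.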
3.4
- bug 400986, Optimize search performance
- bug 433741, Search translation in current locale + en-US only
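One possible reading of bug 433741 is that only translations in the user's current locale, plus en-US as a fallback, should be searched. The sketch below shows that filter on an in-memory list; the function name and the `locale` and `text` fields are assumptions made for illustration.

```python
def searchable_translations(translations, current_locale, fallback="en-US"):
    """Keep only translations in the current locale or the fallback locale."""
    wanted = {current_locale, fallback}
    return [t for t in translations if t["locale"] in wanted]


# Hypothetical translation rows for one add-on.
rows = [
    {"locale": "de", "text": "Werbeblocker"},
    {"locale": "en-US", "text": "Ad blocker"},
    {"locale": "fr", "text": "Bloqueur de publicité"},
]
subset = searchable_translations(rows, "de")
```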
3.x triaged
- bug 401849, Implement special search intersession for these terms
Plan of attack
Fix bug 378657, bug 419057, and bug 401849 now, in 3.4.3, with some possibly moving to 3.4.4 if there is a 3.4.4. Bug 433741 may also be fixed, depending on how easy it is and on the timing of the full-text search (FTS) solution.
Implement an FTS engine to solve bug 400986 and make bug 433741 redundant in the long term.
FTS Options
Sphinx
http://www.sphinxsearch.com/
Sphinx can build a full-text index (FTI) over existing InnoDB tables, among other data sources, and has a built-in PHP API.
One limitation is that Sphinx only does case folding for English and Russian.
Zend_Search_Lucene
http://framework.zend.com/manual/en/zend.search.lucene.html
I'd recommend this over straight Lucene (http://lucene.apache.org/) as it's a PHP-native implementation with a PHP API and therefore matches the team skillset a little better (rather than having to code in Java and use the infamously buggy PHP/Java bridge).
A significant problem is referenced here: http://framework.zend.com/manual/en/zend.search.lucene.index-creation.html. The manual notes that, according to the PHP documentation, "flock() will not work on NFS and many other networked file systems", and warns: "Do not use networked file systems with Zend_Search_Lucene."
This seems to rule out Zend_Search_Lucene (ZSL).
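To make the locking concern concrete, here is a minimal sketch of the kind of advisory lock that `flock()` provides, using Python's `fcntl.flock` on a Unix-like system. It is not ZSL's code; the lock file name and helper function are hypothetical. On a local filesystem the second attempt is expected to be refused, and that guarantee is what the quoted documentation says cannot be relied on over NFS.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(path):
    """Try a non-blocking exclusive flock on path; return the fd, or None."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Another open file description already holds the lock.
        os.close(fd)
        return None
    return fd


# Hypothetical lock file standing in for an index lock.
lock_path = os.path.join(tempfile.mkdtemp(), "index.lock")
first = try_exclusive_lock(lock_path)   # takes the lock
second = try_exclusive_lock(lock_path)  # should be refused while first is held
```

If the lock silently failed to exclude the second caller, two writers could modify the same index files at once, which is why a shared networked filesystem would be a problem for this approach.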
MySQL native FTS
Requires the tables to be MyISAM, which is not ideal but is a possibility.
Recommendations
- Fix the smaller bugs in 3.4.3 and 3.4.4
- Implement Sphinx before the final release of Fx3