Reputation: 161
I am working on an information retrieval system using MySQL's with the natural language mode. The data I have is annotated to considering different categories. Eg. Monkey, cat, dog will be annotated as 'animals' whereas duck, sparrow as 'birds'. The problem is that I am retrieving documents based on the occurrences of these tags.
Now MySQL has a limitation that if a particular term comes in more than 50% in the entire data that term is not considered. Considering my requirement I want it to score all the matching terms even if a particular term comes more than 50% in the entire data.
I have read few things about combination of Sphinx with MySQL for search efficiency but I am not sure whether this could be applied for my situation.
Please provide a solution for this problem
Upvotes: 1
Views: 395
Reputation: 2027
Sphinx is very good at very fast fulltext search. It doesn't have the 50% rule that mySQL has, but you will need to use it in place of mySQL's fulltext search. Basically what you do is install Sphinx and set up an import to copy all your mySQL data into Sphinx. Then you can build SphinxSE or query Sphinx directly through a library to get your results. You can then get the details of your results by querying mySQL.
I use SphinxSE because you can query Sphinx through mySQL and join your mySQL table to the results in a single query. It's quite nice.
Upvotes: 1