Reputation: 6251
I'd like to build a site-wide search for a website where all the content (or at least the searchable content) is to be stored in a database. The best way I can think of doing this without getting extremely involved is as follows:
It wouldn't be too complicated to implement things like giving more value to results with the search terms in the page title, or allowing users to search for multi-word phrases by using quotes.
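As a rough illustration of those two refinements, here is a minimal Python sketch. It assumes the searchable rows have already been fetched from the database as title/body strings. The names `parse_query`, `score`, `TITLE_WEIGHT`, and `BODY_WEIGHT` are hypothetical, and the weights are arbitrary placeholders rather than recommended values.

```python
import re

# Hypothetical weights: a hit in the title counts three times as much as a hit in the body.
TITLE_WEIGHT = 3
BODY_WEIGHT = 1


def parse_query(query):
    """Split a query into lowercase terms; text inside double quotes stays one phrase."""
    terms = []
    # Each regex match is either a quoted phrase (group 1) or a single bare word (group 2).
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        terms.append((phrase or word).lower())
    return terms


def score(terms, title, body):
    """Sum weighted occurrence counts of every term in the title and the body."""
    title, body = title.lower(), body.lower()
    # str.count matches substrings, so "brown" also matches "browns" (a simplification).
    return sum(
        TITLE_WEIGHT * title.count(term) + BODY_WEIGHT * body.count(term)
        for term in terms
    )
```

For instance, `parse_query('brown "leather sofas"')` yields the single word `brown` plus the phrase `leather sofas` as one term, so the phrase only scores when the words appear together.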
Aside from performance considerations (limiting the results returned, caching, etc.), is there anything else I need to consider, or is there a better way of approaching this (other than embedding a Google Search box)?
Upvotes: 2
Views: 858
Reputation: 5894
Have you considered full-text search? It's not suitable in every case, but it can help with this sort of problem. For example, in MySQL:
SELECT *
FROM articles
WHERE MATCH (title, body)
AGAINST ('database' IN NATURAL LANGUAGE MODE);
Be sure to read the docs, though, because there are some interesting gotchas that catch new users. For example:
If you create a table and insert only one or two rows of text into it, every word in the text occurs in at least 50% of the rows. As a result, no search returns any results. Be sure to insert at least three rows, and preferably many more.
Upvotes: 0
Reputation: 964
I'm not sure what the threshold for "extremely involved" is, but I would probably search first for matches that contain all of the search terms, and only then apply the method you described.
Consider two pieces of content that would be returned as separate results for a search such as "brown leather sofas":
Result 1:
____ brown ____ ____ _____ ____ brown ____ ____ ______ ___ brown _____ ____ brown
Result 2:
brown leather sofas _____ _____ ______ ____ _____.
Obviously we would want result 2 to be the top result; however, your method would assign more "points" to result 1.
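One way to sketch that ordering in Python is shown below. It assumes the question's method awards one point per occurrence of each search term, which is an assumption since the method itself is not spelled out here. The function name `rank` is hypothetical.

```python
def rank(terms, documents):
    """Order documents so those containing every term come first,
    then by total occurrence count (the assumed per-term "points")."""
    def sort_key(doc):
        text = doc.lower()
        counts = [text.count(term.lower()) for term in terms]
        contains_all = all(count > 0 for count in counts)
        # Tuples compare element by element, so contains_all outranks the raw point total.
        return (contains_all, sum(counts))

    return sorted(documents, key=sort_key, reverse=True)
```

With the terms `brown`, `leather`, and `sofas`, a document that mentions "brown" four times but never the other two words sorts below one that contains each word once, because the all-terms flag is compared before the point total.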
Upvotes: 1