Anonymous

Reputation: 6251

Basic site-wide search methodology?

I'd like to build a site-wide search for a website where all the content (or at least the searchable content) is to be stored in a database. The best way I can think of doing this without getting extremely involved is as follows:

  1. User enters a search query, e.g. "brown leather sofas".
  2. Split the query into an array of terms.
  3. Search the database (MySQL) using LIKE '%$val%' for each element of the array.
  4. Load the results into an array, then give each result +1 point for each search term found in its content.
  5. If results match the same number of terms, order them by the number of views each page has had, as an indicator of popularity.
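
The steps above can be sketched roughly as follows. This is a minimal in-memory illustration in Python, not the asker's implementation: the `PAGES` list, its field names, and the `search` function are hypothetical stand-ins for the database table, and a plain substring test plays the role of `LIKE '%term%'`. It counts each distinct term once, which is only one possible reading of step 4.

```python
# Hypothetical stand-in for the content table; field names are assumptions.
PAGES = [
    {"title": "Sofas", "content": "Brown leather sofas in stock", "views": 120},
    {"title": "Chairs", "content": "Leather chairs and brown stools", "views": 300},
    {"title": "Rugs", "content": "Brown wool rugs", "views": 50},
    {"title": "Lamps", "content": "Brass floor lamps", "views": 900},
]


def search(query, pages):
    """Rank pages by how many query terms they contain, then by views."""
    terms = query.lower().split()  # step 2: split the query into terms
    results = []
    for page in pages:
        text = page["content"].lower()
        # steps 3-4: substring test (like LIKE '%term%'), +1 per term found
        score = sum(1 for term in terms if term in text)
        if score > 0:
            results.append((score, page["views"], page))
    # step 5: highest score first, ties broken by view count
    results.sort(key=lambda r: (r[0], r[1]), reverse=True)
    return [page for _, _, page in results]


ranked = search("brown leather sofas", PAGES)
titles = [p["title"] for p in ranked]
```

In a real application the substring test would be pushed into the database query rather than done over every row in application code; the sketch only shows the scoring and ordering logic.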

It wouldn't be too complicated to implement things like giving more value to results with the search terms in the page title, or allowing users to search for multi-word phrases by using quotes.
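
As a rough sketch of those two refinements, the snippet below keeps double-quoted phrases together when splitting the query and adds extra points for terms found in the title. The function names and the `title_weight=3` default are assumptions chosen for illustration, not values from the question.

```python
import shlex


def parse_query(query):
    """Split a query into terms, keeping double-quoted phrases together."""
    return [term.lower() for term in shlex.split(query)]


def weighted_score(terms, title, content, title_weight=3):
    """+1 per term found in the content, plus title_weight per term in the title."""
    title, content = title.lower(), content.lower()
    score = 0
    for term in terms:
        if term in content:
            score += 1
        if term in title:
            score += title_weight
    return score


terms = parse_query('"brown leather" sofas')
score = weighted_score(terms, "Leather sofas", "A brown leather sofa range")
```

Note that `shlex.split` follows shell quoting rules, so an unbalanced quote or a stray apostrophe (as in "don't") raises `ValueError`; real user input would need handling for that case.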

Aside from performance considerations (limiting the results returned, caching, etc.), is there anything else I need to consider, or a better way of approaching this (other than embedding a Google Search box)?

Upvotes: 2

Views: 858

Answers (2)

Cylindric

Reputation: 5894

Have you considered MySQL's full-text search? It's not suitable in every case, but it can help with this sort of problem.

SELECT * 
FROM articles
WHERE MATCH (title, body)
AGAINST ('database' IN NATURAL LANGUAGE MODE);

Be sure to read the docs, though, because there are some gotchas that catch new users. For example:

If you create a table and insert only one or two rows of text into it, every word in the text occurs in at least 50% of the rows. As a result, no search returns any results. Be sure to insert at least three rows, and preferably many more.
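
To see why one or two rows trip this rule, here is a toy model of the threshold as quoted above. It is not how MySQL implements full-text search; `words_over_threshold` is a hypothetical helper that just computes which words appear in at least half of the rows, and the actual behavior depends on the storage engine and settings described in the docs.

```python
def words_over_threshold(rows, threshold=0.5):
    """Return the words that appear in at least `threshold` of the rows."""
    doc_freq = {}
    for row in rows:
        # Count each word once per row, however often it repeats in that row.
        for word in set(row.lower().split()):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    return {w for w, n in doc_freq.items() if n / len(rows) >= threshold}


two_rows = ["brown leather sofas", "red fabric chairs"]
three_rows = two_rows + ["oak dining tables"]

ignored_in_two = words_over_threshold(two_rows)
ignored_in_three = words_over_threshold(three_rows)
```

With two rows, every word sits in exactly half of them and so crosses the threshold; with three rows, each word appears in only a third of them and none does.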

Upvotes: 0

Tom A

Reputation: 964

Not sure what the threshold for being extremely involved is, but I would probably search for matches that contain the entire array of strings first, then invoke the method you described.

Consider two pieces of content that would be returned as separate results:

Result 1:

____ brown ____ ____ _____ ____ brown ____ ____ ______ ___ brown _____ ____ brown

Result 2:

brown leather sofas _____ _____ ______ ____ _____.

Obviously we would want result 2 to be the top result; however, your method would assign more "points" to result 1.
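
A minimal sketch of that two-stage ordering, assuming the per-term method awards a point for every occurrence of a term: documents containing the whole phrase sort first, and the per-term score only orders documents within each group. The function names and sample strings are made up for illustration.

```python
def occurrence_score(terms, text):
    """Naive scoring: +1 for every occurrence of any search term."""
    return sum(text.count(term) for term in terms)


def rank(query, documents):
    """Put documents containing the whole phrase first, then sort by naive score."""
    phrase = query.lower()
    terms = phrase.split()

    def sort_key(doc):
        text = doc.lower()
        # Tuples compare element by element, so a phrase match outranks any score.
        return (phrase in text, occurrence_score(terms, text))

    return sorted(documents, key=sort_key, reverse=True)


result_1 = "big brown dog with a brown coat and brown eyes, all brown"
result_2 = "brown leather sofas on sale this week."

# Per-term scoring alone puts result_1 first (4 points against 3).
naive = sorted(
    [result_1, result_2],
    key=lambda d: occurrence_score(["brown", "leather", "sofas"], d.lower()),
    reverse=True,
)
# Checking for the full phrase first puts result_2 on top.
ranked = rank("brown leather sofas", [result_1, result_2])
```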

Upvotes: 1
