Reputation: 9198
What is the best and optimised way to write a SQL query to search for a term with multiple columns in a table?
For example I have a table of products:
id | title | color_id
-------------------
1 | Dress | 1 (red)
2 | T-shirt | 3 (blue)
3 | Pants | 2 (green)
4 | Socks | 1 (red)
5 | Dress | 2 (green)
6 | Shoes | 2 (green)
7 | Pants | 3 (blue)
And a table of colors:
id | color
----------
1 | Red
2 | Green
3 | Blue
And if a user enters in the term Red dress, as a result (s)he must see Product
with id 1
, and if user enters simply Red, as a result (s)he must see Products
with id 1
and 4
.
Update: There may be some inputs like dress red
or red blue
too.
Real version of tables are more complex, but I tried to explain in a simplest way.
Upvotes: 4
Views: 350
Reputation: 49089
My idea is to use a query like this:
set @search = 'red dress';
SELECT *
FROM
products INNER JOIN colors
ON products.color_id = colors.id
WHERE
(FIND_IN_SET(title, REPLACE(@search, ' ', ','))>0)+
(FIND_IN_SET(color, REPLACE(@search, ' ', ','))>0) =
LENGTH(@search)-LENGTH(REPLACE(@search, ' ', ''))+1;
The two replaces inside FIND_IN_SET are used to make a list of properties separated by commas:
red,dress
I'm then checking if title
is in this set. If it is present, the value of:
(FIND_IN_SET(title, REPLACE(@search, ' ', ','))>0)
will be evaluated to 1, 0 otherwise. The same is for color
.
The number of the properties in the search string can be calculated as:
LENGTH(@search)-LENGTH(REPLACE(@search, ' ', ''))+1
(yes, it is a dirty trick!). If the number of the properties that matches is the same as the properties in the @search string, the row will be returned. Please notice that performances will be poor.
Fiddle is here.
Upvotes: 1
Reputation: 29809
A simple solution that just "works" could be:
SELECT *
FROM products
JOIN colors ON colors.id = products.color_id
WHERE
( title = "red" AND color = "dress" ) OR
( color = "red" AND title = "dress" )
This condition might perform better, if the optimiser is dumb enough not to notice it by itself:
WHERE
( title = "red" OR title = "dress") AND
( color = "red" OR color = "dress")
If you add more properties to the problem (besides "title" and "color"), and if and if storage is not an issue, you may want to merge (and duplicate) all textual properties into a single VARCHAR
column, and run a fulltext search on this column.
CREATE TABLE products_properties (
product_id INT NOT NULL PRIMARY KEY,
properties VARCHAR (255),
FOREIGN KEY fk_product (product_id) REFERENCES products(id),
FULLTEXT ftx_properties (properties)
);
The search becomes very simple:
SELECT products.*, colors.*
FROM products
JOIN colors ON colors.id = products.color_id
JOIN products_properties AS pp ON pp.product_id = products.id
WHERE MATCH(properties) AGAINST ("+red +dress")
This obviously makes no sense in this particular example, but the more properties, the more it will speed up the query. This denormalisation also comes at the cost of an increased complexity to maintain the products_properties
table.
Now the problem becomes really hairy if you
But it looks like this is outside of the scope of your question.
Upvotes: 2
Reputation: 60539
If I were doing this in SQL, I would typically break down the query into individual words before doing the query, then build the query dynamically based on how many words there were.
So for your example, the query might end up looking like this:
SELECT *
FROM products p
JOIN colors c ON c.id = p.color_id
WHERE
p.title LIKE '%red%' OR
c.color LIKE '%red%' OR
p.title LIKE '%dress%' OR
c.color LIKE '%dress%'
If you have lots of tables though, this can start to get pretty complicated. It's also not very efficient as its unlikely any indexes will be usable.
A better solution would be to use a dedicated text indexing product like Lucene (but that's a whole nother question..)
Upvotes: 1