Reputation: 10417
I've an existing site, whose DB is not designed correctly and contains lot of records, so we cant change DB structure.
Database for current issue mainly contains 4 tables, users, questions, options and answers. There is standard set of questions and options but for each user, there is one row in answers table for each set of question and options. DB structure and example data is available at SQL fiddle.
Now as a new requirement of advanced search, I need to find users by applying multiple search filters. Example input and expected output is given in comments on SQL Fiddle.
I tried to apply all type of joins, intersection but it always fail somehow. Can someone please help me to write correct query, preferably light weight/optimized joins as DB contain lot of records (10000+ users, 100+ questions, 500+ options and 500000+ records in answers table)?
EDIT: Based on two answers, I used following query
SELECT u.id, u.first_name, u.last_name
FROM users u
JOIN answers a ON a.user_id = u.id
WHERE (a.question_id = 1 AND a.option_id IN (3, 5))
OR (a.question_id = 2 AND a.option_id IN (8))
GROUP BY u.id, u.first_name, u.last_name
HAVING
SUM(CASE WHEN (a.question_id = 1 AND a.option_id IN (3, 5)) THEN 1 ELSE 0 END) >=1
AND SUM(CASE WHEN (a.question_id = 2 AND a.option_id IN (8)) THEN 1 ELSE 0 END) >= 1;
Please note: On real database, columns user_id
, question_id
and option_id
of answers
table are indexed.
Running query given on SQL Fiddle.
SQL Fiddle for dnoeth's answer.
SQL Foddle for calcinai's answer.
Upvotes: 1
Views: 273
Reputation: 60472
Add all you n filters into the WHERE using OR and repeat them in a HAVING(SUM(CASE)) using AND:
SELECT u.id, u.first_name, u.last_name
FROM users u JOIN answers a
ON a.user_id = u.id
JOIN questions q
ON a.question_id = q.id
JOIN question_options o
ON a.option_id = o.id
WHERE (q.question = 'Language known' AND o.OPTION IN ('French','Russian'))
OR (q.question = 'height' AND o.OPTION = '1.51 - 1.7')
GROUP BY u.id, u.first_name, u.last_name
HAVING
SUM(CASE WHEN (q.question = 'Language known' AND o.OPTION IN ('French','Russian')) THEN 1 ELSE 0 END) >=1
AND
SUM(CASE WHEN (q.question = 'height' AND o.OPTION = '1.51 - 1.7') THEN 1 ELSE 0 END) >= 1
;
I changed your joins into the more readable Standard SQL syntax.
Upvotes: 1
Reputation: 2617
This will require a bit of fiddling for a dynamic filter, but what you really want to do is search by the IDs, as it'll mean less joins and a faster query.
This produces the results you'd expect. I assume that the search filters are generated based off options in the database, so instead of passing the actual value back in to the query, pass the ID instead.
The multiple inner joins are to support multiple AND criteria and auto-reduce your result set.
SELECT * FROM users u
INNER JOIN answers a ON a.user_id = u.id
AND (a.question_id, a.option_id) IN ((1,3),(1,5)) # q 1: Lang, answer 3/5: En/Ru
INNER JOIN answers a2 ON a2.user_id = u.id
AND (a2.question_id, a2.option_id) = (2,8) # q 2: Height, answer 8: 1.71...
GROUP BY u.id;
I'd suggest making sure there's an index on (user_id, question_id, option_id) for searching:
ALTER TABLE `answers` ADD INDEX idx_search(`user_id`, `question_id`, `option_id`);
Otherwise it should be using primary keys for the joins (if properly defined) so it will be fast.
Upvotes: 1