Reputation: 43
This is a part of big query that are generated dynamically depending on the on post params ...
there is a table user_params
id | user_id | param_id | param_value
--------------------------------------
1 | 5 | 2 | audi
--------------------------------------
2 | 5 | 3 | a4
--------------------------------------
3 | 5 | 4 | silver
--------------------------------------
4 | 7 | 2 | audi
--------------------------------------
5 | 7 | 3 | q3
--------------------------------------
6 | 8 | 2 | ford
--------------------------------------
7 | 8 | 3 | fiesta
--------------------------------------
I want to extract all 'user_id' where 'param_id' = 1 , 'param_value' = 'audi' , 'param_id' = 3 , 'param_value' = 'a4'
So in this instance my result should contain id
rows 1,2,4,5,7.
What is the best way considering that this table has more than 50K entries ?
It's an InnoDB table.
+++ Now I see there is a html form with multiselect fields for these table. "Give me all users (user_id) who drive audi (a4 or q3) or VW (passat) which are silver color".
I'm testing solutions that you've written.
Upvotes: 1
Views: 106
Reputation: 76395
Ok, I've actually gone ahead and created a test table based on what you've posted here as example data, here's the create statement:
CREATE TABLE `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) DEFAULT NULL,
`param_id` int(11) DEFAULT NULL,
`param_value` varchar(45) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Next, here's how I populated it with the data
INSERT INTO `test`
(
`user_id`,
`param_id`,
`param_value`
)
VALUES
(5,2, 'audi'),
(5,3, 'a4'),
(5,4, 'silver'),
(7,2, 'audi'),
(7,3, 'q3'),
(8,2, 'ford'),
(8,3, 'fiesta');
Then, here are the various queries I've tried:
/** using OR + DISTINCT **/
SELECT DISTINCT user_id
FROM test
WHERE (param_id = 1 AND param_value = 'audi')
OR (param_id = 3 AND param_value = 'a4')
OR (param_id = 4 AND param_value = 'silver')
/** using OR + GROUP BY **/
SELECT user_id
FROM test
WHERE (param_id = 1 AND param_value = 'audi')
OR (param_id = 3 AND param_value = 'a4')
OR (param_id = 4 AND param_value = 'silver')
GROUP BY user_id
/** Using tripple join **/
SELECT
t1.user_id
FROM
test AS t1,
test AS t2,
test AS t3
WHERE
t1.user_id = t2.user_id AND t2.user_id=t3.user_id AND
t1.param_id = 2 AND t1.param_value = 'audi' AND
t2.param_id = 3 AND t2.param_value = 'a4' AND
t3.param_id = 4 AND t3.param_value = 'silver'
/** JOIN with ON clause instead of WHERE **/
SELECT
t1.user_id
FROM
test AS t1
JOIN
test AS t2
ON t1.user_id = t2.user_id
JOIN
test AS t3
ON t3.user_id = t1.user_id
WHERE
t1.param_id = 2 AND t1.param_value = 'audi' AND
t2.param_id = 3 AND t2.param_value = 'a4' AND
t3.param_id = 4 AND t3.param_value = 'silver'
All of these queries produce the same result. However, the way the result is obtained is very different indeed, which shows in the EXPLAIN EXTENDED
output, which you can easily see for yourself now by running all of these queries, and prepending EXPLAIN EXTENDED
, because I'm just too lazy to paste all of the output here ATM
What you should see is that the JOIN
queries all show up as a join-type ALL
, meaning the entire table is scanned (3 times over), which the rows
count reflects, each JOIN affects all 7 rows (so 21 in total).
Compare that to the alternative, JOIN
-less queries, which only have to scan 7 rows, and I think it's fairly likely that joins probably aren't the answer in this case.
Upvotes: 1
Reputation: 15827
You have to perform multiple joins on the same table.
One way to do it is like this
SELECT
t1.user_id
FROM
thetable AS t1,
thetable AS t2,
thetable AS t3
WHERE
t1.user_id = t2.user_id AND t2.user_id=t3.user_id AND
t1.param_id = 1 AND t1.param_value = 'audi' AND
t2.param_id = 3 AND t2.param_value = 'a4' AND
t3.param_id = 4 AND t3.param_value = 'silver'
This is equivalent as using JOIN
statements, but it's just to give the idea.
Of course the query have to be extended or reduced depending on how many parameters your're checking (3 in the example case).
If all the fields are indexed the query should run quite fast on 50K records.
Upvotes: 2
Reputation: 263
If you want to speed things up you can ADD indexes to the affected columns. But actually no DBM should have a problem with 50k data sets. It's not that much.
Upvotes: 1
Reputation: 585
Please try below where condition in query
where param_id in (1,3) && param_value in ('audi','a4')
Upvotes: 0
Reputation: 319
You can't get any values if you're trying it with AND everytime. What it does is he tries to get a user_id with a param_id of 1 and has a value of audi and also a param_id of 3 and also a param_value of a4. You need to try it with OR and i dont see any param_id of 1 so make sure that one exists. If i understand your question correctly you need to do something like
SELECT user_id FROM tablename WHERE 'param_id' = 1 AND 'param_value' = 'audi' OR 'param_id' = 3 AND 'param_value' = 'a4'
Upvotes: 0
Reputation: 3081
Well, the term 'param_id' = 1 AND 'param_value' = 'audi' AND 'param_id' = 3 AND 'param_value' = 'a4'
would always be false, so I just assume you want to use (param_id = 1 AND param_value = 'audi') OR (param_id = 3 AND 'param_value' = 'a4')
In that case the query could be like: select distinct user_id where (param_id = 1 AND param_value = 'audi') OR (param_id = 3 AND 'param_value' = 'a4')
Upvotes: 0