Divebo Diving
Divebo Diving

Reputation: 43

Multiple query on same table

This is a part of big query that are generated dynamically depending on the on post params ...

there is a table user_params

id |  user_id | param_id | param_value 
--------------------------------------
1  | 5        |  2       | audi
--------------------------------------
2  | 5        |  3       | a4
--------------------------------------
3  | 5        |  4       | silver
--------------------------------------
4  | 7        |  2       | audi
--------------------------------------
5  | 7        |  3       | q3
--------------------------------------
6  | 8        |  2       | ford
--------------------------------------
7  | 8        |  3       | fiesta
--------------------------------------

I want to extract all 'user_id' where 'param_id' = 1 , 'param_value' = 'audi' , 'param_id' = 3 , 'param_value' = 'a4'

So in this instance my result should contain id rows 1,2,4,5,7.

What is the best way considering that this table has more than 50K entries ?

It's an InnoDB table.

+++ Now I see there is a html form with multiselect fields for these table. "Give me all users (user_id) who drive audi (a4 or q3) or VW (passat) which are silver color".

I'm testing solutions that you've written.

Upvotes: 1

Views: 106

Answers (6)

Elias Van Ootegem
Elias Van Ootegem

Reputation: 76395

Ok, I've actually gone ahead and created a test table based on what you've posted here as example data, here's the create statement:

CREATE TABLE `test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` int(11) DEFAULT NULL,
  `param_id` int(11) DEFAULT NULL,
  `param_value` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Next, here's how I populated it with the data

INSERT INTO `test`
(
  `user_id`,
  `param_id`,
  `param_value`
)
VALUES
(5,2, 'audi'),
(5,3, 'a4'),
(5,4, 'silver'),
(7,2, 'audi'),
(7,3, 'q3'),
(8,2, 'ford'),
(8,3, 'fiesta');

Then, here are the various queries I've tried:

/** using OR + DISTINCT **/
SELECT DISTINCT user_id
FROM test
WHERE (param_id = 1 AND param_value = 'audi')
   OR (param_id = 3 AND param_value = 'a4')
   OR (param_id = 4 AND param_value = 'silver')

/** using OR + GROUP BY **/
SELECT user_id
FROM test
WHERE (param_id = 1 AND param_value = 'audi')
   OR (param_id = 3 AND param_value = 'a4')
   OR (param_id = 4 AND param_value = 'silver')
GROUP BY user_id


/** Using tripple join **/
SELECT
    t1.user_id
FROM
    test AS t1,
    test AS t2,
    test AS t3
WHERE
    t1.user_id = t2.user_id AND t2.user_id=t3.user_id AND
    t1.param_id = 2 AND t1.param_value = 'audi' AND
    t2.param_id = 3 AND t2.param_value = 'a4' AND
    t3.param_id = 4 AND t3.param_value = 'silver'

/** JOIN with ON clause instead of WHERE **/
SELECT
    t1.user_id
FROM
test AS t1
JOIN
test AS t2
    ON t1.user_id = t2.user_id
JOIN
test AS t3
    ON t3.user_id = t1.user_id
WHERE
    t1.param_id = 2 AND t1.param_value = 'audi' AND
    t2.param_id = 3 AND t2.param_value = 'a4' AND
    t3.param_id = 4 AND t3.param_value = 'silver'

All of these queries produce the same result. However, the way the result is obtained is very different indeed, which shows in the EXPLAIN EXTENDED output, which you can easily see for yourself now by running all of these queries, and prepending EXPLAIN EXTENDED, because I'm just too lazy to paste all of the output here ATM

What you should see is that the JOIN queries all show up as a join-type ALL, meaning the entire table is scanned (3 times over), which the rows count reflects, each JOIN affects all 7 rows (so 21 in total).
Compare that to the alternative, JOIN-less queries, which only have to scan 7 rows, and I think it's fairly likely that joins probably aren't the answer in this case.

Upvotes: 1

Paolo
Paolo

Reputation: 15827

You have to perform multiple joins on the same table.

One way to do it is like this

SELECT
    t1.user_id
FROM
    thetable AS t1,
    thetable AS t2,
    thetable AS t3
WHERE
    t1.user_id = t2.user_id AND t2.user_id=t3.user_id AND
    t1.param_id = 1 AND t1.param_value = 'audi' AND
    t2.param_id = 3 AND t2.param_value = 'a4' AND
    t3.param_id = 4 AND t3.param_value = 'silver'

This is equivalent as using JOIN statements, but it's just to give the idea.

Of course the query have to be extended or reduced depending on how many parameters your're checking (3 in the example case).

If all the fields are indexed the query should run quite fast on 50K records.

Upvotes: 2

Dave
Dave

Reputation: 263

If you want to speed things up you can ADD indexes to the affected columns. But actually no DBM should have a problem with 50k data sets. It's not that much.

Upvotes: 1

Harsh Chunara
Harsh Chunara

Reputation: 585

Please try below where condition in query

where param_id in (1,3) && param_value in ('audi','a4')

Upvotes: 0

Puya Sarmidani
Puya Sarmidani

Reputation: 319

You can't get any values if you're trying it with AND everytime. What it does is he tries to get a user_id with a param_id of 1 and has a value of audi and also a param_id of 3 and also a param_value of a4. You need to try it with OR and i dont see any param_id of 1 so make sure that one exists. If i understand your question correctly you need to do something like

SELECT user_id FROM tablename WHERE 'param_id' = 1 AND 'param_value' = 'audi' OR 'param_id' = 3 AND 'param_value' = 'a4'

Upvotes: 0

Gregor Raýman
Gregor Raýman

Reputation: 3081

Well, the term 'param_id' = 1 AND 'param_value' = 'audi' AND 'param_id' = 3 AND 'param_value' = 'a4' would always be false, so I just assume you want to use (param_id = 1 AND param_value = 'audi') OR (param_id = 3 AND 'param_value' = 'a4')

In that case the query could be like: select distinct user_id where (param_id = 1 AND param_value = 'audi') OR (param_id = 3 AND 'param_value' = 'a4')

Upvotes: 0

Related Questions