Oto Shavadze
Oto Shavadze

Reputation: 42863

2 different small query vs 1 query with subquery

I have table like this

name       | personal_number 
-----------------------------------------
Jon        | 222
Alex       | 555
Jon        | 222
Jimmy      | 999

I need get every name, which personal_number repeates in table more than 1, that is result must be:

 Jon        
 Jon        

So, Variant 1):

SELECT  name  FROM mytable WHERE personal_number IN (
        SELECT  personal_number  FROM mytable  GROUP BY personal_number
        HAVING COUNT(*) > 1
)

Variant 2):

SELECT  personal_number  FROM mytable  GROUP BY personal_number
        HAVING COUNT(*) > 1
)

Then, using php, retrieved personal_numbers join as string (soemthing like this '222', '222' ) and run other query

SELECT  name FROM mytable  WHERE personal_number IN( here joined string )

Variant 2 works approximately 10 times faster, than variant 1, this is surprise for me, I was thinking that one query will be faster, but...

(In table is 500 000 rows, column personal_number not indexed)

So, what you mean about cases like this? why variant 2 is many faster than variant 1 ?

Upvotes: 17

Views: 5096

Answers (4)

Michael Sivolobov
Michael Sivolobov

Reputation: 13340

First query has heavy subquery. You must avoid this. The best solution for your problem is only one query:

SELECT name FROM mytable GROUP BY personal_number HAVING COUNT(*) > 1;

This query will return you each repeated name only once. If you want to display the name of the duplicate as many times as they met you must use next query:

SELECT name, COUNT(*) AS count FROM mytable GROUP BY personal_number HAVING COUNT(*) > 1;

And then in PHP do something like this:

foreach ($rows as $row) {
  for ($i = 0; $i++; $i < $row['count']) {
    echo $row['name'] . "\n";
  }
}

Upvotes: 1

Bart
Bart

Reputation: 17371

It seems that subqueries are very slow as mentioned in this article http://www.mysqlperformanceblog.com/2010/10/25/mysql-limitations-part-3-subqueries.

You should try to avoid having subqueries and use joining instead.

Upvotes: 5

palindrom
palindrom

Reputation: 19111

This should be quicker:

SELECT  name  FROM mytable join (
        SELECT  personal_number  FROM mytable  GROUP BY personal_number
        HAVING COUNT(*) > 1
)a using (personel_number)

Edit: If this is faster than variant 1, then it means at variant 1 mysql reproduces the inner table for each record again and again.

Upvotes: 0

Anvesh
Anvesh

Reputation: 101

Since indexing is not done so the 1 is slow,as it has to match personal_numbers from selected personal_numbers. If indexing is done it consumes less time than earlier. Variant 2 is a direct query hence its faster.

Upvotes: 0

Related Questions