Reputation: 1275
I have two tables, one with about 1,000 rows and one with 700,000 rows; table1 and table2 respectively. I wrote a simple select query:
SELECT DISTINCT name1
FROM table1, table2
WHERE table1.name1 = table2.name2;
The query got me exactly what I want but it took 91 seconds! I tried this sub query out:
SELECT DISTINCT name1
FROM table1
WHERE table1.name1 IN(SELECT DISTINCT name2 FROM table2);
That query took a consistent 37 seconds. So there's some performance boost in the way you write select queries. I wrote a third query:
CREATE TEMPORARY TABLE IF NOT EXISTS t1qry
(SELECT DISTINCT table1.name1 FROM table1);
CREATE TEMPORARY TABLE IF NOT EXISTS t2qry
(SELECT DISTINCT table2.name2 FROM table2);
SELECT name2 FROM t2qry JOIN t1qry ON name1 = name2;
DROP TABLE t1qry, t2qry;
This last query took 0.4 seconds to run and produced identical results to the other two.
I knew that each 'select distinct' query took less than a second to run so I was trying to craft one that would find common distinct values between the table. My question is why does what I wrote work? How do I write a faster select query like this without creating temporary tables?
I've been using MySQL and MariaDB but I'll take any SQL related help here. I'm new to SQL and have been trying to learn as much as I can, so I'll take any pointers or info about this.
Upvotes: 0
Views: 519
Reputation: 1269883
If your latter version (including creating the temporary tables) goes so fast, then you probably have indexes on nameX
in both tables.
I would suggests using exists
:
SELECT DISTINCT name1
FROM table1
WHERE EXISTS (SELECT 1 FROM table2 WHERE table1.name1 = table2.name2);
For these queries, you do want indexes on table1(name1)
and table2(name2)
.
And, if table1
has no duplicates, then leave out the DISTINCT
.
Upvotes: 1