Writing faster select queries on large databases

Question

I have two tables, one with about 1,000 rows and one with 700,000 rows; table1 and table2 respectively. I wrote a simple select query:

SELECT DISTINCT name1
FROM table1, table2
WHERE table1.name1 = table2.name2;

The query got me exactly what I want but it took 91 seconds! I tried this sub query out:

SELECT DISTINCT name1
FROM table1
WHERE table1.name1 IN(SELECT DISTINCT name2 FROM table2);

That query took a consistent 37 seconds. So there's some performance boost in the way you write select queries. I wrote a third query:

CREATE TEMPORARY TABLE IF NOT EXISTS t1qry
    (SELECT DISTINCT table1.name1 FROM table1);

CREATE TEMPORARY TABLE IF NOT EXISTS t2qry
    (SELECT DISTINCT table2.name2 FROM table2);

SELECT name2 FROM t2qry JOIN t1qry ON name1 = name2;

DROP TABLE t1qry, t2qry;

This last query took 0.4 seconds to run and produced identical results to the other two.

I knew that each 'select distinct' query took less than a second to run so I was trying to craft one that would find common distinct values between the table. My question is why does what I wrote work? How do I write a faster select query like this without creating temporary tables?

I've been using MySQL and MariaDB but I'll take any SQL related help here. I'm new to SQL and have been trying to learn as much as I can, so I'll take any pointers or info about this.

Writing faster select queries on large databases

Answers (1)

Related Questions