Reputation: 322
SELECT *
FROM grants
INNER JOIN people on grants.volID=people.vol_id
INNER JOIN org on grants.orgID=org.orgid
order by yearStart DESC
I have this ^ join, which runs just fine on its own. But once I open the result set and start looping through it, I run a second query for each row that pulls a count and date info from another table:
SELECT COUNT(Distinct Event_ID) as ME, MAX(Sample_Date) as MaxD
FROM results where orgid=%d
I need the orgID from the first pull for the second query, which is why I am going through the rows one at a time,
so it runs like this:
// Query 1
while ($row = mysql_fetch_assoc($result)) {
    // Query 2, using this row's orgID
    while ($row1 = mysql_fetch_assoc($result1)) {
        // get some data from query 2
    } // close 2
    // get some data from query 1 and merge with query 2
} // close 1
It runs through about 230 records in orgs very fast as long as the secondary query isn't pushed inside the loop. With it, the whole thing slows to close to 20 seconds! Have I not built the COUNT(DISTINCT) correctly? The results table has about 100,000 records, but I move through it with other queries and it doesn't bog down like this! Would a subquery help, and if so, how do I write it?
Thanks for any insight.
Upvotes: 0
Views: 141
Reputation: 4879
The first thing you should do to figure out performance bottlenecks in your queries is to use your database's EXPLAIN feature, which tells you how it executes them: https://dev.mysql.com/doc/refman/5.0/en/explain.html
It sounds like you may not have the right indexes set up, causing unnecessary rows to be scanned on every iteration of your loop over the first query's results. Here is an example of how to check:
First I have a test table
mysql> desc test_table;
+-------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(64) | YES | | NULL | |
| description | text | YES | | NULL | |
| published | datetime | YES | | NULL | |
| updated | datetime | YES | | NULL | |
| status | tinyint(1) | YES | | NULL | |
+-------------+-------------+------+-----+---------+----------------+
6 rows in set (0.02 sec)
mysql> show indexes from test_table;
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| test_table | 0 | PRIMARY | 1 | id | A | 0 | NULL | NULL | | BTREE | | |
+------------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
1 row in set (0.01 sec)
mysql> select count(1) from test_table;
+----------+
| count(1) |
+----------+
| 0 |
+----------+
1 row in set (0.02 sec)
Next I add a few rows
mysql> INSERT INTO test_table (name, description, published, status) VALUES ('name1','description 1 goes here',now(),1),('name2','description 2 goes here',now(),1),('name3', 'description 3 goes here', now(),1);
Query OK, 3 rows affected (0.02 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select name, description from test_table where status = 1;
+-------+-------------------------+
| name | description |
+-------+-------------------------+
| name1 | description 1 goes here |
| name2 | description 2 goes here |
| name3 | description 3 goes here |
+-------+-------------------------+
3 rows in set (0.01 sec)
Next I use the EXPLAIN feature in the database to analyze my query
mysql> EXPLAIN SELECT name, description, status FROM test_table WHERE name = 'name1' AND status = 1;
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | test_table | ALL | NULL | NULL | NULL | NULL | 3 | Using where |
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
You can see that it is scanning 3 rows to find the record. I suspect your database is scanning all 100K rows in results for the second query, once for every row you iterate through. With the roughly 230 rows from your first query, that is about 23 million row scans (230 * 100K). You want the rows column as close to 1 as possible, meaning the index is used to look up the rows directly, which is much faster.
I now create an index covering the columns I expect in my WHERE clause, listed in the order I will filter on them (note that, because of the leftmost-prefix rule, not all of them need to appear in every query)
mysql> CREATE INDEX idx_so_example ON test_table (name, description(255), status);
Query OK, 0 rows affected (0.04 sec)
Records: 0 Duplicates: 0 Warnings: 0
Next I run that EXPLAIN again and see that the database now uses the index and only 1 row was scanned. You should optimize your indexes to achieve a similar result.
mysql> EXPLAIN SELECT name, description, status FROM test_table WHERE name = 'name1' AND status = 1;
+----+-------------+------------+------+----------------+----------------+---------+-------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+----------------+----------------+---------+-------+------+-----------------------+
| 1 | SIMPLE | test_table | ref | idx_so_example | idx_so_example | 195 | const | 1 | Using index condition |
+----+-------------+------------+------+----------------+----------------+---------+-------+------+-----------------------+
1 row in set (0.01 sec)
For your database I would add a compound index on the three columns in your second query, leading with orgid since that is the column in your WHERE clause (and assuming 'results' is an actual table name based on your question).
CREATE INDEX idx_some_name ON results (orgid, Event_ID, Sample_Date);
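You can check whether the index is actually being used by running EXPLAIN on the second query from the question (42 below is a placeholder orgid; substitute a real value):

```sql
-- 42 is a placeholder orgid for illustration
EXPLAIN SELECT COUNT(DISTINCT Event_ID) AS ME, MAX(Sample_Date) AS MaxD
FROM results
WHERE orgid = 42;
```

If the key column of the output shows the index and rows is far below 100K, the lookup is no longer scanning the whole table.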
Also a suggestion: keep your naming convention consistent across columns, or your database becomes a nightmare to memorize and code against. Pick one standard and stick with it: EventId, SampleDate, OrgId is fine, and so is event_id, sample_date, org_id, but standardize all column names so you hit fewer errors later when querying the data.
Upvotes: 1
Reputation: 7663
Assuming results.orgid is indexed to rule out that problem...
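If you are not sure whether that index exists, you can check and, if needed, create it (the index name below is arbitrary; table and column names are taken from the question):

```sql
-- List existing indexes on the results table
SHOW INDEXES FROM results;

-- If orgid is not covered, add an index on it
CREATE INDEX idx_results_orgid ON results (orgid);
```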
Things usually go much better if you use JOINs, which MySQL can optimize; sub-queries can perform badly.
Try this out if I understand your relationships correctly:
SELECT grants.*, org.*, COUNT(DISTINCT results.Event_ID) as ME, MAX(results.Sample_Date) as MaxD
FROM grants
INNER JOIN people on grants.volID=people.vol_id
INNER JOIN org on grants.orgID=org.orgid
LEFT JOIN results ON results.orgid=org.orgid
GROUP BY grants.grantid #whatever your grants PK is
ORDER BY yearStart DESC
Don't forget to replace grants.grantid
with your actual grants PK column.
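Since the question also asks how to subquery this: an alternative sketch (reusing the table and column names from the question) pre-aggregates results per orgid in a derived table. This computes each COUNT(DISTINCT) once per org instead of once per output row, and avoids grouping by non-aggregated columns:

```sql
SELECT grants.*, org.*, agg.ME, agg.MaxD
FROM grants
INNER JOIN people ON grants.volID = people.vol_id
INNER JOIN org ON grants.orgID = org.orgid
LEFT JOIN (
    -- One row per orgid: distinct event count and latest sample date
    SELECT orgid,
           COUNT(DISTINCT Event_ID) AS ME,
           MAX(Sample_Date) AS MaxD
    FROM results
    GROUP BY orgid
) AS agg ON agg.orgid = org.orgid
ORDER BY yearStart DESC;
```

Orgs with no rows in results will come back with NULL for ME and MaxD because of the LEFT JOIN; wrap them in COALESCE(..., 0) if you prefer zeros.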
Upvotes: 1