Reputation: 295
i have a database table containing events.
mysql> describe events;
+-------------+------------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------------------+----------------+
| device | varchar(32) | YES | MUL | NULL | |
| psu | varchar(32) | YES | MUL | NULL | |
| event | varchar(32) | YES | MUL | NULL | |
| down_time | timestamp | NO | MUL | CURRENT_TIMESTAMP | |
| up_time | timestamp | NO | MUL | 0000-00-00 00:00:00 | |
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
+-------------+------------------+------+-----+---------------------+----------------+
6 rows in set (0.01 sec)
i want to find events that overlap in time and use the following query:
SELECT *
FROM link_events a
JOIN link_events b
ON ( a.down_time <= b.up_time )
AND ( a.up_time >= b.down_time )
WHERE (a.device = 'd1' AND b.device = 'd2')
AND (a.psu = 'p1' AND b.psu = 'p2')
AND (a.event = 'e1' AND b.event = 'e2');
+-------------+-----------+------------+---------------------+---------------------+--------+-------------+-----------+------------+---------------------+---------------------+--------+
| device | psu | event | down_time | up_time | id | device | psu | event | down_time | up_time | id |
+-------------+-----------+------------+---------------------+---------------------+--------+-------------+-----------+------------+---------------------+---------------------+--------+
| d1 | p1 | e1 | 2013-01-14 16:42:10 | 2013-01-14 16:43:00 | 374529 | d2 | p2 | e2 | 2013-01-14 16:42:14 | 2013-01-14 16:42:18 | 211570 |
| d1 | p1 | e1 | 2013-05-29 18:49:26 | 2013-05-30 12:31:15 | 374569 | d2 | p2 | e2 | 2013-05-30 08:48:20 | 2013-05-30 08:48:27 | 211787 |
| d1 | p1 | e1 | 2013-05-29 18:49:26 | 2013-05-30 12:31:15 | 374569 | d2 | p2 | e2 | 2013-05-30 08:48:54 | 2013-05-30 08:48:58 | 211788 |
+-------------+-----------+------------+---------------------+---------------------+--------+-------------+-----------+------------+---------------------+---------------------+--------+
3 rows in set (35.88 sec)
The events table contains the following number of rows:
mysql> select count(*) from events;
+----------+
| count(*) |
+----------+
| 977759 |
+----------+
1 row in set (0.01 sec)
mysql> select count(*) from events where device = 'd1' and psu = 'p1' and event = 'e1';
+----------+
| count(*) |
+----------+
| 11397 |
+----------+
1 row in set (0.12 sec)
mysql> select count(*) from events where device = 'd2' and psu = 'p2' and event = 'e2';
+----------+
| count(*) |
+----------+
| 243 |
+----------+
1 row in set (0.00 sec)
The database is installed on Windows 7 laptop and uses MyISAM engine. Is there a way to better organise the database or change indexing to improve query time which for first query is 35 secs. Repeating the same query gives an immediate result however if i 'flush tables' and repeat query a third time the time taken is again 35 secs. Any help appreciated !
Here is output from EXPLAIN after ADD KEY:
mysql> EXPLAIN
-> SELECT *
->
-> FROM link_events a
-> JOIN link_events b
->
-> ON ( a.down_time <= b.up_time )
-> AND ( a.up_time >= b.down_time )
->
-> WHERE (a.device = 'd1' AND b.device = 'd2')
-> AND (a.psu = 'l1' AND b.psu = 'l2')
-> AND (a.event = 'e1' AND b.event = 'e2');
+----+-------------+-------+------+--------------------------------------------------------------------------------+---------------+---------+-------------------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+--------------------------------------------------------------------------------+---------------+---------+-------------------+------+-----------------------+
| 1 | SIMPLE | b | ref | device,psu,event,down_time,up_time,device_2,device_3 | device_2 | 297 | const,const,const | 180 | Using index condition |
| 1 | SIMPLE | a | ref | device,psu,event,down_time,up_time,device_2,device_3 | device_2 | 297 | const,const,const | 7744 | Using index condition |
+----+-------------+-------+------+--------------------------------------------------------------------------------+---------------+---------+-------------------+------+-----------------------+
2 rows in set (0.07 sec)
New column:
mysql> describe link_events;
+-------------+------------------+------+-----+---------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+------------------+------+-----+---------------------+-----------------------------+
| device_name | varchar(32) | YES | MUL | NULL | |
| link_name | varchar(32) | YES | MUL | NULL | |
| event_type | varchar(32) | YES | MUL | NULL | |
| down_time | timestamp | NO | MUL | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| up_time | timestamp | NO | MUL | 0000-00-00 00:00:00 | |
| span | geometry | NO | MUL | NULL | |
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
+-------------+------------------+------+-----+---------------------+-----------------------------+
7 rows in set (0.03 sec)
EXPLAIN:
mysql> EXPLAIN
->
-> SELECT
->
-> CONCAT('Link1','-', 'Link2') overlaps,
-> GREATEST(a.down_time,b.down_time) AS downtime,
-> LEAST(a.up_time,b.up_time) AS uptime,
-> TIME_TO_SEC(TIMEDIFF( LEAST(a.up_time,b.up_time),
-> GREATEST(a.down_time,b.down_time))) AS duration
->
-> FROM link_events a
-> JOIN link_events b
->
-> ON Intersects (a.span, b.span)
->
-> WHERE (a.device_name = 'd1' AND b.device_name = 'd2')
-> AND (a.link_name = 'l1' AND b.link_name = 'l2')
-> AND (a.event_type = 'e1' AND b.event_type = 'e1');
+----+-------------+-------+------+-------------------------------------------------------------------+---------------+---------+-------------------+-------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------------------------------------------------------+---------------+---------+-------------------+-------+------------------------------------+
| 1 | SIMPLE | a | ref | span,device_name,link_name,event_type,device_name_2,device_name_3 | device_name_2 | 297 | const,const,const | 383 | Using index condition |
| 1 | SIMPLE | b | ref | span,device_name,link_name,event_type,device_name_2,device_name_3 | device_name_2 | 297 | const,const,const | 14580 | Using index condition; Using where |
+----+-------------+-------+------+-------------------------------------------------------------------+---------------+---------+-------------------+-------+------------------------------------+
2 rows in set (0.09 sec)
Using Intersects takes 1min 12 secs?
Upvotes: 1
Views: 73
Reputation: 5336
Try:
ALTER TABLE link_events ADD KEY(device,psu,event,up_time),
ADD KEY(device,psu,event,down_time)
Hopefully this will be selective enough. If this does not help, post the results of EXPLAIN
so we can make sure the optimizer is doing the best it can, and we will go from there if needed.
Edit:
It is important to understand that not all indexes are of equal value for a particular query. A common mistake is to think of an index as some magic worker that will automatically speed up the query if you just reference the column in the index. This is not quite the case. The keys need to be designed and the queries needs to be written in such a way that allows the best possible access path to the records. Changing something that might appear insignificant such as the order of the columns in the index or writing SQRT(x) = 4.4
instead of x = 4.4 * 4.4
could make the index unusable and slow the query down by a factor of a thousand or even a million or more.
I highly recommend reading this:
http://dev.mysql.com/doc/refman/5.7/en/mysql-indexes.html
Having a feel for how MySQL uses keys can save you a lot of trouble in the future.
EDIT 2 - another idea is to add a column span GEOMETRY NOT NULL, SPATIAL KEY (span)
containing linestring(point(up_time,0),point(down_time,0))
- times would need to be numeric (you can convert using UNIX_TIMESTAMP()
for example) - and use Intersects(a.span,b.span)
in the query. With some fine tuning this has the potential of being much faster than even the improved query because span intersections are being detected using a geometry-based algorithm specially designed for such things.
Upvotes: 0
Reputation: 1271003
For this query:
SELECT *
FROM link_events a JOIN
link_events b
ON (a.down_time <= b.up_time) AND (a.up_time >= b.down_time)
WHERE (a.device = 'd1' AND b.device = 'd2') AND
(a.psu = 'p1' AND b.psu = 'p2') AND
(a.event = 'e1' AND b.event = 'e2');
You want indexes on link_events(device, psu, event, up_time, down_time)
. For clarity, I would express the query more like this:
SELECT *
FROM link_events a JOIN
link_events b
ON (a.down_time <= b.up_time) AND (a.up_time >= b.down_time)
WHERE (a.device, a.psu, a.event) IN (('d1', 'p1', 'e1')) AND
(b.device, a.psu, a.event) IN (('d2', 'p2', 'e2'));
Upvotes: 4