jarjar
jarjar

Reputation: 29

Optimize query?

My query took 28.39 seconds to run. How can I optimize it?

explain SELECT distinct UNIX_TIMESTAMP(timestamp)*1000 as timestamp,count(a.sig_name) as counter from event a,network n  where n.fsi='pays' and n.net=inet_ntoa(a.ip_src) group by date(timestamp) order by timestamp asc;
+----+-------------+-------+--------+---------------+---------+---------+--- ---+---------+---------------------------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref  | rows    | Extra                           |
+----+-------------+-------+--------+---------------+---------+---------+------+---------+---------------------------------+
|  1 | SIMPLE      | a     | ALL    | NULL          | NULL    | NULL    | NULL | 8177074 | Using temporary; Using filesort |
|  1 | SIMPLE      | n     | eq_ref | PRIMARY,fsi   | PRIMARY | 77      | func |       1 | Using where                     |
+----+-------------+-------+--------+---------------+---------+---------+------+---------+---------------------------------+

Upvotes: 0

Views: 100

Answers (2)

Imran Zahoor
Imran Zahoor

Reputation: 2787

Make it a practice to introduce at-least index on columns which can be used in WHERE/JOIN clauses. I've used the at-least because in many cases one should try to use PRIMARY/FOREIGN KEY relations. So if something is already a primary/foriegn key there is no need to index it further.

The above query can be simply improved by introducing the INDEX through the following query:

ALTER TABLE events ADD INDEX idx_ev_ipsrc (ip_src);

Here idx_ev_ipsrc = Name of the index key, and ip_src is the column to be indexed.

Even further enhancement:

Introduce multi-colum index on network table using following query:

ALTER TABLE network ADD INDEX idx_net_fsi_net (fsi,net);

The above will result in even low number of rows.

Note: The above queries are for MySql and can be tailored for other DBs easily.

Upvotes: 1

Trent Lloyd
Trent Lloyd

Reputation: 1892

So generally looking at your query, we find that table event a is examining 8,177,074 rows. That is likely the "root" of the slowness, so we want to look at how to reduce the search space using indexes.

The main condition on event a is

n.net=inet_ntoa(a.ip_src)

The problem here is that we need to perform a calculation (inet_ntoa) on every row of a.ip_src, so there is no alternative but to scan the entire table. A potentially better solution would be to invert the comparison and ensure that a.ip_src is indexed.

a.ip_src=inet_aton(n.net)

This will only be better if we are matching less rows in n than we are in a. If that is not the case, you should seriously consider caching the result of this function in the table and creating an index on that.

Lastly I am guessing the timestamp column is in event a, in which case an index will potentially help with ordering and grouping though may not. You could try a multi_column index on (ip_src,timestamp)

Upvotes: 3

Related Questions