Reputation: 25872
I need to use InnoDB storage engine on a table with about 1mil or so records in it at any given time. It has records being inserted to it at a very fast rate, which are then dropped within a few days, maybe a week. The ping table has about a million rows, whereas the website table only about 10,000.
My statement is this:
select url
from website ws, ping pi
where ws.idproxy = pi.idproxy and pi.entrytime > curdate() - 3 and contentping+tcpping is not null
group by url
having sum(contentping+tcpping)/(count(*)-count(errortype)) < 500 and count(*) > 3 and
count(errortype)/count(*) < .15
order by sum(contentping+tcpping)/(count(*)-count(errortype)) asc;
I added an index on entrytime, yet no dice. Can anyone throw me a bone as to what I should consider to look into for basic optimization of this query. The result set is only like 200 rows, so I'm not getting killed there.
Upvotes: 0
Views: 319
Reputation: 32309
In the absence of the schemas of the relations, I'll have to make some guesses.
If you're making WHERE a.attrname = b.attrname
clauses, that cries out for a JOIN
instead.
Using COUNT(*)
is both redundant and sometimes less efficient than COUNT(some_specific_attribute)
. The primary key is a good candidate.
Why would you test contentping+tcpping IS NOT NULL
, asking for a calculation that appears unnecessary, instead of just testing whether the attributes individually are null?
Here's my attempt at an improvement:
SELECT url
FROM website AS ws
JOIN ping AS pi
ON ws.idproxy = pi.idproxy
WHERE
pi.entrytime > CURDATE() - 3
AND pi.contentping IS NOT NULL
AND pi.tcpping IS NOT NULL
GROUP BY url
HAVING
SUM(pi.contentping + pi.tcpping) / (COUNT(pi.idproxy) - COUNT(pi.errortype)) < 500
AND COUNT(pi.idproxy) > 3
AND COUNT(pi.errortype) / COUNT(pi.idproxy) < 0.15
ORDER BY
SUM(pi.contentping + pi.tcpping) / (COUNT(pi.idproxy) - COUNT(pi.errortype)) ASC;
Performing lots of identical calculations in both the HAVING and ORDER BY clauses will likely be costing you performance. You could either put them in the SELECT clause, or create a view that has those calculations as attributes and use that view for accessing the values.
Upvotes: 2