Reputation: 417
Sorry for all the text. The most important stuff is in the last 3 paragraphs :D
Recently we had a MySQL problem on one of our client servers. Out of the blue, something started sending the CPU usage of the mysql process sky-high. That led us to finding and optimizing bad queries, and here is where my problem starts.
I was thinking that optimization means speeding up queries (reducing the total time a query needs to execute). But after I optimized several queries toward that goal, my colleague started complaining that some queries read too many rows, all the rows in the table (as shown by EXPLAIN).
After rewriting a query I noticed that if I want the query to read fewer rows, its speed suffers; if the query is tuned for speed, more rows are read.
And that didn't make sense to me: fewer rows read, but a longer execution time.
And that made me wonder what should be done. Of course, it would be perfect to have a fast query that also reads the fewest rows. But since that doesn't seem possible in my case, I'm searching for some answers. Which approach should I take: speed, or fewer rows read? What are the pros & cons when a query is fast but reads more rows, and when fewer rows are read but speed suffers? What happens on the server in each case?
After googling, all I could find were articles and discussions about how to improve speed, but none of them covered the different cases I mentioned above.
I'm looking forward to seeing even personal choices, of course with some reasoning. Links that could point me in the right direction are welcome too.
Upvotes: 0
Views: 2134
Reputation: 562891
The count of rows read reported by EXPLAIN is an estimate anyway -- don't take it as a literal value. Notice that if you run EXPLAIN on the same query multiple times, the number of rows read changes each time. This estimate can even be totally inaccurate, as there have been bugs in EXPLAIN from time to time.
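For example, against a hypothetical orders table:

    -- The "rows" column is the optimizer's estimate from index statistics,
    -- not an exact count; it can drift between runs as statistics are resampled.
    EXPLAIN SELECT * FROM orders WHERE customer_id = 123;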
Another way to measure query performance is to run SHOW SESSION STATUS LIKE 'Handler%' as you test the query. This gives you accurate counts of how many times the SQL layer requested individual rows from the storage engine layer. For examples, see my presentation, SQL Query Patterns, Optimized.
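A minimal way to take that measurement looks like this (the SELECT is just a hypothetical stand-in for the query you're testing):

    FLUSH STATUS;  -- reset the session's status counters
    SELECT * FROM orders WHERE customer_id = 123;  -- the query under test
    SHOW SESSION STATUS LIKE 'Handler%';
    -- Handler_read_key / Handler_read_next count index lookups and range scans;
    -- a large Handler_read_rnd_next generally indicates a full table scan.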
There's also the issue of whether the rows requested were already in the buffer pool (I'm assuming you use InnoDB), or whether the query had to read them from disk, incurring I/O operations. A small number of rows read from disk can be orders of magnitude slower than a large number of rows read from RAM. This doesn't necessarily account for your case, but it shows that such a scenario can occur, and "rows read" doesn't tell you whether the query caused I/O. There may even be multiple I/O operations for a single row, because of InnoDB's multi-versioning.
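One rough way to check is to compare the InnoDB buffer pool counters before and after running the query (these counters are global, so other sessions on a busy server will affect them):

    SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
    -- Innodb_buffer_pool_read_requests: logical page reads (served from RAM)
    -- Innodb_buffer_pool_reads: requests that missed the pool and went to disk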
Insight into the difference between logical row requests and physical I/O reads is harder to get. In Percona Server, enhancements to the slow query log include the count of InnoDB I/O operations per query.
Upvotes: 3
Reputation: 21057
Without more data, I can give you a hint: if you use GROUP BY only to deduplicate rows, you are wasting resources; it is better to use SELECT DISTINCT (on an indexed field).
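For example (hypothetical table), both of these return the unique values, but the second states the intent directly:

    SELECT customer_id FROM orders GROUP BY customer_id;  -- works, but wasteful as a dedup idiom
    SELECT DISTINCT customer_id FROM orders;              -- clearer, and fast if customer_id is indexed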
Upvotes: 2
Reputation: 4102
I think your problem depends on how you are limiting the number of rows read. If you read fewer rows by adding more WHERE conditions that MySQL needs to evaluate, then yes, performance will take a hit.
I would look at indexing some of the columns that make your search more complex. Simple data types are faster to look up than complex ones. Check whether you are searching against indexed columns.
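As a sketch (the table and columns are made up): if the slow query filters on two columns, a composite index lets MySQL read only the matching rows instead of scanning the whole table:

    ALTER TABLE orders ADD INDEX idx_status_created (status, created_at);

    EXPLAIN SELECT * FROM orders
    WHERE status = 'shipped' AND created_at >= '2013-01-01';
    -- EXPLAIN should now show the index under "key" and far fewer estimated rows.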
Upvotes: 2