Reputation: 138
This is more of an academic question, because in my particular case I can create an easy workaround, but I would like to understand the reason behind this anyway.
Using an InnoDB table (MariaDB 10.0.31) with (among others) columns customer and uri, I wanted to select the distinct uris for a specific customer. Now, the table is quite large (around 50M entries), so there is a composite index on customer and uri.
Basically what I don't understand is why the order of the columns in the group by clause matters.
explain select customer, uri from `tableName` group by customer,uri;
tells me it will use the existing index for group by, but
explain select customer, uri from `tableName` group by uri,customer;
won't do so.
Could someone explain why this is the case? I always thought of the group by clause as declarative.
Maybe it's because it's Friday, but I can't think of a case, where the order of the group by columns would affect the result.
Upvotes: 4
Views: 2945
Reputation: 142433
Write a feature request at bugs.mysql.com .
On the one hand, GROUP BY
is (or was) defined to imply ORDER BY
with the same columns in the same order.
On the other hand, if you ignore that non-standard feature, even by saying ORDER BY NULL
, MySQL fails to shuffle the columns in order to use the index.
5.7 (and before) says
GROUP BY implicitly sorts by default (that is, in the absence of ASC or DESC designators), but relying on implicit GROUP BY sorting is deprecated. To produce a given sort order, use explicit ASC or DESC designators for GROUP BY columns or provide an ORDER BY clause. GROUP BY sorting is a MySQL extension that may change in a future release; for example, to make it possible for the optimizer to order groupings in whatever manner it deems most efficient and to avoid the sorting overhead.
and
If a query includes GROUP BY but you want to avoid the overhead of sorting the result, you can suppress sorting by specifying ORDER BY NULL.
But, watch out; 8.0 says
Previously, relying on implicit GROUP BY sorting was deprecated but GROUP BY did sort by default (that is, in the absence of ASC or DESC designators). In MySQL 8.0, GROUP BY no longer sorts by default, so query results may differ from previous MySQL versions. To produce a given sort order, use explicit ASC or DESC designators for GROUP BY columns or provide an ORDER BY clause.
Upvotes: 0
Reputation: 2513
Your observation is correct. Results would be different as the "prefix" order of columns mentioned in the composite index declaration is used for decision making by the Cost based optimizer. This behavior is due to the usage of B-TREE index
GROUP BY clause is used for ordering the result and hence if
More on this and topic of Loose/Tight Index Scan can be found here https://dev.mysql.com/doc/refman/5.7/en/group-by-optimization.html
Upvotes: 3
Reputation: 8746
In index is basically an ordered table. In you case it is ordered according like ORDER BY customer, uri
(because this is how your index is defined).
MySQL executes group by
by first ordering the result according to the group by
clause and then collapsing the rows with the same values (that happen to follow each other after sorting).
Apparently, MySQL is not smart enough to recognize that the different group by
clause could also be executed when the result is ordered the other way.
More about this:
Upvotes: 0