alex stacey
alex stacey

Reputation: 447

Inconsistency between MariaDB and MySQL

I've found a different behaviour in how a query is interpreted between these 2 databases and wondered if anyone could shed any light on what is happening here. The query looks like this:

SELECT t1.id, t2.album_id
FROM t1
    LEFT OUTER JOIN t2
        ON t1.data_id = t2.id
        AND t1.event_type IN (1002, 1001, 1000)
WHERE
    t1.event_type IN (1000, 1001, 1002, 1200, 1201, 1202, 1203)
GROUP BY t1.id
ORDER BY t1.id DESC
LIMIT 0, 20;

The MariaDB result looks like this:

+-----+----------+
| id  | album_id |
+-----+----------+
| 623 |     NULL |
| 622 |     NULL |
| 621 |     NULL |
| 620 |     NULL |
| 619 |     NULL |
| 618 |     NULL |
| 617 |     NULL |
| 616 |     NULL |
| 615 |     NULL |
| 614 |     NULL |
| 613 |     NULL |
| 612 |      194 |
| 611 |     NULL |
| 610 |     NULL |
| 609 |     NULL |
| 608 |      193 |
| 607 |     NULL |
| 606 |     NULL |
| 605 |     NULL |
| 604 |     NULL |
+-----+----------+

And the Oracle MySQL result looks like this:

+-----+----------+
| id  | album_id |
+-----+----------+
| 623 |     NULL |
| 622 |     NULL |
| 621 |     NULL |
| 620 |     NULL |
| 619 |     NULL |
| 618 |     NULL |
| 617 |     NULL |
| 616 |      196 |<-- different
| 615 |     NULL |
| 614 |     NULL |
| 613 |     NULL |
| 612 |      194 |
| 611 |      194 |<-- different
| 610 |     NULL |
| 609 |     NULL |
| 608 |      193 |
| 607 |      193 |<-- different
| 606 |     NULL |
| 605 |     NULL |
| 604 |     NULL |
+-----+----------+

Additionally, when I EXPLAIN the queries, I can see that the two databases are interpreting the query differently. (See the "Extra" column)

MariaDB

+------+-------------+-------+--------+---------------+---------+---------+------------------------+------+-------------+
| id   | select_type | table | type   | possible_keys | key     | key_len | ref                    | rows | Extra       |
+------+-------------+-------+--------+---------------+---------+---------+------------------------+------+-------------+
|    1 | SIMPLE      | t1    | index  | NULL          | PRIMARY | 4       | NULL                   |   20 | Using where |
|    1 | SIMPLE      | t2    | eq_ref | PRIMARY       | PRIMARY | 4       | foo.t1.data_id         |    1 | Using where |
+------+-------------+-------+--------+---------------+---------+---------+------------------------+------+-------------+

Oracle MySQL

+----+-------------+-------+--------+---------------+---------+---------+---------------------------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref                       | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------+------+-------------+
|  1 | SIMPLE      | t1    | index  | NULL          | PRIMARY | 4       | NULL                      |   20 | Using where |
|  1 | SIMPLE      | t2    | eq_ref | PRIMARY       | PRIMARY | 4       | foo.t1.data_id            |    1 |             |
+----+-------------+-------+--------+---------------+---------+---------+---------------------------+------+-------------+

I have found workarounds for this but would really like to know what is going on here. Does anyone have any ideas?

If you want to try it yourself, a dump of the data I used in this example can be found here.

Thanks.

edit: It have been pointed out in the comments that the query is invalid SQL in most databases but that MySQL allows it - but that the database is free to return any aggregated value from the GROUP BY. I'd just like to point out that what appears to be happening here is different, because the values are not ambiguous. There is only a single matching row, but that does not correspond to the value that MariaDB is returning.

SELECT t1.id, t2.album_id
FROM t1
    JOIN t2
        ON t1.data_id = t2.id
WHERE
    t1.id = 616
; 

+-----+----------+
| id  | album_id |
+-----+----------+
| 616 |      196 |
+-----+----------+
1 row in set (0.00 sec) 

Upvotes: 2

Views: 607

Answers (2)

alex stacey
alex stacey

Reputation: 447

It turns out this is actually a bug in MariaDB which can produce the wrong results when using group by and a left join on 2 conditions.

Upvotes: 2

krokodilko
krokodilko

Reputation: 36107

This query is using a so caled MySql extension to GROUP BY
See this link for details: http://dev.mysql.com/doc/refman/5.7/en/group-by-extensions.html

They clearly said that:

MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate.

Taking the above into account, this behaviour is with accordance with specification.

Upvotes: 1

Related Questions