Reputation: 21116
I have the following query:
SELECT
`tests`.`id`,
`tests`.`created_at`,
`tests`.`updated_at`,
`tests`.`created_by`,
`tests`.`date_of_test`,
`tests`.`location`,
`tests`.`information`,
`tests`.`title`,
`tests`.`goals`,
`tests`.`deleted_at`,
`tests`.`status`,
`tests`.`tester`,
`tests`.`test_approach`
FROM `tests`
WHERE
`tests`.`id` IN (
SELECT `test_wobble`.`test_id`
FROM `project_wobble`
INNER JOIN `wobbles` ON `project_wobble`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profiles` ON `wobble_profiles`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profile_user` ON `wobble_profile_user`.`wobble_profile_id` = `wobble_profiles`.`id`
INNER JOIN `test_wobble` ON `test_wobble`.`wobble_id` = `wobbles`.`id`
WHERE `project_wobble`.`project_id` = '2' AND `wobble_profile_user`.`user_id` = '3'
GROUP BY `wobbles`.`id`
)
GROUP BY `tests`.`id`
ORDER BY tests.date_of_test DESC
If I run the IN
query on its own, it returns
"test_id"
When i run the whole above query, I get
If I replace the IN
query with the number 13... The SQL returns 1 result (The correct one).
What am i doing wrong here?
Upvotes: 2
Views: 677
Reputation: 72177
The inner query exposes undefined behaviour. It is explained in the documentation on the page MySQL Handling of GROUP BY.
According to the SQL standard, the inner query is invalid. To be valid, all the columns that appear in the SELECT
, HAVING
and ORDER BY
clauses must satisfy one of the following:
GROUP BY
clause;SELECT
, HAVING
or ORDER BY
) only as parameters of aggregate functions;GROUP BY
columns.For example, using your tables, you can put in the SELECT
clause:
wobbles.id
- because it appears in the GROUP BY
clause;COUNT(DISTINCT project_wobble.project_id)
- even if project_wobble.project_id
does not appear in GROUP BY
, it can be used as a parameter of the aggregate function COUNT(DISTINCT)
;wobbles
, given that the column id
is its PK
- all the columns of table wobbles
are functionally dependent on wobbles.id
(their values are uniquely determined by the value of wobbles.id
).Before version 5.7.5, MySQL accepts queries that do not follow the above requirements but, as the documentation states:
In this case, the server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate, which is probably not what you want.
Starting with version 5.7.5, MySQL implements detection of functional dependence as an configurable feature (which is turned ON by default).
On 5.7.5 your inner query will trigger an error and that's all; your query is invalid, so it doesn't run at all.
On previous versions (also on 5.7.5 if the ONLY_FULL_GROUP_BY
SQL mode is disabled), the query runs but its results are unpredictable. They can change from one execution to the next if, for example, a row is deleted then re-inserted.
Because the MySQL query optimizer re-organizes your whole query for better execution plan, when it is embedded in the larger query its execution is not the same as when it is ran standalone. This is another way you can observe its undefined behaviour.
Extract the inner query, remove the GROUP BY
clause, add more columns to the SELECT
clause and look at what it produces:
SELECT DISTINCT `test_wobble`.`wobble_id`, `test_wobble`.`test_id`
FROM `project_wobble`
INNER JOIN `wobbles` ON `project_wobble`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profiles` ON `wobble_profiles`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profile_user` ON `wobble_profile_user`.`wobble_profile_id` = `wobble_profiles`.`id`
INNER JOIN `test_wobble` ON `test_wobble`.`wobble_id` = `wobbles`.`id`
WHERE `project_wobble`.`project_id` = '2' AND `wobble_profile_user`.`user_id` = '3'
If I'm not wrong, it will produce two rows having the same wobble_id
and values 13
and 14
for column test_id
.
If this result set is correct then you can remove test_wobble.wobble_id
from SELECT
, keep DISTINCT
and put the query into the larger one.
There is no need for GROUP BY
(because of the DISTINCT
) and it should work faster without it.
Upvotes: 0
Reputation: 425251
This query:
SELECT `test_wobble`.`test_id`
FROM `project_wobble`
INNER JOIN `wobbles` ON `project_wobble`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profiles` ON `wobble_profiles`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profile_user` ON `wobble_profile_user`.`wobble_profile_id` = `wobble_profiles`.`id`
INNER JOIN `test_wobble` ON `test_wobble`.`wobble_id` = `wobbles`.`id`
WHERE `project_wobble`.`project_id` = '2' AND `wobble_profile_user`.`user_id` = '3'
GROUP BY `wobbles`.`id`
groups by wobbles.id
but returns test_wobble.test_id
which is not a part of GROUP BY
.
On each iteration, MySQL
pushes the IN field into this query:
SELECT `test_wobble`.`test_id`
FROM `project_wobble`
INNER JOIN `wobbles` ON `project_wobble`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profiles` ON `wobble_profiles`.`wobble_id` = `wobbles`.`id`
INNER JOIN `wobble_profile_user` ON `wobble_profile_user`.`wobble_profile_id` = `wobble_profiles`.`id`
INNER JOIN `test_wobble` ON `test_wobble`.`wobble_id` = `wobbles`.`id`
WHERE `project_wobble`.`project_id` = '2' AND `wobble_profile_user`.`user_id` = '3'
-- This is implicitly added by MySQL when optimizing
AND `test_wobble`.`test_id` = `tests`.`id`
GROUP BY `wobbles`.`id`
and then just checks if some value exists.
If you remove the GROUP BY
from your IN
query, you'll see that it contains both 13
and 14
, but only one of those is returned when you run the query with GROUP BY
.
You can also try running the second query, substituting 13 and 14 instead of tests.id
and make sure the query returns something in both cases.
This might actually be considered a bug in MySQL
. However, since the documentation does not specify which ungrouped and unaggregated expression will be returned from a grouped query, it's better to specify it explicitly, of side effects from the optimizer will kick in like the do in this case.
Could you please provide some sample of your data and outline what are you going to achieve with the query?
Upvotes: 1
Reputation: 1269445
It is a little bit hard to tell without knowing what the data is. But, you do have an issue in the subquery. This is your subquery:
SELECT `test_wobble`.`test_id`
FROM `project_wobble` INNER JOIN
`wobbles`
ON `project_wobble`.`wobble_id` = `wobbles`.`id` INNER JOIN
`wobble_profiles`
ON `wobble_profiles`.`wobble_id` = `wobbles`.`id` INNER JOIN
`wobble_profile_user`
ON `wobble_profile_user`.`wobble_profile_id` = `wobble_profiles`.`id` INNER JOIN
`test_wobble`
ON `test_wobble`.`wobble_id` = `wobbles`.`id`
WHERE `project_wobble`.`project_id` = '2' AND `wobble_profile_user`.`user_id` = '3'
GROUP BY `wobbles`.`id`
Note the select
and group by
. These have different variables:
`test_wobble`.`test_id`
`wobbles`.`id`
I'm not sure which one you really want. But MySQL returns an indeterminate value when you run the query -- and a value that can change from one run to the next. You should fix the select
and group by
so they match.
Upvotes: 0