Reputation: 62394
I'm kind of losing my mind over this one - maybe I've worked with this too much to see the solution? I have an app that handles registrations for players for each season of a given game. My objective is to build a graph that shows how many new players there are each season for each program (a program is effectively the age group).
This query is what I'm currently using to perform the task. The idea is to pull data for the last 5 seasons:
select
`seasons`.*,
(select COUNT(DISTINCT players.id) from `players`
where exists (
select * from `groups`
inner join `player_season` on `groups`.`id` = `player_season`.`group_id`
where `players`.`id` = `player_season`.`player_id` and
`program_id` = 2 and
`season_id` = seasons.id and
`paid` is not null and
`player_season`.`inactive` is null
) and not exists (
select * from `seasons`
inner join `player_season` on `seasons`.`id` = `player_season`.`season_id`
where `players`.`id` = `player_season`.`player_id` and
`seasons`.`id` < seasons.id
)
) as new_players
from `seasons` order by `seasons`.`id` desc limit 5
You can see that query in action on this db fiddle: https://www.db-fiddle.com/f/k13fnyhecyeGoTKUJvtum9/0. It pulls a player count of 2 for seasonId 2.
When I put this into production it's showing way too high numbers, seems to be obviously pulling in players it shouldn't be. So I removed the dynamic seasons and hard coded the seasonId as 2 and I get only 1 player instead of 2 as demonstrated in this db fiddle: https://www.db-fiddle.com/f/7k998nxexxY5K85a5ppXJN/0. For historical context, here's the single-season query:
select
*
from `players`
where exists (
select *
from `groups`
inner join `player_season` on `groups`.`id` = `player_season`.`group_id`
where `players`.`id` = `player_season`.`player_id` and
`program_id` = 2 and
`season_id` = 2 and
`paid` is not null and
`player_season`.`inactive` is null
) and not exists (
select *
from `seasons`
inner join `player_season` on `seasons`.`id` = `player_season`.`season_id`
where `players`.`id` = `player_season`.`player_id` and
`seasons`.`id` < 2
)
I've been through this query line by line and don't see any differences that could cause this discrepancy. The example data in both db-fiddles is the exact same. Why does this query produce different results when used as a subquery?
Upvotes: 2
Views: 71
Reputation: 11242
The query is using "seasons" table twice, in the main select and the 2nd condition of the subquery.
Use an alias to differentiate the 2 table references:
select
`s1`.*,
(select COUNT(DISTINCT players.id) from `players`
where exists (
select * from `groups`
inner join `player_season` on `groups`.`id` = `player_season`.`group_id`
where `players`.`id` = `player_season`.`player_id` and
`program_id` = 2 and
`season_id` = s1.id and
`paid` is not null and
`player_season`.`inactive` is null
) and not exists (
select * from `seasons` AS s2
inner join `player_season` on `s2`.`id` = `player_season`.`season_id`
where `players`.`id` = `player_season`.`player_id` and
`s2`.`id` < s1.id
)
) as new_players
from `seasons` s1 order by `s1`.`id` desc limit 5
Upvotes: 1