Reputation: 169
A SQL question: I have a table game
with columns user_id
(unique per user), game_id
(unique per game), game_start_timestamp_utc
(the UTC timestamp for when the game starts), and game_status
, which can either be ‘pass’, ‘in progress’ or ‘fail’.
The question is to write a query to return the game that has the lowest pass rate (pass users/enrolled users).
The table should be like this
user_id game_id game_start_timestamp_utc game_status
-----------------------------------------------------
1 111 10/22/2019 pass
2 111 10/21/2018 fail
...
I know how to do it in Python pandas, just need group by game_id
to calculate pass rate, but have not much idea to do it in SQL. Thanks in advance.
Upvotes: 2
Views: 42
Reputation: 222432
Use conditional aggregation. avg()
comes handy for this:
select game_id,
avg(case when game_status = 'pass' then 1.0 else 0 end) as pass_rate
from game
group by game_id
order by pass_rate
This gives you the pass rate of each game, as a value between 0 and 1, ordered by increasing rate - so the first row is the result you want.
You can keep that one row only with a row-limiting clause. The syntax varies across databases: limit 1
, top (1)
, fetch first row
, ...
Upvotes: 2