Reputation: 2511
I have data of how much users spend in several games in BigQuery:
CREATE TABLE if not EXISTS user_values (
user_id int,
value float,
game char
);
INSERT INTO user_values VALUES
(1, 10, 'A'),
(1, 10, 'A'),
(1, 2, 'A'),
(1, 4, 'B'),
(1, 5, 'B'),
(2, 0, 'A'),
(2, 10, 'B'),
(2, 6, 'B');
I want to check, for every user, if they've spent more than 20 in game A and more than 15 in game B. In this case, the output table should be:
user_id,game,spent_more_than_cutoff
1,A,TRUE
1,B,FALSE
2,A,FALSE
2,B,TRUE
I want to do this for an arbitrary number of users and 5-10 games. I've tried this:
select
game,
user_id,
case
when sum(value) > 20 and game = 'A' then TRUE
when sum(value) > 15 and game = 'B' then TRUE
else FALSE
end as spent_more_than_cutoff,
from user_values
group by 1, 2
but I get thrown the following error:
Column 3 contains an aggregation function, which is not allowed in GROUP BY at [19:20]
What's the simplest way of doing this in BigQuery without needing to do different queries for different games?
Is there an all
function that can help to do something like this?
select
game,
user_id,
case
when sum(value) > 20 and all(game) = 'A' then TRUE
when sum(value) > 15 and all(game) = 'B' then TRUE
else FALSE
end as spent_more_than_cutoff,
from user_values
group by 1, 2
Upvotes: 1
Views: 2080
Reputation: 173046
I want to do this for an arbitrary number of users and 5-10 games
Consider below approach
with cutoffs as (
select 'A' game, 20 cutoff union all
select 'B', 15
)
select user_id, game,
sum(value) > any_value(cutoff) spent_more_than_cutoff
from user_values
left join cutoffs using(game)
group by user_id, game
If applied to sample data for user_values
in your question - output is
Upvotes: 1
Reputation: 10212
Try this one:
select game,
user_id,
sum(if(game = 'A', value, 0)) > 20 or sum(if(game = 'B', value, 0)) > 15 as spent_more_than_cutoff
from user_values
group by 1, 2;
Upvotes: 1
Reputation: 1270401
The expression for filtering on the game
needs to be the argument to the sum()
:
select game, user_id,
(sum(case when game = 'A' then value end) > 20 and
sum(case when game = 'B' then value end) > 15
) as spent_more_than_cutoff
from user_values
group by 1, 2;
Note that you are returning a boolean so no case
is needed.
Upvotes: 1