David Masip
David Masip

Reputation: 2511

Case when with aggregation in BigQuery

I have data of how much users spend in several games in BigQuery:

CREATE TABLE if not EXISTS user_values (
  user_id int,
  value float,
  game char
);

INSERT INTO user_values VALUES
(1, 10, 'A'),
(1, 10, 'A'),
(1, 2, 'A'),
(1, 4, 'B'),
(1, 5, 'B'),
(2, 0, 'A'),
(2, 10, 'B'),
(2, 6, 'B');

I want to check, for every user, if they've spent more than 20 in game A and more than 15 in game B. In this case, the output table should be:

user_id,game,spent_more_than_cutoff
1,A,TRUE
1,B,FALSE
2,A,FALSE
2,B,TRUE

I want to do this for an arbitrary number of users and 5-10 games. I've tried this:

    select
        game,
        user_id,
        case
            when sum(value) > 20 and game = 'A' then TRUE
            when sum(value) > 15 and game = 'B' then TRUE
            else FALSE
        end as spent_more_than_cutoff,
    from user_values
    group by 1, 2

but I get thrown the following error:

Column 3 contains an aggregation function, which is not allowed in GROUP BY at [19:20]

What's the simplest way of doing this in BigQuery without needing to do different queries for different games?

Is there an all function that can help to do something like this?

    select
        game,
        user_id,
        case
            when sum(value) > 20 and all(game) = 'A' then TRUE
            when sum(value) > 15 and all(game) = 'B' then TRUE
            else FALSE
        end as spent_more_than_cutoff,
    from user_values
    group by 1, 2

Upvotes: 1

Views: 2080

Answers (3)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173046

I want to do this for an arbitrary number of users and 5-10 games

Consider below approach

with cutoffs as (
  select 'A' game, 20 cutoff union all 
  select 'B', 15
)
select user_id, game, 
  sum(value) > any_value(cutoff) spent_more_than_cutoff
from user_values
left join cutoffs using(game)
group by user_id, game    

If applied to sample data for user_values in your question - output is

enter image description here

Upvotes: 1

Sergey Geron
Sergey Geron

Reputation: 10212

Try this one:

select game,
       user_id,
       sum(if(game = 'A', value, 0)) > 20 or sum(if(game = 'B', value, 0)) > 15 as spent_more_than_cutoff
from user_values
group by 1, 2;

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270401

The expression for filtering on the game needs to be the argument to the sum():

select game, user_id,
       (sum(case when game = 'A' then value end) > 20 and
        sum(case when game = 'B' then value end) > 15 
       ) as spent_more_than_cutoff
from user_values
group by 1, 2;

Note that you are returning a boolean so no case is needed.

Upvotes: 1

Related Questions