supernova
supernova

Reputation: 2070

Multiple counts for certain values in same column with group by

I have an event table that run different accounts. We I need to count certain types of events by account.

Input:

| accountId | eventType |
|-----------|-----------|
| 1         | start     |
| 1         | stop      |
| 1         | start     |
| 1         | start     |
| 2         | start     |
| 2         | start     |

Query:

select accountId, count(eventType='start') as starts, count(eventType='stop') as stops FROM eventTable GROUP BY accountId

Expected output:

| accountId | starts    | stops     |
|-----------|-----------|-----------|
| 1         | 3         | 1         |
| 2         | 2         | 0         |

It seems it just tries to assign a value and counts the same for all values - which is not intended. There would be a work around using

sum(case when eventType='start' then 1 alse 0) as starts

but is there a better (cleaner) way using the counts direcly?

Edit: I'm using Athena, so HIVE/Presto Syntax and behaviour is preferred.

Upvotes: 0

Views: 1542

Answers (1)

GMB
GMB

Reputation: 222512

Your query is missing a from clause but I assume this is a typo.

Consider:

select 
    accountId, 
    sum(eventType = 'start') as starts, 
    sum(eventType = 'stop') as stops 
from ???
GROUP BY accountId

Rationale: count() takes in account all values that are not null. On the other hand, as long as |eventType is not null, the condition inside the count() returns a boolean or a 0/1 valu - depending on your database. What you want is to sum() these 0/1 values.

Note that the above syntax is supported in MySQL only. If you are running Postgres (which is another database in which your original code would run), no need for this, you can use a filer clause instead:

select 
    accountId, 
    count(*) filter(where eventType = 'start') as starts, 
    count(*) filter(where eventType = 'stop') as stops 
from ???
GROUP BY accountId

Upvotes: 1

Related Questions