Reputation: 1463
I am trying to do a pivoting on column Join_mon
and get aggregate count for each ID
, as shown in following query;
select *
from CTE3
pivot(COUNT(DISTINCT platform_payer_name) for Join_Mon in (
'2021-03-01',
'2021-02-01',
'2021-01-01',
'2020-12-01'
))
as p
order by ID
)
As you can see I am trying to get distinct count her for the column platform_payer_name
. But it is giving the following error;
SQL compilation error: syntax error line 48 at position 16 unexpected 'DISTINCT'
I am quite positive DISTINCT
works with COUNT
in snowflake. Can I get some help why it is failing here. Help is appreciated.
Upvotes: 0
Views: 1763
Reputation: 135
Sorry for the lazy answer:
I recently had the exact same issue where I needed a count( distinct value ) as part of the pivot command. Snowflake specifically addresses this in their documentation on the section on distinct counts. You have to aggregate the values as an array as detailed in this section. Then you can use array_size
in your select to get what you want.
The relevant part of my query looked like this:
from events pivot (array_unique_agg(version) for node_id in (any))
Upvotes: 0
Reputation: 175586
Snowflake supports COUNT_IF:
SELECT id,
COUNT_IF(join_mon='2021-03-01') AS "2021-03-01",
COUNT_IF(join_mon='2021-02-01') AS "2021-02-01",
COUNT_IF(join_mon='2021-01-01') AS "2021-01-01"
FROM (SELECT DISTINCT id, platform_name, join_mon FROM cte) s
GROUP BY id
ORDER BY id;
Upvotes: 0
Reputation: 25903
so making some fake data that maps to your pivot, albeit I dropped the excessive paren
with cte3(id, platform_payer_name, Join_Mon) as (
select * from values
(1,'aa', '2021-03-01'),
(1,'aa', '2021-03-01'),
(1,'aa', '2021-03-01'),
(1,'aa', '2021-02-01'),
(2,'bb', '2012-03-01'),
(2,'cc', '2020-12-01')
)
select *
from CTE3 AS c
pivot(COUNT(c.platform_payer_name) for c.Join_Mon in (
'2021-03-01',
'2021-02-01',
'2021-01-01',
'2020-12-01' )
) as p
order by id;
gives:
ID '2021-03-01' '2021-02-01' '2021-01-01' '2020-12-01'
1 3 1 0 0
2 0 0 0 1
so makes sense you want distinct
in there
but it seems it's not supported..
so while it's somewhat cut'n'paste error prone, it does "work":
with cte3(id, platform_payer_name, Join_Mon) as (
select * from values
(1,'aa', '2021-03-01'),
(1,'aa', '2021-03-01'),
(1,'aa', '2021-03-01'),
(1,'aa', '2021-02-01'),
(2,'bb', '2012-03-01'),
(2,'cc', '2020-12-01')
)
select id
,count(distinct(iff(Join_Mon='2021-03-01',platform_payer_name,null))) as "2021-03-01"
,count(distinct(iff(Join_Mon='2021-02-01',platform_payer_name,null))) as "2021-02-01"
,count(distinct(iff(Join_Mon='2021-01-01',platform_payer_name,null))) as "2021-01-01"
,count(distinct(iff(Join_Mon='2020-12-01',platform_payer_name,null))) as "2020-12-01"
from CTE3 AS c
group by 1 order by 1;
gives:
ID 2021-03-01 2021-02-01 2021-01-01 2020-12-01
1 1 1 0 0
2 0 0 0 1
which works because pivot is doing two task, the first is moving values into columns if matching the input, thus that is the same as:
with cte3(id, platform_payer_name, Join_Mon) as (
select * from values
(1,'aa', '2021-03-01'),
(1,'aa', '2021-03-01'),
(1,'aa', '2021-03-01'),
(1,'aa', '2021-02-01'),
(2,'bb', '2012-03-01'),
(2,'cc', '2020-12-01')
)
select id
,iff(Join_Mon='2021-03-01',platform_payer_name,null) as "2021-03-01"
,iff(Join_Mon='2021-02-01',platform_payer_name,null) as "2021-02-01"
,iff(Join_Mon='2021-01-01',platform_payer_name,null) as "2021-01-01"
,iff(Join_Mon='2020-12-01',platform_payer_name,null) as "2020-12-01"
from CTE3 AS c
order by 1;
which gives:
ID, 2021-03-01, 2021-02-01, 2021-01-01, 2020-12-01
1, aa, NULL, NULL, NULL
1, aa, NULL, NULL, NULL
1, aa, NULL, NULL, NULL
1, NULL, aa, NULL, NULL
2, NULL, NULL, NULL, NULL
2, NULL, NULL, NULL, cc
which can then have a count(distinct x)
ran over each column.
select id
,count(distinct("2021-03-01")) as "2021-03-01"
,count(distinct("2021-02-01")) as "2021-02-01"
,count(distinct("2021-01-01")) as "2021-01-01"
,count(distinct("2020-12-01")) as "2020-12-01"
from (
select id
,iff(Join_Mon='2021-03-01',platform_payer_name,null) as "2021-03-01"
,iff(Join_Mon='2021-02-01',platform_payer_name,null) as "2021-02-01"
,iff(Join_Mon='2021-01-01',platform_payer_name,null) as "2021-01-01"
,iff(Join_Mon='2020-12-01',platform_payer_name,null) as "2020-12-01"
from CTE3 AS c
)
group by id
order by id;
or can be done inline as I showed in the first answer.
Upvotes: 1