Reputation: 191
I have a table that contains a user's name, market, and purchase_id. I'm trying to use a window function in SnowSql without a subquery to count the distinct number of purchases a user bought as well as the total number of unique purchases for the market.
Initial Table
User | Market | Purchase_ID |
---|---|---|
John Smith | NYC | 1 |
John Smith | NYC | 2 |
Bob Miller | NYC | 2 |
Bob Miller | NYC | 4 |
Tim Wilson | NYC | 3 |
The desired result would look like this:
User | Purchases | Unique Market Purchases |
---|---|---|
John Smith | 2 | 4 |
Bob Miller | 2 | 4 |
Tim Wilson | 1 | 4 |
The query I've been attempting without a subquery looks like the below but receives an error with the groupby.
SELECT
user,
COUNT(DISTINCT purchase_id),
COUNT(DISTINCT purchase_id) OVER (partition by market)
FROM table
GROUP BY 1
Appreciate any assistance with this. Thanks!
Upvotes: 2
Views: 1246
Reputation: 1804
This might work , you could wrangle to get into the format you're after but it produces the answer without subquery.
Uses the awesome GROUPING SETS which allows multiple group-by clauses in a single statement - the exact error you were hitting :-).
Awesome question!
SELECT
COUNT(DISTINCT PURCHASE_ID)
, USER_NAME
, MARKET
FROM
CTE
GROUP BY
GROUPING SETS (USER_NAME, MARKET);
Copy|Paste|Run
WITH CTE AS (SELECT 'JOHN SMITH' USER_NAME, 'NYC' MARKET, 1
PURCHASE_ID
UNION SELECT 'JOHN SMITH' USER_NAME, 'NYC' MARKET, 2 PURCHASE_ID
UNION SELECT 'BOB MILLER' USER_NAME, 'NYC' MARKET, 2 PURCHASE_ID
UNION SELECT 'BOB MILLER' USER_NAME, 'NYC' MARKET, 4 PURCHASE_ID
UNION SELECT 'TIM WILSON' USER_NAME, 'NYC' MARKET, 3 PURCHASE_ID)
SELECT
COUNT(DISTINCT PURCHASE_ID)
, USER_NAME
, MARKET
FROM
CTE
GROUP BY
GROUPING SETS (USER_NAME, MARKET);
Upvotes: 1
Reputation: 49375
DISTTNCT
in a window function is not allowed, so you need to use the subquery
CREATE TABLE table1 (`User` varchar(10), `Market` varchar(3), `Purchase_ID` int) ; INSERT INTO table1 (`User`, `Market`, `Purchase_ID`) VALUES ('John Smith', 'NYC', 1), ('John Smith', 'NYC', 2), ('Bob Miller', 'NYC', 2), ('Bob Miller', 'NYC', 4), ('Tim Wilson', 'NYC', 3) ;
SELECT user, COUNT(DISTINCT purchase_id) ,MAX((SELECT COUNT(DISTINCT purchase_id) FROM table1 WHERE `Market` = t1.`Market` )) bymarkte FROM table1 t1 GROUP BY 1
user | COUNT(DISTINCT purchase_id) | bymarkte :--------- | --------------------------: | -------: Bob Miller | 2 | 4 John Smith | 2 | 4 Tim Wilson | 1 | 4
SELECT user, COUNT(DISTINCT purchase_id) ,MAX(countr) bymarkte FROM table1 t1 INNER JOIN (SELECT `Market`,COUNT(DISTINCT purchase_id) countr FROM table1 GROUP BY `Market` ) ta ON t1.`Market` = ta.`Market` GROUP BY 1
user | COUNT(DISTINCT purchase_id) | bymarkte :--------- | --------------------------: | -------: Bob Miller | 2 | 4 John Smith | 2 | 4 Tim Wilson | 1 | 4
db<>fiddle here
Upvotes: 1
Reputation: 1269873
I don't think you can do this simply as an aggregation. But you can get the answer like this:
SELECT user,
SUM( (seqnum = 1)::INT ) as purchases,
SUM(SUM( (seqnum = 1)::INT )) OVER (PARTITION BY market) as market_purchases
FROM (SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY purchase_id ORDER BY purchase_id) as seqnum
FROM table t
) t
GROUP BY 1
Upvotes: 1