Reputation: 83
I have a table as following (using bigquery):
id | year | month | sales | row_number |
---|---|---|---|---|
111 | 2020 | 11 | 1000 | 1 |
111 | 2020 | 12 | 2000 | 2 |
112 | 2020 | 11 | 3000 | 1 |
113 | 2020 | 11 | 1000 | 1 |
Is there a way in which I can select rows that have row numbers more than one?
For example, my desired output is:
id | year | month | sales | row_number |
---|---|---|---|---|
111 | 2020 | 11 | 1000 | 1 |
111 | 2020 | 12 | 2000 | 2 |
I don't want to just exclusively select rows with row_number = 2 but also row_number = 1 as well.
The original code block I used for the first table result is:
SELECT
id,
year,
month,
SUM(sales) AS sales,
ROW_NUMBER() OVER (PARTITIONY BY id ORDER BY id ASC) AS row_number
FROM
table
GROUP BY
id, year, month
Upvotes: 1
Views: 2351
Reputation: 156
This is what I use, it's similar to @ElapsedSoul answer but from my understanding for static list "IN" is better than using "EXISTS" but I'm not sure if the performance difference, if any, is significant:
Difference between EXISTS and IN in SQL?
WITH T1 AS
(
SELECT
id,
year,
month,
SUM(sales) as sales,
ROW_NUMBER() OVER(PARTITION BY id ORDER BY id ASC) AS ROW_NUM
FROM table
GROUP BY id, year, month
)
SELECT *
FROM T1
WHERE id IN (SELECT id FROM T1 WHERE ROW_NUM > 1);
Upvotes: 0
Reputation: 825
Try this:
with tmp as (SELECT id,
year,
month,
SUM(sales) as sales,
ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
FROM table
GROUP BY id, year, month)
select * from tmp a where exists ( select 1 from tmp b where a.id = b.id and b.row_number =2)
It's a so clearly exists statement
SQL
Upvotes: 1
Reputation: 173028
You can wrap your query as in below example
select * except(flag) from (
select *, countif(row_number > 1) over(partition by id) > 0 flag
from (YOUR_ORIGINAL_QUERY)
)
where flag
so it can look as
select * except(flag) from (
select *, countif(row_number > 1) over(partition by id) > 0 flag
from (
SELECT id,
year,
month,
SUM(sales) as sales,
ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
FROM table
GROUP BY id, year, month
)
)
where flag
so when applied to sample data in your question - it will produce below output
Upvotes: 0
Reputation: 1270011
You can use window functions:
select t.* except (cnt)
from (select t.*,
count(*) over (partition by id) as cnt
from t
) t
where cnt > 1;
As applied to your aggregation query:
SELECT iym.* EXCEPT (cnt)
FROM (SELECT id, year, month,
SUM(sales) as sales,
ROW_NUMBER() OVER (Partition by id ORDER BY id ASC) AS row_number
COUNT(*) OVER(Partition by id ORDER BY id ASC) AS cnt
FROM table
GROUP BY id, year, month
) iym
WHERE cnt > 1;
Upvotes: 2