hustlecorean
hustlecorean

Reputation: 83

Selecting rows that have row_number more than 1

I have a table as following (using bigquery):

id year month sales row_number
111 2020 11 1000 1
111 2020 12 2000 2
112 2020 11 3000 1
113 2020 11 1000 1

Is there a way in which I can select rows that have row numbers more than one?

For example, my desired output is:

id year month sales row_number
111 2020 11 1000 1
111 2020 12 2000 2

I don't want to just exclusively select rows with row_number = 2 but also row_number = 1 as well.

The original code block I used for the first table result is:

SELECT 
    id, 
    year, 
    month, 
    SUM(sales) AS sales, 
    ROW_NUMBER() OVER (PARTITIONY BY id ORDER BY id ASC) AS row_number
FROM 
    table
GROUP BY 
    id, year, month

Upvotes: 1

Views: 2351

Answers (4)

JTD2021
JTD2021

Reputation: 156

This is what I use, it's similar to @ElapsedSoul answer but from my understanding for static list "IN" is better than using "EXISTS" but I'm not sure if the performance difference, if any, is significant:

Difference between EXISTS and IN in SQL?

WITH T1 AS 
(
SELECT 
id, 
year, 
month, 
SUM(sales) as sales, 
ROW_NUMBER() OVER(PARTITION BY id ORDER BY id ASC) AS ROW_NUM
FROM table
GROUP BY id, year, month
)
SELECT * 
FROM T1
WHERE id IN (SELECT id FROM T1 WHERE ROW_NUM > 1);

Upvotes: 0

ElapsedSoul
ElapsedSoul

Reputation: 825

Try this:

with tmp as (SELECT id, 
year, 
month, 
SUM(sales) as sales, 
ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
FROM table
GROUP BY id, year, month)
select * from tmp a where exists ( select 1 from tmp b where a.id = b.id and b.row_number =2)

It's a so clearly exists statement SQL

Upvotes: 1

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173028

You can wrap your query as in below example

select * except(flag) from (
  select *, countif(row_number > 1) over(partition by id) > 0 flag 
  from (YOUR_ORIGINAL_QUERY)
)
where flag   

so it can look as

select * except(flag) from (
  select *, countif(row_number > 1) over(partition by id) > 0 flag 
  from (
    SELECT id, 
    year, 
    month, 
    SUM(sales) as sales, 
    ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
    FROM table
    GROUP BY id, year, month
  )
)
where flag   

so when applied to sample data in your question - it will produce below output

enter image description here

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1270011

You can use window functions:

select t.* except (cnt)
from (select t.*,
             count(*) over (partition by id) as cnt
      from t
     ) t
where cnt > 1;

As applied to your aggregation query:

SELECT iym.* EXCEPT (cnt)
FROM (SELECT id, year, month, 
             SUM(sales) as sales, 
             ROW_NUMBER() OVER (Partition by id ORDER BY id ASC) AS row_number
             COUNT(*) OVER(Partition by id ORDER BY id ASC) AS cnt
      FROM table
      GROUP BY id, year, month
     ) iym
WHERE cnt > 1;

Upvotes: 2

Related Questions