I have a table as following (using bigquery): id year month sales row_number 111 2020 11 1000 1 111 2020 12 2000 2 112 2020 11 3000 1 113 2020 11 1000 1 Is there a way in which I can select rows that have row numbers more than one? For example, my desired output is: id year month sales row_number 111 2020 11 1000 1 111 2020 12 2000 2 I don't want to just exclusively select rows with row_number = 2 but also row_number = 1 as well. The original code block I used for the first table result is: SELECT id, year, month, SUM(sales) AS sales, ROW_NUMBER() OVER (PARTITIONY BY id ORDER BY id ASC) AS row_number FROM table GROUP BY id, year, month

Reputation: 83

Selecting rows that have row_number more than 1

I have a table as following (using bigquery):

id	year	month	sales	row_number
111	2020	11	1000	1
111	2020	12	2000	2
112	2020	11	3000	1
113	2020	11	1000	1

Is there a way in which I can select rows that have row numbers more than one?

For example, my desired output is:

id	year	month	sales	row_number
111	2020	11	1000	1
111	2020	12	2000	2

I don't want to just exclusively select rows with row_number = 2 but also row_number = 1 as well.

The original code block I used for the first table result is:

SELECT 
    id, 
    year, 
    month, 
    SUM(sales) AS sales, 
    ROW_NUMBER() OVER (PARTITIONY BY id ORDER BY id ASC) AS row_number
FROM 
    table
GROUP BY 
    id, year, month

Upvotes: 1

Answers (4)

JTD2021

Reputation: 156

This is what I use, it's similar to @ElapsedSoul answer but from my understanding for static list "IN" is better than using "EXISTS" but I'm not sure if the performance difference, if any, is significant:

Difference between EXISTS and IN in SQL?

WITH T1 AS 
(
SELECT 
id, 
year, 
month, 
SUM(sales) as sales, 
ROW_NUMBER() OVER(PARTITION BY id ORDER BY id ASC) AS ROW_NUM
FROM table
GROUP BY id, year, month
)
SELECT * 
FROM T1
WHERE id IN (SELECT id FROM T1 WHERE ROW_NUM > 1);

Upvotes: 0

ElapsedSoul

Reputation: 825

Try this:

with tmp as (SELECT id, 
year, 
month, 
SUM(sales) as sales, 
ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
FROM table
GROUP BY id, year, month)
select * from tmp a where exists ( select 1 from tmp b where a.id = b.id and b.row_number =2)

It's a so clearly exists statement SQL

Upvotes: 1

Mikhail Berlyant

Reputation: 173028

You can wrap your query as in below example

select * except(flag) from (
  select *, countif(row_number > 1) over(partition by id) > 0 flag 
  from (YOUR_ORIGINAL_QUERY)
)
where flag

so it can look as

select * except(flag) from (
  select *, countif(row_number > 1) over(partition by id) > 0 flag 
  from (
    SELECT id, 
    year, 
    month, 
    SUM(sales) as sales, 
    ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
    FROM table
    GROUP BY id, year, month
  )
)
where flag

so when applied to sample data in your question - it will produce below output

Upvotes: 0

Gordon Linoff

Reputation: 1270011

You can use window functions:

select t.* except (cnt)
from (select t.*,
             count(*) over (partition by id) as cnt
      from t
     ) t
where cnt > 1;

As applied to your aggregation query:

SELECT iym.* EXCEPT (cnt)
FROM (SELECT id, year, month, 
             SUM(sales) as sales, 
             ROW_NUMBER() OVER (Partition by id ORDER BY id ASC) AS row_number
             COUNT(*) OVER(Partition by id ORDER BY id ASC) AS cnt
      FROM table
      GROUP BY id, year, month
     ) iym
WHERE cnt > 1;

Upvotes: 2

Selecting rows that have row_number more than 1

Answers (4)

Related Questions