Filtering out rows that have an identical column

Question

I have a complex query which returns a table with a column that has some duplicate values, and a second column with integers.

Here is a simple representation of what the data looks like:

col1   col2
===========
A      null
A      1
A      1
A      2
A      3

B      2
B      3

C      4

D      null

My requirement is to select all rows whose col2 value is null or the minimum when grouped by col1.

Thus, my expected result is:

col1   col2
===========
A      null
A      1
A      1

B      2

C      4

D      null

In other words, for each distinct value in col1, find the corresponding minimum in col2, and filter out any rows which are NOT NULL and greater than that value.

I tried doing grouping but that of course combines the rows. I feel like there's a somewhat straightforward solution here that I'm missing.

Gordon Linoff · Accepted Answer

You can use or and a correlated subquery:

select t.*
from t
where t.col2 is null or
      t.col2 = (select min(t2.col2) from t t2 where t2.col1 = t.col1);

Repeating a complex query is cumbersome. You can also do this with window functions:

select col1, col2
from (select t.*, min(col2) over (partition by col1) as min_col2
      from t
     ) t
where col2 is null or col2 = min_col2;

Filtering out rows that have an identical column

Answers (2)

Related Questions