pufferfish
pufferfish

Reputation: 291

SQL remove duplicate ordering

Is there a way to do the following projection in SQL? Remove rows where the second column does not change based on the ordering?

FROM          TO
1   504     1   504
2   508     2   508
3   504     3   504
4   504     7   508
5   504     8   504
6   504     9   508
7   508     10  504
8   504     15  508
9   508     16  504
10  504         
11  504         
12  504         
13  504         
14  504         
15  508         
16  504         
17  504         
18  504         
19  504

Upvotes: 1

Views: 406

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269443

To do this with standard SQL, you can use a correlated subquery. The idea is to get the previous value and only keep rows with the current value is different:

select a.*
from (select t.*
             (select max(id) from t t2 where t2.id < t.id) as pevid
      from t
     ) a left outer join
     t aprev
     on a.previd = aprev.id
where aprev.value <> a.value or aprev.value is null

This is really an implementation of the lag() function, but without window functions.

You can also write this using top/limit/rownum and doing an order by in the subquery:

select a.*
from (select t.*
             (select top 1 id from t t2 where t2.id < t.id order by id desc) as pevid
      from t
     ) a left outer join
     t aprev
     on a.previd = aprev.id
where aprev.value <> a.value or aprev.value is null

This, in turn, can be simplified to remove the last join:

select a.*
from (select t.*
             (select top 1 val from t t2 where t2.id < t.id order by id desc) as pevval
      from t
     )
where a.prevval <> a.value or a.prevval is null

Upvotes: 1

Chris Cameron-Mills
Chris Cameron-Mills

Reputation: 4657

Depending on which RDBMS you are using you could use the LAG and LEAD analytic functions to look at the previous/next row.

SELECT
  a.* 
FROM (
  SELECT
    id
  , value
  , LAG(value) OVER (ORDER BY id) previous_value
  FROM some_table
) a
WHERE a.previous_value IS NULL OR a.value != a.previous_value

Here, the inline view pulls your data out including a column that has the previous rows value (when ordered by id). The WHERE clause in the outer query excludes rows where the value is the same as a previous one (and makes sure to include the first row which obviously has a NULL previous_value).

Upvotes: 2

Related Questions