Aman
Aman

Reputation: 353

Remove consecutive duplicate rows in Postgresql

I have a table having 3 columns (id, amount, time). The table is arranged in ascending order based on id. I want to remove all consecutive rows that have same value in the amount column except the first one.

eg:

id: 12,15,16,17,19,22
amount: 555,546,546,53,53,500
time:

Answer:

id:12,15,17,22    amount: 555,546,53,500    time:

The time column is same for consecutive rows whenever the consecutive rows have same values in the amount column. I searched on SO but could not find a similar question

Upvotes: 1

Views: 1687

Answers (2)

Boy
Boy

Reputation: 26

This might not be an effective method but it works faster if you have large table (>2 million rows).

  1. Copy the table as csv format with DISTINCT on amount column:

    \COPY (SELECT DISTINCT ON (amount) * FROM Tablename) to '/home/user/table.csv' with csv delimiter ',';

  2. Truncate the previous table :

    TRUNCATE TABLE Tablename;

  3. Copy back the dumped table :

    \COPY Tablename from '/home/user/table.csv' with csv delimiter ',';

I've tried deleting duplicates but it took me a day for the query to complete. This method serves me well.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270483

You can use lag():

select t.*
from (select t.*, lag(amount) over (order by id) as prev_amount
      from t
     ) t
where prev_amount is distinct from amount;

Upvotes: 6

Related Questions