quack
quack

Reputation: 383

How to delete rows from table where some columns (but not all) are duplicate in postgresql?

I read previous questions but they are not similar to mine (example this has no unique identifier while I have) .

say this are my tables rows:

  id     string       x    y    z    time_x                 
10318    'hello'   33700  5053  8  2015-07-03 12:39:49.61408
14071    'hello'   33700  5053  8  2015-07-03 12:39:49.61408

basically everything is the same except the id of the row. How can I identify these cases and delete the redundant rows?

Note: not all rows in table are duplicated. Regular rows should be left as is.

Upvotes: 1

Views: 68

Answers (2)

hruske
hruske

Reputation: 2253

This query uses window function to find duplicated rows. It leaves the row with smallest id:

delete from foo_table where id in (
    select dupid from (
        SELECT lead(id) over (partition by string, x, y, z, time_x order by id) dupid
        from foo_table
    ) A where dupid is not null 
);

Upvotes: 0

dabal
dabal

Reputation: 420

Let say that table name is tbl

Delete from tbl where id not in (select min(id) from tbl group by string, x, y, z)

Upvotes: 5

Related Questions