Insert into table unique rows by postgresql

Question

I have some code stats in greenplum table A

| id  | file   | repo | lang | line |
-------------------------------------
| a   | /a.txt | r1   | txt  | 3    |
| a   | /b.c   | r1   | c    | 5    |
| b   | /x.java| r1   | java | 33   |
| c   | /f.cpp | r2   | c++  | 23   |
| a   | /a.txt | r3   | txt  | 3    |
| a   | /b.c   | r3   | c    | 5    |

but the last two rows code indicate this code is come form repo r1, because the commit id is same with first two rows. I want to remove the duplicate rows, and insert result to table B:

| id  | file   | repo | lang | line |
-------------------------------------
| a   | /a.txt | r1   | txt  | 3    |
| a   | /b.c   | r1   | c    | 5    |
| b   | /x.java| r1   | java | 33   |
| c   | /f.cpp | r2   | c++  | 23   |

the row can be distinct by: id + file + repo

Thanks in advance.

Salman Arshad · Accepted Answer

You can use NOT EXISTS to check that a duplicate does not exist:

SELECT *
FROM t
WHERE NOT EXISTS (
    SELECT 1
    FROM t AS x
    WHERE x.id   = t.id
    AND   x.file = t.file
    AND   x.repo < t.repo
)

SQL Fiddle

Insert into table unique rows by postgresql

Answers (2)

Related Questions