jamee
jamee

Reputation: 573

Insert into table unique rows by postgresql

I have some code stats in greenplum table A

| id  | file   | repo | lang | line |
-------------------------------------
| a   | /a.txt | r1   | txt  | 3    |
| a   | /b.c   | r1   | c    | 5    |
| b   | /x.java| r1   | java | 33   |
| c   | /f.cpp | r2   | c++  | 23   |
| a   | /a.txt | r3   | txt  | 3    |
| a   | /b.c   | r3   | c    | 5    |

but the last two rows code indicate this code is come form repo r1, because the commit id is same with first two rows. I want to remove the duplicate rows, and insert result to table B:

| id  | file   | repo | lang | line |
-------------------------------------
| a   | /a.txt | r1   | txt  | 3    |
| a   | /b.c   | r1   | c    | 5    |
| b   | /x.java| r1   | java | 33   |
| c   | /f.cpp | r2   | c++  | 23   |

the row can be distinct by: id + file + repo

Thanks in advance.

Upvotes: 0

Views: 46

Answers (2)

Salman Arshad
Salman Arshad

Reputation: 272226

You can use NOT EXISTS to check that a duplicate does not exist:

SELECT *
FROM t
WHERE NOT EXISTS (
    SELECT 1
    FROM t AS x
    WHERE x.id   = t.id
    AND   x.file = t.file
    AND   x.repo < t.repo
)

SQL Fiddle

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270201

Aggregation would seem to do what you want:

select id, file, min(repo) as repo, lang, line
from t
group by id, file, lang, line;

Upvotes: 0

Related Questions