Reputation: 389
For a table such as this:
tblA
A,B,C
1,2,t3a
1,3,d4g
1,2,b5e
1,3,s6u
I want to produce a table that selects distinct on both A and B simultaneously, and still keep one value of C, like so:
tblB
A,B,C
1,2,t3a
1,3,d4g
Seems like this would be simple, but not finding it for the life of me.
DROP TABLE IF EXISTS tblA CASCADE;
SELECT DISTINCT ON (A,B), C
INTO tblB
FROM tblA;
Upvotes: 9
Views: 24460
Reputation: 169
According to postgresql documentation, you need to specify the names, inside and outside of prantenses.
SELECT DISTINCT ON (A,B) A,B,C
INTO tblB
FROM tblA;
The image below is another example, with fake data from my project that demonstrates that the intended syntax works properly.
Upvotes: 0
Reputation: 1279
drop table t;
create table t (a int, b int, c int);
insert into t (a, b, c) values(1, 2, 3);
insert into t (a, b, c) values(1, 3, 4);
insert into t (a, b, c) values(1, 2, 5);
insert into t (a, b, c) values(1, 3, 6);
select distinct on (a,b) a, b, c
from t;
From the docs: "SELECT DISTINCT eliminates duplicate rows from the result. SELECT DISTINCT ON eliminates rows that match on all the specified expressions."
Upvotes: 0
Reputation: 8640
This should do the trick
CREATE TABLE tblB AS (
SELECT A, B, max(C) AS max_of_C FROM tblA GROUP BY A, B
)
Upvotes: 4
Reputation: 12214
Use a view to do the distinct and then join it to the original table to pick one row of column C. Inserting into the target is left for you to figure out. Oh, and you could pick up multiple columns from t, not just c - the only thing is that your subquery needs to find a way to limit it to only one row.
create table t (a int, b int, c int);
create view tv as select distinct a, b from t;
insert into t (a, b, c) values(1, 2, 10);
insert into t (a, b, c) values(1, 2, 20);
insert into t (a, b, c) values(1, 3, 30);
insert into t (a, b, c) values(1, 3, 40);
CREATE TABLE tblB AS (
select tv.a, tv.b, t.c from tv, t
where tv.a = t.a and tv.b = t.b
/* pick smallest ctid which is a unique row id built into postgres */
and t.ctid = (select min(ctid) from t s where s.a = t.a and s.b = t.b);
)
Upvotes: 0
Reputation: 1271141
When you use DISTINCT ON
you should have ORDER BY
:
SELECT DISTINCT ON (A,B), C
INTO tblB
FROM tblA
ORDER BY A, B;
Upvotes: 7