Paul Smith
Paul Smith

Reputation: 466

Postgres create table as from multiple joins

I have a datamining problem with a very large number of attributes (>15,000). I represent the data in multiple tables with a common Id field. Having deleted empty attributes, I want to merge the resulting smaller tables so I can calculate correlations to remove redundent attributes.
I can join the tables easily with:

select * from lrg_bin0 k0
  join lrg_bin1 k1 on k0.id = k1.id
  join lrg_bin2 k2 on k0.id = k2.id
  join lrg_bin3 k3 on k0.id = k3.id

but when I try to create a new table with:

Create table mrg0 as (
  select * from lrg_bin0 k0
    join lrg_bin1 k1 on k0.id = k1.id
    join lrg_bin2 k2 on k0.id = k2.id
    join lrg_bin3 k3 on k0.id = k3.id
) 

Postgress complains with 'column "id" specified more than once'

The very large number of attributes means I am doing everything programatically so answers suggesting queries I can generate would be appreciated.

Upvotes: 2

Views: 8444

Answers (2)

Clodoaldo Neto
Clodoaldo Neto

Reputation: 125424

Use using in instead of on

create table mrg0 as 
select *
from
    lrg_bin0 k0
    join lrg_bin1 k1 using (id)
    join lrg_bin2 k2 using (id)
    join lrg_bin3 k3 using (id)

From the manual:

Also, USING implies that only one of each pair of equivalent columns will be included in the join output, not both

Upvotes: 3

astine
astine

Reputation: 286

The problem is that in SQL when you perform a join using on, the matching column will be included once per table it's found in. What you can do is specify which table you want the column from ala:

Create table mrg0 as (
  select k1.id, ... from lrg_bin0 k0
    join lrg_bin1 k1 on k0.id = k1.id
    join lrg_bin2 k2 on k0.id = k2.id
    join lrg_bin3 k3 on k0.id = k3.id
) 

Upvotes: 2

Related Questions