Jscore
Jscore

Reputation: 405

Join two tables in Postgresql while keeping non-matching rows

I am trying to join a spatial table with a non-spatial table in Postgresql using the PostGIS extension.

spatial_table:

geom   | attribute1 | Key |
foobar | foobar     |  1  |
foobar | foobar     |  2  |
foobar | foobar     |  3  |
foobar | foobar     |  4  |

non_spatial_table:

attribute2 | attribute3 | Key |
foobar     | foobar     |  1  |
foobar     | foobar     |  4  |

joined_table:

geom   | attribute1 | Key | attribute2 | attribute3 |
foobar | foobar     |  1  | foobar     | foobar     |
foobar | foobar     |  2  | NULL       | NULL       |
foobar | foobar     |  3  | NULL       | NULL       |
foobar | foobar     |  4  | foobar     | foobar     |

By NULL I mean empty.

The following code works:

CREATE TABLE joined_table AS
SELECT *
FROM spatial_table
JOIN non_spatial_table ON spatial_table.title_no = non_spatial_table.title_number;

However all the rows in the spatial_table that are not equal to the non_spatial_table are left out of the resultant table.

resultant table:

geom   | attribute1 | Key | attribute2 | attribute3 |
foobar | foobar     |  1  | foobar     | foobar     |
foobar | foobar     |  4  | foobar     | foobar     | 

I have also tried:

ALTER TABLE spatial_table
    ADD COLUMN title_number varchar,
    ADD COLUMN tenure varchar
UPDATE spatial_table
    SET title_number = non_spatial_table.title_number
FROM spatial_table INNER JOIN non_spatial_table ON spatial_table.title_no = non_spatial_table.title_number

However I get the following error:

ERROR: table name "spatial_table" specified more than once SQL state: 42712

Does anyone know how I can achieve this type of join?

Upvotes: 2

Views: 3022

Answers (2)

Jscore
Jscore

Reputation: 405

I tried Don Seilers solution as seen below:

SELECT a.geom, a.title_no, b.title_number, b.tenure
FROM spatial_table a
LEFT OUTER JOIN non_spatial_table b
ON a.title_no = b.title_number;

It runs, but then runs out of memory. I increased the work_mem and shared buffers in postgresql.conf to no effect. Is there a way of making the above code more efficient, or am I missing something with the postgresql.conf setup?

UPDATE: The above code works, the memory issue turned out to be a completely different issue.

Upvotes: 1

Don Seiler
Don Seiler

Reputation: 470

You want a LEFT OUTER JOIN. I mocked up this example with just varchar fields (the spatial part doesn't matter here), joining in on the key field.

postgres@sandbox=# select * from spatial_table;
 key |  geom   | attribute1
-----+---------+------------
   1 | foobar1 | foobar1
   2 | foobar2 | foobar2
   3 | foobar3 | foobar3
   4 | foobar4 | foobar4
(4 rows)

postgres@sandbox=# select * from non_spatial_table;
 key | attribute2 | attribute3
-----+------------+------------
   1 | foobar12   | foobar13
   4 | foobar42   | foobar43
(2 rows)

postgres@sandbox=# select a.geom, a.attribute1, a.key, b.attribute2, b.attribute3
sandbox-# from spatial_table a
sandbox-# left outer join non_spatial_table b
sandbox-# on a.key=b.key;
  geom   | attribute1 | key | attribute2 | attribute3
---------+------------+-----+------------+------------
 foobar  | foobar1    |   1 | foobar12   | foobar13
 foobar2 | foobar2    |   2 | [NULL]     | [NULL]
 foobar3 | foobar3    |   3 | [NULL]     | [NULL]
 foobar4 | foobar4    |   4 | foobar42   | foobar43
(4 rows)

Just use that last query in your CTAS (maybe a view or materialized view would make more sense).

Upvotes: 2

Related Questions