Reputation: 15680
I'm trying to construct a query in which I update table_a
with data contained in both table_b
and table_c
.
All the UPDATE ... FROM
docs and questions I've found seem to be focused on updating table_a
with data in table_c
by joining through table_b
. I don't want that, I need to use data contained in both table_b
and table_c
, which are both left joined to table_a
An example select would be:
SELECT
a.id
, b.counted as count_b
, c.counted as count_c
FROM
table_a a
LEFT JOIN
table_b b on a.id = b.id
LEFT JOIN
table_c c on a.id = c.id
;
id | count_b | count_c
----+---------+---------
34 | 6 | 2
43 | 16 | 2
45 | 33 | 5
Working 'one table'-updates are:
UPDATE table_a SET count_b = table_b.counted
FROM table_b WHERE table_a.id = table_b.id;
UPDATE table_a SET count_c = table_c.counted
FROM table_c WHERE table_a.id = table_c.id;
However I'd like to combine them for efficiency (update is an insert+delete, so there's a bit of savings to be had. But I can't seem to figure out how to join two tables onto table_a
in an UPDATE ... FROM
.
I could handle this with a subquery - but I'd prefer avoiding that.
Upvotes: 1
Views: 75
Reputation: 656586
To shorten my longish answer I work with table names a
, b
, c
.
If a.id
, b.id
and c.id
are UNIQUE
(but can be missing):
Basic form with a number of caveats (keep reading!):
UPDATE a
SET count_b = b.counted
, count_c = c.counted
FROM b
FULL JOIN c USING (id)
WHERE a.id = COALESCE(b.id, c.id)
;
Wouldn't work properly if any of the id
columns are not UNIQUE
- but neither would separate updates.
If nothing actually changes (all counted
columns identical to what's in a
already) you would just create a dead row for full cost. If that can be the case and you want to avoid it, append an additional WHERE
clause:
AND (a.count_b <> b.counted OR
a.count_c <> c.counted) -- simple form with all columns NOT NULL
FULL [OUTER] JOIN
makes it work if some, but not all tables (one in the simple case of two source tables) does not have a row for the same id
. It effectively emulates multiple LEFT JOIN
you had in mind, with a small but important difference: You get phantom NULL
values for columns from tables where the row is missing. Catch that with COALESCE
:
SET count_b = COALESCE(b.counted, a.count_b)
There is still a corner case if counted
can be NULL
. COALESCE
cannot distinguish between two cases:
1. b.counted IS NULL
.
2. b.counted
does not exist.
If you need the distinction, use a CASE
statement instead:
SET count_b = CASE WHEN b.id IS NULL THEN a.count_b ELSE b.counted END
Adapt WHERE
clauses accordingly.
The complete version, covering all NULL
cases and avoiding empty updates:
UPDATE a
SET count_b = CASE WHEN b.id IS NULL THEN a.count_b ELSE b.counted END
, count_c = CASE WHEN c.id IS NULL THEN a.count_b ELSE c.counted END
FROM b
FULL JOIN c USING (id)
WHERE a.id = COALESCE(b.id, c.id)
AND (b.id IS NOT NULL AND b.counted IS DISTINCT FROM a.count_b OR
c.id IS NOT NULL AND c.counted IS DISTINCT FROM a.count_c)
;
Looks complex, but should be faster if there is substantial overlap between the source tables.
If there is no matching id
in any of the source tables, the according row in a
is not updated, which is typically what you want. You could use correlated subqueries instead to update all rows, no matter what - which is typically what you don't want, but your case may differ. Correlated subqueries are also typically much more expensive for big tables.
UPDATE a
SET count_b = (SELECT b.counted FROM b WHERE b.id = a.id)
, count_c = (SELECT c.counted FROM c WHERE c.id = a.id)
Again, you can avoid the NULL
values with:
SET count_b = COALESCE ((SELECT b.counted ...), count_b)
Would still waste empty updates which you are trying to minimize. You could add a WHERE
clause with EXISTS
for that ...
Upvotes: 1
Reputation: 77866
Not the exact query but more like a example. you can try like below, by getting the counted
column from both table_b
and table_b
in a inner query and joining with it to table_a
in update.
UPDATE table_a
SET count_b = tableb_counted,
count_c = tablec_counted
FROM (
SELECT table_b.counted as tableb_counted,
table_b.id,
table_c.counted as tablec_counted
FROM table_b
JOIN table_c
ON table_b.id = table_c.id
) tab
WHERE table_a.id = tab.id
Upvotes: 0