Jennifer
Jennifer

Reputation: 49

UPDATE Statement ignores INNER JOIN

I expect the SQL statement to update a limited amounts of rows, instead of every row:

UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_1 AS ST1
INNER JOIN Join_Test_2 AS ST2 ON ST1.PKID = ST2.PKID;

I expect the statement to behave the same as the statements:

UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_2
WHERE Join_Test_1.PKID = Join_Test_2.PKID;

UPDATE Join_Test_1
SET ANumber = 3
WHERE EXISTS (SELECT * FROM Join_Test_2 
              WHERE Join_Test_2.PKID = Join_Test_1.PKID);

UPDATE Join_Test_1
SET ANumber = 3
WHERE PKID IN (SELECT PKID FROM Join_Test_2);

The statements using the WHERE clause only update the ANumber column where the PKID matches.

The statement using the INNER JOIN updates all rows in the table.

Why does the INNER JOIN not limit the number of rows updated?

Can the statement with the INNER JOIN be rewritten to use the JOIN to limit the number of rows updated?

/* Expansive Example */
CREATE TABLE Join_Test_1 (PKID SERIAL, ANumber INTEGER);
CREATE TABLE Join_Test_2 (PKID SERIAL, ANumber INTEGER);

INSERT INTO Join_Test_1 (ANumber) VALUES (1), (1);      
INSERT INTO Join_Test_2 (ANumber) VALUES (2);       

UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_1 AS ST1
INNER JOIN Join_Test_2 AS ST2 ON ST1.PKID = ST2.PKID; -- Updates 2

SELECT * 
FROM Join_Test_1 
ORDER BY PKID;

-- 1, 3
-- 2, 3

UPDATE Join_Test_1 
SET ANumber = 1; -- Update 2

UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_2
WHERE Join_Test_1.PKID = Join_Test_2.PKID;

SELECT * 
FROM Join_Test_1 
ORDER BY PKID;

-- 1, 1
-- 2, 3     

UPDATE Join_Test_1 SET ANumber = 1; -- Update 2

UPDATE Join_Test_1
SET ANumber = 3
WHERE PKID IN (SELECT PKID FROM Join_Test_2);

SELECT * 
FROM Join_Test_1 
ORDER BY PKID;

-- 1, 1
-- 2, 3     

DROP TABLE IF EXISTS Join_Test_1;
DROP TABLE IF EXISTS Join_Test_2;

Upvotes: 1

Views: 96

Answers (4)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 658422

Your UPDATE statement is simply wrong. No connection is established between the target table Join_Test_1 and the second instance Join_Test_1 AS ST1. You don't need that second instance of the target table in Postgres to begin with - except for the rare case of emulating a LEFT JOIN. See:

This would work:

UPDATE join_test_1
SET    anumber = 3
FROM   join_test_2 st2 USING (pkid);

I also simplified and did away with unhelpful CaMeL-case spelling. See:

But this is still a dubious statement. It all comes down to bare existence of a matching row in join_test_2. With the join you run the risk of multiple matches, which produces arbitrary results. Does not show in the simplistic example with a constant assignment (unhelpful test!), but should be avoided in any case. Don't update the same row multiple times.

The right query for the purpose is your EXISTS variant, which can be simplified to:

UPDATE join_test_1 t1
SET    anumber = 3
WHERE  EXISTS (SELECT FROM join_test_2 t2 WHERE t2.pkid = t1.pkid);

Upvotes: 1

d r
d r

Reputation: 7846

Arguably, when dealing with updates that involve multiple or complex joins and/or conditions - the most readable and maintainable way to do it is using the MERGE INTO syntax.

MERGE INTO Join_Test_1 target
USING ( Select     t1.pkid 
        From       Join_Test_1 t1
        Inner Join Join_Test_2 t2 ON(t2.pkid = t1.pkid)
      ) src ON(src.pkid = target.pkid)
WHEN MATCHED THEN 
    UPDATE SET anumber = 3;
SELECT * FROM Join_Test_1 ORDER BY PKID;

Result:

pkid anumber
1 3
2 1

fiddle

Just make sure that your USING query fetches unique rows to be matched...

Upvotes: 1

ArtBindu
ArtBindu

Reputation: 2016

In plpgSQL there is an way to update a single table during joing as follows:

CREATE TABLE Join_Test_1 (PKID SERIAL,ANumber INTEGER);
CREATE TABLE Join_Test_2 (PKID SERIAL,ANumber INTEGER);

INSERT INTO Join_Test_1 (ANumber) VALUES (1),(1);       
INSERT INTO Join_Test_2 (ANumber) VALUES (2); 

select * from Join_Test_1;
-- 1    1
-- 2    1

select * from Join_Test_2;
-- 1    2

WITH TT AS (
    UPDATE Join_Test_1
    SET ANumber = 3
    RETURNING *
)
SELECT *
FROM TT          AS ST1
INNER JOIN 
     Join_Test_2 AS ST2
ON   ST1.PKID = ST2.PKID;
-- 1    3   1   2

SELECT * FROM Join_Test_1 ORDER BY PKID;
-- 1    3
-- 2    3

Process-01: db<>fiddle - Update During joining

Process-02: db<>fiddle - Update During joining

Process-03: db<>fiddle - Normal way

Upvotes: 1

Eric
Eric

Reputation: 3257

You alias the table Join_Test_1, but you don't call the alias in the update. That means the FROM statement is ignored. You statement is the same as UPDATE Join_Test_1 SET ANumber = 3.

UPDATE  ST1
SET     ANumber = 3
FROM    Join_Test_1 AS ST1
INNER JOIN  Join_Test_2 AS ST2
ON          ST1.PKID = ST2.PKID;

Upvotes: 0

Related Questions