Jaango Jayaraj
Jaango Jayaraj

Reputation: 155

Difference in query and error even though distinct values

It is Bigquery and below are the sample tables and contents

dde-demo-d001.sap_crm.transactions_bkup

case_guid transaction_header_guid
005056935CD81EEF92DF522476D53CAB 00505693-5CD8-1EEF-92DF-8D9FF1D25CAB
005056935CD81EEF92DF522476D53CAB 00505693-801A-1EEF-92DF-6A7A34E0FCAB
005056935CD81EEF92DF522476D53CAB 00505693-5CD8-1EEF-92DF-4D67C93F5CAB
005056935CD81EEF92DF522476D53CAB 00505693-5CD8-1EEF-92DF-6665227E9CAB
005056935CD81EEF92DF522476D53CAB 00505693-5CD8-1EEF-92DF-6DA0046D7CAB

dde-demo-d001.sap_crm.documents_bkup

transaction_header_guid case_guid
00505693-5CD8-1EEF-92DF-6665227E9CAB null
00505693-5CD8-1EEF-92DF-8D9FF1D25CAB null
00505693-5CD8-1EEF-92DF-6DA0046D7CAB null
00505693-801A-1EEF-92DF-6A7A34E0FCAB null
00505693-5CD8-1EEF-92DF-4D67C93F5CAB null

query with inner join

UPDATE `dde-demo-d001.sap_crm.documents_bkup`
    SET case_guid = ICT.case_guid
    FROM `dde-demo-d001.sap_crm.documents_bkup` DFR
    INNER JOIN `dde-demo-d001.sap_crm.transactions_bkup` ICT
    ON DFR.transaction_header_guid = ICT.transaction_header_guid
    WHERE DFR.case_guid IS NULL;

query with subquery

UPDATE `dde-demo-d001.sap_crm.documents_bkup` DFR
    SET case_guid = ICT.case_guid
    FROM (SELECT transaction_header_guid,case_guid FROM `dde-demo-d001.sap_crm.transactions_bkup`) ICT
    WHERE (DFR.case_guid IS NULL) and (DFR.transaction_header_guid = ICT.transaction_header_guid);

query with inner join fails

UPDATE/MERGE must match at most one source row for each target row

query with subquery passed updating 5 rows.

I don't understand why inner join fails as the column on which the join is happening has distinct values?

Also the subquery also same logic and how it passes?

Upvotes: 0

Views: 127

Answers (2)

Jaango Jayaraj
Jaango Jayaraj

Reputation: 155

Using the inner join there are two joins. During the second join we don't specify a column to join on. It combines all rows for each row. Behind the scenes update is a join.

So specify the column clause for the second join:

UPDATE `dde-demo-d001.sap_crm.documents_bkup_copy` trg
    SET case_guid = ICT.case_guid
    FROM `dde-demo-d001.sap_crm.documents_bkup_copy` DFR
    INNER JOIN `dde-demo-d001.sap_crm.transactions_bkup` ICT
    ON DFR.transaction_header_guid = ICT.transaction_header_guid
    WHERE DFR.case_guid IS NULL
    AND trg.transaction_header_guid=ICT.transaction_header_guid;

using innerjoin there are two joins

using subquery

Upvotes: 1

PUTHINEEDI RAMOJI RAO
PUTHINEEDI RAMOJI RAO

Reputation: 363

The reason for the problem UPDATE/MERGE must match at most one source row for each target row is that your INNER JOIN in the first query tries to update rows in documents_bkup when there are several matching rows in transactions_bkup depending on transaction_header_guid. Because an INNER JOIN returns all matching rows from both tables, the update statement becomes unclear which case_guid to use if a transaction_header_guid from documents_bkup appears more than once in transactions_bkup with various case_guid values. Due this ambiguity, bigquery throws an error. Refer to this documentation for more information.

Using aggregate functions in the JOIN clause to choose a single case_guid for each transaction_header_guid may be necessary to use the INNER JOIN successfully (e.g., MIN(case_guid), MAX(case_guid)) And also refer this stack link1, link2.

Upvotes: 1

Related Questions