Dhirendra Gautam
Dhirendra Gautam

Reputation: 857

BigQuery Equivalent of Merge Statement

I am performing migration from teradata to Big query. I have encountered a merge statement having VALUES in USING clause.

MERGE INTO department DL
                        USING VALUES
                        (
                        2,'ABC'
                        ) AS V 
                        (Run_Id, Country) 
                          ON DL.department_id = V.Run_Id
                        WHEN MATCHED THEN
                          UPDATE SET 
                            department_description = V.country
                        WHEN NOT MATCHED THEN
                          INSERT
                          (
                          V.Run_Id
                          , V.Country
                          curr
                          ); 

Can anyone help me out in finding what would be its BigQuery equivalent.

Upvotes: 4

Views: 11830

Answers (1)

Alexandre Moraes
Alexandre Moraes

Reputation: 4042

The MERGE statement is used when you want to updated a target table using a source table with one or more primary keys(PK).

According to the documentation, the differences between Teradata's and BigQuery's MERGE are:

Teradata's MERGE operation is limited to matching primary keys within one access module processor (AMP). In contrast, BigQuery has no size or column limitation for MERGE operations, therefore using MERGE is a useful optimization. However, if the MERGE is primarily a large delete, see optimizations for DELETE elsewhere in this document.

DML scripts in BigQuery have slightly different consistency semantics than equivalent statements in Teradata. For example, Teradata's SET tables in session mode might ignore duplicates during a MERGE operation. For an overview on handling MULTISET and SET tables, snapshot isolation, and session and transaction handling, see the CREATE INDEX section elsewhere in this document.

In your case, it seems you are using the PK as DL.department_id and V.Run_Id. Although, within your syntax inside the USING clause you should specify the targeted table not only its fields. Below is the syntax, link:

MERGE target_name [[AS] alias]
USING source_name
ON merge_condition
#WHEN MATCHED 
#WHEN NOT MATCHED

Therefore, in your case, the syntax will be:

MERGE dataset.department DL
USING (SELECT * FROM `project_id.dataset.source_table`) V
ON DL.department_id = V.Run_Id
WHEN MATCHED THEN
UPDATE SET DL.department_description = V.country
WHEN NOT MATCHED
#first specify the name of the columns in your then the values to insert
INSERT(colum1, column2, column3) VALUES(V.Run_Id, V.Country, V.curr)

Notice that within the INSERT clause you first specify the columns that data will be added then inside VALUES the values to be inserted, you can write the values explicitly or name the columns from your source_table with the data to be added. I would like to point that, I considered curr as a column from your source table. Also, you did not stated your source table, only some of its fields.

For further clarification, below is another example

MERGE `dataset.target_table` T
USING (SELECT "New value" as value, "1" as ID) S
ON T.ID = S.ID
WHEN MATCHED THEN
UPDATE SET T.value_column = S.value
WHEN NOT MATCHED THEN
INSERT(value_column, id) VALUES("Value added", s.ID)

Again, pay attention to the INSERT clause, first the columns from the target table are described then the values which will be inserted in the table WHEN NOT MATCHED.

Upvotes: 5

Related Questions