Marlon Abeykoon
Marlon Abeykoon

Reputation: 12465

UPSERT BigQuery

I have a Bigqury table with 40 columns. I want to upsert data to it using PHP code. I have the data to upsert in an array of array as follows.

[
  0 => array:2 [
    "id" => 1
    "name" => "updated name 1 "
  ]
  1 => array:2 [
    "id" => 3
    "name" => "new name to insert"
  ]
]

I found in this post, there exists a MERGE command. I tried something like this.

MERGE dataset.table c
USING (SELECT * FROM UNNEST([(1, 'updated name 1'),
                             (3, 'new name to insert')])) d
ON c.id = d.id
WHEN NOT MATCHED THEN
  INSERT ROW
WHEN MATCHED THEN
  UPDATE SET c.name = d.name;

This fails, how can I populate data dynamically to d? The array may have few needed column data.

I referred this as well.

Upvotes: 2

Views: 6214

Answers (1)

Yun Zhang
Yun Zhang

Reputation: 5503

You can do UPSERT with partial columns (as long as these "ignored" columns are NULLABLE, which is default) for sure, see example below using temp table to demonstrate the idea:

CREATE TEMP TABLE c AS 
  SELECT 1 id, "old name" name, NULL col1, NULL col2, NULL col3, NULL col4;

SELECT * FROM c; -- output before

MERGE c
USING UNNEST([struct<id INT64, name STRING>
              (1, 'updated name 1'),
              (3, 'new name to insert')]) d
ON c.id = d.id
WHEN NOT MATCHED THEN
  INSERT (id, name) VALUES (d.id, d.name)
WHEN MATCHED THEN
  UPDATE SET c.name = d.name;

SELECT * FROM c; -- output after

Output before MERGE

+----+----------+------+------+------+------+
| id |   name   | col1 | col2 | col3 | col4 |
+----+----------+------+------+------+------+
|  1 | old name | NULL | NULL | NULL | NULL |
+----+----------+------+------+------+------+

Output after MERGE

+----+--------------------+------+------+------+------+
| id |        name        | col1 | col2 | col3 | col4 |
+----+--------------------+------+------+------+------+
|  3 | new name to insert | NULL | NULL | NULL | NULL |
|  1 | updated name 1     | NULL | NULL | NULL | NULL |
+----+--------------------+------+------+------+------+

Caveat

Your query is subject to length limit. So don't UPSERT too many rows in one query.

Maximum unresolved standard SQL query length — 1 MB

I also suggest you to use query parameter to pass your data instead of putting them in query.

Scalability

If daily update volume exceeds, say 200MB, likely breaking up the data into too many MERGE makes it closer to hit DML daily limit.

Maximum number of combined INSERT, UPDATE, DELETE, and MERGE statements per day per table — 1,000

You should consider other way to do UPSERT. One of the ways is to upload a csv/json/parquet file to GCS and load to a BigQuery table, then do a similar MERGE statement from the table.

Upvotes: 2

Related Questions