flyingL123
flyingL123

Reputation: 8096

BigQuery MERGE query not working with empty REPEATED field

I am trying to run a MERGE query on a table with the following schema:

[
   {
      "mode":"REQUIRED",
      "name":"container_ref",
      "type":"STRING"
   },
   {
      "mode":"REQUIRED",
      "name":"receive_date",
      "type":"DATE"
   },
   {
      "mode":"REQUIRED",
      "name":"vendor_id",
      "type":"INTEGER"
   },
   {
      "fields":[
         {
            "mode":"NULLABLE",
            "name":"code",
            "type":"STRING"
         },
         {
            "mode":"NULLABLE",
            "name":"quantity",
            "type":"INTEGER"
         }
      ],
      "mode":"REPEATED",
      "name":"products",
      "type":"RECORD"
   },
   {
      "mode":"REQUIRED",
      "name":"max_fill_ratio",
      "type":"FLOAT"
   }
]

I'm trying to run the following query:

MERGE `project.dataset.table` T
USING (
  WITH container_data AS (
    SELECT
        'yongkang_210222' AS container_ref
      , DATE('2021-02-22') AS receive_date
      , 51695 AS vendor_id
      , 0.5 AS max_fill_ratio
      , [] AS products
  )
  SELECT * FROM container_data
) S
ON T.container_ref = S.container_ref
WHEN MATCHED THEN
  UPDATE SET T.products = S.products, T.max_fill_ratio = S.max_fill_ratio
WHEN NOT MATCHED THEN
  INSERT (container_ref, receive_date, vendor_id, products, max_fill_ratio)
  VALUES(container_ref, receive_date, vendor_id, products, max_fill_ratio)
WHEN NOT MATCHED BY SOURCE AND receive_date = '2021-02-22' THEN
  DELETE

As you can see, in the source data, products is an empty array. I receive the following error from the query:

Value of type ARRAY cannot be assigned to T.products, which has type ARRAY<STRUCT<code STRING, quantity INT64>>

When the same query is run with products not set to an empty array, it works correctly. For example:

MERGE `project.dataset.table` T
USING (
  WITH container_data AS (
    SELECT
        'yongkang_210222' AS container_ref
      , DATE('2021-02-22') AS receive_date
      , 51695 AS vendor_id
      , 0.5 AS max_fill_ratio
      , [STRUCT('01757' AS code, 10 AS quantity),STRUCT('03831' AS code, 20 AS quantity)] AS products
  )
  SELECT * FROM container_data
) S
ON T.container_ref = S.container_ref
WHEN MATCHED THEN
  UPDATE SET T.products = S.products, T.max_fill_ratio = S.max_fill_ratio
WHEN NOT MATCHED THEN
  INSERT (container_ref, receive_date, vendor_id, products, max_fill_ratio)
  VALUES(container_ref, receive_date, vendor_id, products, max_fill_ratio)
WHEN NOT MATCHED BY SOURCE AND receive_date = '2021-02-22' THEN
  DELETE

I can not figure out why the empty array in the MERGE query is causing this error. I have tried inserting and updating the table with an empty array for products, and it works fine. For example, this query works correctly:

INSERT INTO `project.dataset.table`
(container_ref, receive_date, vendor_id, max_fill_ratio, products)
VALUES (
  'yongkang_210222'
  , DATE('2021-02-22')
  , 51695
  , 0.5
  , []
)

This is starting to feel like a BigQuery bug. Does anybody have any idea why this is happening?

Upvotes: 0

Views: 475

Answers (1)

flyingL123
flyingL123

Reputation: 8096

I don't know why, but adding ARRAY<STRUCT<code STRING, quantity INT64>> before the array, has fixed the issue:

MERGE `project.dataset.table` T
USING (
  WITH container_data AS (
    SELECT
        'yongkang_210222' AS container_ref
      , DATE('2021-02-22') AS receive_date
      , 51695 AS vendor_id
      , 0.5 AS max_fill_ratio
      , ARRAY<STRUCT<code STRING, quantity INT64>> [] AS products
  )
  SELECT * FROM container_data
) S
ON T.container_ref = S.container_ref
WHEN MATCHED THEN
  UPDATE SET T.products = S.products, T.max_fill_ratio = S.max_fill_ratio
WHEN NOT MATCHED THEN
  INSERT (container_ref, receive_date, vendor_id, products, max_fill_ratio)
  VALUES(container_ref, receive_date, vendor_id, products, max_fill_ratio)
WHEN NOT MATCHED BY SOURCE AND receive_date = '2021-02-22' THEN
  DELETE

Now it works correctly for both empty and non-empty arrays. I would love to understand why this is necessary if anybody can offer an explanation.

Upvotes: 2

Related Questions