Lukiz
Lukiz

Reputation: 185

How to delete a STRUCT from an ARRAY in the NESTED field

Is there a simple way to delete a STRUCT from the nested and repeated field in the BigQuery (BQ table column Type: RECORD, Mode: REPEATED).

Let's say I have the following tables:

  1. wishlist
name    toy.id  toy.priority
Alice   1       high
        2       medium
        3       low
Kazik   3       high
        1       medium
  1. toys
id  name    available
1   car     0
2   doll    1
3   bike    1

I'd like to DELETE from wishlist toys that are not available (toys.available==0). In this case, it's toy.id==1.

As a result, the wishlist would look like this:

name    toy.id  toy.priority
Alice   2       medium
        3       low
Kazik   3       high

I know how to select it:

WITH `project.dataset.wishlist` AS
(
  SELECT 'Alice' name, [STRUCT<id INT64, priority STRING>(1, 'high'), (2, 'medium'), (3, 'low')] toy UNION ALL
  SELECT 'Kazik' name, [STRUCT<id INT64, priority STRING>(3, 'high'), (1, 'medium')]
), toys AS (
  SELECT 1 id, 'car' name, 0 available UNION ALL
  SELECT 2 id, 'doll' name, 1 available UNION ALL
  SELECT 3 id, 'bike' name, 1 available
)
SELECT wl.name, ARRAY_AGG(STRUCT(unnested_toy.id, unnested_toy.priority)) as toy
FROM `project.dataset.wishlist` wl, UNNEST (toy) as unnested_toy
LEFT JOIN toys t ON unnested_toy.id=t.id
WHERE t.available != 0
GROUP BY name

But I don't know how to remove structs <toy.id, toy.priority> from wishlist when toys.available==0.

There are very similar questions like How to delete/update nested data in bigquery or How to Delete rows from Structure in bigquery but the answers are either unclear to me in terms of deletion or suggest copying the whole wishlist to the new table using the selection statement. My 'wishlist' is huge and 'toys.availabililty' changes often. Copying it seems to me very inefficient.

Could you please suggest a solution aligned with BQ best practices?

Thank you!

Upvotes: 1

Views: 1847

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

... since row Deletion was implemented in BQ, I thought that STRUCT deletion inside a row is also possible.

You can use UPDATE DML for this (not DELETE as it is used for deletion of whole row(s), while UPDATE can be used to modify the row)

update `project.dataset.wishlist` wl
set toy = ((
  select array_agg(struct(unnested_toy.id, unnested_toy.priority)) 
  from unnest(toy) as unnested_toy
  left join `project.dataset.toys` t on unnested_toy.id=t.id
  where t.available != 0
))
where true;   

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1269553

You can UNNEST() and reaggregate:

SELECT wl.name, 
       (SELECT ARRAY_AGG(t)
        FROM UNNEST(wl.toy) t JOIN
             toys
             ON toys.id = t.id
        WHERE toys.available <> 0
       ) as available_toys
FROM `project.dataset.wishlist` wl;

Upvotes: 0

Related Questions