Reputation: 185
Is there a simple way to delete a STRUCT from the nested and repeated field in the BigQuery (BQ table column Type: RECORD, Mode: REPEATED).
Let's say I have the following tables:
name toy.id toy.priority
Alice 1 high
2 medium
3 low
Kazik 3 high
1 medium
id name available
1 car 0
2 doll 1
3 bike 1
I'd like to DELETE from wishlist toys that are not available (toys.available==0). In this case, it's toy.id==1.
As a result, the wishlist would look like this:
name toy.id toy.priority
Alice 2 medium
3 low
Kazik 3 high
I know how to select it:
WITH `project.dataset.wishlist` AS
(
SELECT 'Alice' name, [STRUCT<id INT64, priority STRING>(1, 'high'), (2, 'medium'), (3, 'low')] toy UNION ALL
SELECT 'Kazik' name, [STRUCT<id INT64, priority STRING>(3, 'high'), (1, 'medium')]
), toys AS (
SELECT 1 id, 'car' name, 0 available UNION ALL
SELECT 2 id, 'doll' name, 1 available UNION ALL
SELECT 3 id, 'bike' name, 1 available
)
SELECT wl.name, ARRAY_AGG(STRUCT(unnested_toy.id, unnested_toy.priority)) as toy
FROM `project.dataset.wishlist` wl, UNNEST (toy) as unnested_toy
LEFT JOIN toys t ON unnested_toy.id=t.id
WHERE t.available != 0
GROUP BY name
But I don't know how to remove structs <toy.id, toy.priority> from wishlist when toys.available==0.
There are very similar questions like How to delete/update nested data in bigquery or How to Delete rows from Structure in bigquery but the answers are either unclear to me in terms of deletion or suggest copying the whole wishlist to the new table using the selection statement. My 'wishlist' is huge and 'toys.availabililty' changes often. Copying it seems to me very inefficient.
Could you please suggest a solution aligned with BQ best practices?
Thank you!
Upvotes: 1
Views: 1847
Reputation: 172974
... since row Deletion was implemented in BQ, I thought that STRUCT deletion inside a row is also possible.
You can use UPDATE DML for this (not DELETE as it is used for deletion of whole row(s), while UPDATE can be used to modify the row)
update `project.dataset.wishlist` wl
set toy = ((
select array_agg(struct(unnested_toy.id, unnested_toy.priority))
from unnest(toy) as unnested_toy
left join `project.dataset.toys` t on unnested_toy.id=t.id
where t.available != 0
))
where true;
Upvotes: 2
Reputation: 1269553
You can UNNEST()
and reaggregate:
SELECT wl.name,
(SELECT ARRAY_AGG(t)
FROM UNNEST(wl.toy) t JOIN
toys
ON toys.id = t.id
WHERE toys.available <> 0
) as available_toys
FROM `project.dataset.wishlist` wl;
Upvotes: 0