Reputation: 2491
I have a bigquery table with the following structure:
select ["apple", "of", "the", "tree"] as array_col, 1 as label
union all (select ["boy", "of", "the", "streets"] as array_col, 2 as label);
I would like, via a query, to obtain a table without certain elements in the arrays. For instance, I want to filter the elements of the array_col
array that are either of
or the
, obtaining the following table:
select ["apple", "tree"] as array_col, 1 as label
union all (select ["boy", "streets"] as array_col, 2 as label);
Is there an easy way to do this in biquery?
Thanks!
Upvotes: 0
Views: 1998
Reputation: 869
You can filter it with REGEXP
. It may help to filter multiple array columns or huge tables
WITH arrays as (
SELECT ["apple", "of", "the", "tree"] array_col, 1 label
UNION ALL (SELECT ["boy", "of", "the", "streets"] array_col, 2 label)
)
SELECT JSON_VALUE_ARRAY(REGEXP_REPLACE(TO_JSON_STRING(array_col), r'"(of|the)",?',''), '$') array_col, label
FROM arrays
Upvotes: 2
Reputation: 2491
From the docs:
with arrays as (
select ["apple", "of", "the", "tree"] as array_col, 1 as label
union all (select ["boy", "of", "the", "streets"] as array_col, 2 as label)
)
select
array(
select x
from unnest(array_col) AS x
where x not in ('of', 'the')
) as array_filter,
label
from arrays;
Upvotes: 0