David Masip
David Masip

Reputation: 2491

Filter elements from array in bigquery

I have a bigquery table with the following structure:

select ["apple", "of", "the", "tree"] as array_col, 1 as label
union all (select ["boy", "of", "the", "streets"] as array_col, 2 as label);

I would like, via a query, to obtain a table without certain elements in the arrays. For instance, I want to filter the elements of the array_col array that are either of or the, obtaining the following table:

select ["apple", "tree"] as array_col, 1 as label
union all (select ["boy", "streets"] as array_col, 2 as label);

Is there an easy way to do this in biquery?

Thanks!

Upvotes: 0

Views: 1998

Answers (2)

Timogavk
Timogavk

Reputation: 869

You can filter it with REGEXP. It may help to filter multiple array columns or huge tables

WITH arrays as (
    SELECT ["apple", "of", "the", "tree"] array_col, 1  label
    UNION ALL (SELECT ["boy", "of", "the", "streets"] array_col, 2 label)
)
SELECT JSON_VALUE_ARRAY(REGEXP_REPLACE(TO_JSON_STRING(array_col), r'"(of|the)",?',''), '$') array_col, label  
FROM arrays

Upvotes: 2

David Masip
David Masip

Reputation: 2491

From the docs:

with arrays as (
    select ["apple", "of", "the", "tree"] as array_col, 1 as label
    union all (select ["boy", "of", "the", "streets"] as array_col, 2 as label)
)
select
  array(
      select x
      from unnest(array_col) AS x
      where x not in ('of', 'the')
  ) as array_filter,
  label
from arrays;

Upvotes: 0

Related Questions