Hugo
Hugo

Reputation: 2444

Is there a way to filter rows in BigQuery by the contents of an array?

I have data in a BigQuery table that looks like this:

[
    { "id": 1, "labels": [{"key": "a", "value": 1}, {"key": "b", "value": 2}] },
    { "id": 2, "labels": [{"key": "a", "value": 1}, {"key": "b", "value": 3}] },
    // a lot more rows
]

My question is, how can I find all rows where "key" = "a", "value" = 1, but also "key" = "b" and "value" = 3?

I've tried various forms of using UNNEST but I haven't been able to get it right. The CROSS JOIN leaves me with one row for every object in the labels array, leaving me unable to query by both of them.

Upvotes: 4

Views: 1612

Answers (3)

Vibhor Gupta
Vibhor Gupta

Reputation: 699

You can try parse JSON and then you can apply different filter conditions on it as per your requirements, in following query I have tried to identified which all records have Key=a then I tried to identify which record has value=30 then joined them through id :-

WITH data1 AS (
SELECT '{ "id": 1, "labels": [{"key": "a", "value": 11}, {"key": "b", "value": 22}] }' as c1
union all
SELECT '{ "id": 2, "labels": [{"key": "a", "value": 10}, {"key": "b", "value": 30}] }' AS C1
) 
select T1.id, T1.C1, T1.key, T2.value from 
(SELECT JSON_EXTRACT_SCALAR(c1 , "$.id") AS id, 
json_extract_scalar(curSection, '$.value') as value, json_extract_scalar(curSection, '$.key') as key,
        c1 
 FROM data1 tbl LEFT JOIN unnest(json_extract_array(tbl.c1, '$.labels') 
             ) curSection 
where json_extract_scalar(curSection, '$.key')='a') T1,
(
SELECT JSON_EXTRACT_SCALAR(c1 , "$.id") AS id, 
json_extract_scalar(curSection, '$.value') as value, json_extract_scalar(curSection, '$.key') as key,
        c1 
 FROM data1 tbl LEFT JOIN unnest(json_extract_array(tbl.c1, '$.labels') 
             ) curSection 
where json_extract_scalar(curSection, '$.value')='30') T2
where T1.id = T2.id

Hopefully it may work for you.

Upvotes: 0

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172944

Assuming there is no duplicate entries in labels array - you can use below

select *
from `project.dataset.table` t
where 2 = (
  select count(1) 
  from t.labels kv 
  where kv in (('a', 1), ('b', 3))
)

Upvotes: 0

Sergey Geron
Sergey Geron

Reputation: 10152

Try this:

select *
from mytable
where exists (select 1 from unnest(labels) where key = "a" and value=1)
  and exists (select 1 from unnest(labels) where key = "b" and value=3)

Upvotes: 1

Related Questions