Reputation: 156
i have a bq table i record product information. each row has a nested column called categories
and each product might have more than one category. how can i select products that are not in the category i specified
i tried following code but i still get all the product in the table, i think because each products have other categories than 'Unwanted Category'. how should i rewrite my select ?
SELECT * FROM xx.products,
unnest(categories) as categories
WHERE
categories.name not in ('Unwanted Category')
LIMIT 1000
Upvotes: 1
Views: 555
Reputation: 172974
Below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.products` AS (
SELECT 1 product_id, ['Unwanted Category', 'category A'] categories UNION ALL
SELECT 2, ['category A', 'category B'] UNION ALL
SELECT 3, ['category C']
)
SELECT *
FROM `project.dataset.products`
WHERE NOT 'Unwanted Category' IN UNNEST(categories)
with result
Row product_id categories
1 2 category A
category B
2 3 category C
I just realized, you might have slightly different schema, thus modified approach - as in below example
#standardSQL
WITH `project.dataset.products` AS (
SELECT 1 product_id, [STRUCT<id INT64, name STRING>(1, 'Unwanted Category'), (2, 'category A')] categories UNION ALL
SELECT 2, [STRUCT<id INT64, name STRING>(2, 'category A'), (3, 'category B')] UNION ALL
SELECT 3, [STRUCT<id INT64, name STRING>(4, 'category C')]
)
SELECT *
FROM `project.dataset.products`
WHERE NOT 'Unwanted Category' IN (SELECT name FROM UNNEST(categories))
with output
Row product_id categories.id categories.name
1 2 2 category A
3 category B
2 3 4 category C
Upvotes: 2