Reputation: 45
I'd like to execute the following Standard SQL using "WHERE xxx IN ('aaa','bbb')".
But error has occurred. Is it possible to use 'IN' in STANDARD SQL?
Standard SQL
SELECT commit, author.name, committer.name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM `bigquery-public-data.github_repos.commits`
WHERE repo_name IN
('tensorflow/tensorflow', 'facebook/react')
Error
No matching signature for operator IN for argument types ARRAY<STRING> and {STRING} at [5:17]
The following Legacy SQL seems OK to execute.
Legacy SQL
SELECT commit, author.name, committer.name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM [bigquery-public-data:github_repos.commits]
WHERE repo_name IN
('tensorflow/tensorflow', 'facebook/react')
Upvotes: 2
Views: 1051
Reputation: 172993
Below is for BigQuery Standard SQL
In case if you need to preserve original records and just output those with 'tensorflow/tensorflow', 'facebook/react' in repo_name:
#standardSQL
SELECT commit, author.name AS author_name, committer.name AS committer_name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM `bigquery-public-data.github_repos.commits`
WHERE EXISTS (
SELECT 1 FROM UNNEST(repo_name) name WHERE name IN ('tensorflow/tensorflow', 'facebook/react')
)
Upvotes: 2
Reputation: 420
Field 'repo_name' is REPEATED (Legacy SQL) or ARRAY (Standard SQL) which IN operator doesn't support for Standard SQL. To convert an array into set of rows, you my use UNNEST operator.
Upvotes: 0
Reputation: 508
You can't really do like that in Standard SQL. But in this article How to use the UNNEST function in BigQuery to analyze event parameters in Analytics they explain it very well.
This is how you get the result you are looking for I think:
SELECT commit, author.name as author_name, committer.name as committer_name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM `bigquery-public-data.github_repos.commits`
CROSS JOIN UNNEST(repo_name) as repo_names_unnested
WHERE repo_names_unnested IN ('tensorflow/tensorflow', 'facebook/react')
Note that you can't have both author.name
and committer.name
, since both will then be displayed as name
. Therefore I changed them to author_name
and committer_name
.
I also think what you're actually is looking for is the result of replacing repo_name
with repo_names_unnested
, so try out replacing that as well in the SELECT-clause.
Upvotes: 3