Yukio Takagi
Yukio Takagi

Reputation: 45

No matching signature for operator IN for argument types ARRAY<STRING>

I'd like to execute the following Standard SQL using "WHERE xxx IN ('aaa','bbb')".

But error has occurred. Is it possible to use 'IN' in STANDARD SQL?

Standard SQL

SELECT commit, author.name, committer.name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM `bigquery-public-data.github_repos.commits`
WHERE repo_name IN 
('tensorflow/tensorflow', 'facebook/react')

Error

No matching signature for operator IN for argument types ARRAY<STRING> and {STRING} at [5:17]

The following Legacy SQL seems OK to execute.

Legacy SQL

SELECT commit, author.name, committer.name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM [bigquery-public-data:github_repos.commits] 
WHERE repo_name IN 
('tensorflow/tensorflow', 'facebook/react')

Upvotes: 2

Views: 1051

Answers (3)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Below is for BigQuery Standard SQL

In case if you need to preserve original records and just output those with 'tensorflow/tensorflow', 'facebook/react' in repo_name:

#standardSQL
SELECT commit, author.name AS author_name, committer.name AS committer_name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM `bigquery-public-data.github_repos.commits`
WHERE EXISTS (
  SELECT 1 FROM UNNEST(repo_name) name WHERE name IN ('tensorflow/tensorflow', 'facebook/react') 
)

Upvotes: 2

Ronnaver
Ronnaver

Reputation: 420

Field 'repo_name' is REPEATED (Legacy SQL) or ARRAY (Standard SQL) which IN operator doesn't support for Standard SQL. To convert an array into set of rows, you my use UNNEST operator.

Upvotes: 0

Andreas Rol&#233;n
Andreas Rol&#233;n

Reputation: 508

You can't really do like that in Standard SQL. But in this article How to use the UNNEST function in BigQuery to analyze event parameters in Analytics they explain it very well.

This is how you get the result you are looking for I think:

SELECT commit, author.name as author_name, committer.name as committer_name, committer.time_sec,
committer.tz_offset, committer.date.seconds , subject, message,
repo_name
FROM `bigquery-public-data.github_repos.commits`
CROSS JOIN UNNEST(repo_name) as repo_names_unnested
WHERE repo_names_unnested IN ('tensorflow/tensorflow', 'facebook/react')

Note that you can't have both author.name and committer.name, since both will then be displayed as name. Therefore I changed them to author_name and committer_name.

I also think what you're actually is looking for is the result of replacing repo_name with repo_names_unnested, so try out replacing that as well in the SELECT-clause.

Upvotes: 3

Related Questions