Reputation: 109
I am trying to join two tables in big query, Table1 contains an ID column, and Table2 contains a column which has the same ID or multiple ID's in the form of a long string separated by commas, like "id123,id456,id678"
I can join the tables together if Table1.ID = Table2.ID but this ignores all the rows where Table1.ID is one of the multiple IDs in Table2.ID. I have looked at similar post that tell me to use wildcards like
on concat('%',Table1.ID,'%') = Table2.ID
but this does not work, because it seems to create a string that contains the '%' character and doesn't actually use it as a wildcard.
I'm using standard sql in BigQuery, any help would be appreciated
Upvotes: 0
Views: 1808
Reputation: 172993
Below example is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table1` AS (
SELECT 123 id, 'a' test UNION ALL
SELECT 456, 'b' UNION ALL
SELECT 678, 'c'
), `project.dataset.table2` AS (
SELECT 'id123,id456' id UNION ALL
SELECT 'id678'
)
SELECT t2.id, test
FROM `project.dataset.table2` t2, UNNEST(SPLIT(id)) id2
JOIN `project.dataset.table1` t1
ON CONCAT('id', CAST(t1.id AS STRING)) = id2
result is as below
Row id test
1 id123,id456 a
2 id123,id456 b
3 id678 c
Upvotes: 2
Reputation: 1269873
It is doubtful that you have values in the table that start and end with percentage signs. =
does not recognize wildcards; like
does:
on Table2.ID like concat('%', Table1.ID, '%')
As a warning. Such a construct is usually a performance killer. You would be better off trying to have columns in Table1
and Table2
that match exactly.
Upvotes: 0