Memphis Meng
Memphis Meng

Reputation: 1681

Join two tables on several columns which are split from a string column

I have two tables, the first one looks like:

    col1  col2
1   data1 v1|v2|v3
2   data2 v2|v4
3   data3 NULL
4   data4 v5

And the second one looks like:

    col_k  col_v
1   k1     v1
2   k2     v2
3   k3     v3
4   k4     v4
5   k5     v5

Now I need to merge them based on ANY(table1.col2)=table2.col_v and it needs to complete on Google Bigquery. What is the workaround? I do not have a certain idea of what it will look like, it could be in one column i.e. k1|k2|k3 or in multiple columns which only contains one key. Any idea will be welcome!

Upvotes: 0

Views: 205

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173190

Below is for BigQuery Standard SQL

#standardSQL
select col1, 
  any_value(col2) col2, 
  string_agg(col_k, '|' order by col_k) col3
from `project.dataset.table1`
left join unnest(split(col2, '|')) col_v 
left join `project.dataset.table2`
using(col_v)
group by col1    

if to apply above to sample data from your question - output is

enter image description here

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1271061

I would unnest the string for the join:

select t1.*, t2.*
from table1 t1 cross join
     unnest(split(t1.col2, '|')) col join
     table2 t2
     on t2.col_v = col

Upvotes: 1

Related Questions