Hal
Hal

Reputation: 333

join if string contains substring and return the rest as new row in bigquery

Hi I would like to join table by substring in string and return not join as new row. Is that possible? Here's an example

ID price
07a 50
1b7 60
7TC 40
productCode details
newhair_07a detailA
black_1b7_chair detailB
blueBed detailC

into a table where it matches the ID from table ID with the product table's productCode. If substring is in string then join the tables together and return no match into a new row.

Output will be like so

ID productCode details price
07a newhair_07a detailA 50
1b7 black_1b7_chair detailB 60
blueBed detailC
7TC 40

I don't know where to begin to join this. Please do help

Upvotes: 1

Views: 1883

Answers (2)

Jaytiger
Jaytiger

Reputation: 12254

It looks like a FULL OUTER JOIN problem. Assuming that your productCode is a underbar separated value which includes ID, below approach would be possible:

WITH joined AS (
  SELECT * EXCEPT (code)
    FROM tableB b JOIN UNNEST(SPLIT(productCode, '_')) code 
    FULL JOIN tableA a ON a.ID = code
)
SELECT * FROM joined WHERE productCode IS NULL
 UNION ALL
SELECT * FROM joined WHERE productCode IS NOT NULL
QUALIFY ROW_NUMBER() OVER (PARTITION BY productCode, details ORDER BY ID NULLS LAST) = 1;

output:

enter image description here

sample tables:

CREATE TEMP TABLE tableA AS (
  SELECT '07a' ID, 50 price
  UNION ALL
  SELECT '1b7', 60
  UNION ALL
  SELECT '7TC', 40
);

CREATE TEMP TABLE tableB AS (
  SELECT 'newhair_07a' productCode, 'detailA' details
  UNION ALL
  SELECT 'black_1b7_chair', 'detailB'
  UNION ALL
  SELECT 'blueBed', 'detailC'
);

Upvotes: 0

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Consider below approach

with temp as (
  select id, productCode, details, price 
  from tableID
  join tableProduct
  on regexp_contains(productCode, id)
)
select * from temp 
union all
select id, null, null, price 
from tableID where not id in (select id from temp)
union all
select null, productCode, details, null  
from tableProduct where not productCode in (select productCode from temp)    

if applied to sample data in your question - output is

enter image description here

Upvotes: 1

Related Questions