Malyk
Malyk

Reputation: 193

Dynamic conditional joins in BigQuery

I'm trying to write perhaps a dynamic conditional statement in BigQuery to dynamically join tables based on certain conditions. It's possible to have 1 or many conditions for a single ID. How do I dynamically 'filter' a series of ID conditions together (using Table 1, 2 and 3) to get to a set of masterProductIds in to my Final table?

Table 1 - identification_table

Table 1 - identification_table

Table 2 - filing_table

Table 2 - filing_table

Table 3 - Price

Table 3 - Price

In this situation we have 3 conditions to be evaluated for Id 1, 2 conditions for Id 2 and 1 condition for Id 3.

In a situation where we have a price condition as well, it should join to the price table and filter depending on the operator and value in table 1. As for product condition, I don't have to join on any tables, I just take the value as is in table 1.

What I'm expecting as my final result set.

Final Table

Final Table

What I've done as of now:

select masterProductId, row_number() over (partition by id ) sq
from `filing_table` p 
left join `identification_table` pc
on case when subject = 'brand' then p.brandName when subject='category' then categoryName end = pc.boundaryValue
--on p.brandName = pc.boundaryValue or p.categoryName = boundaryValue
left join `price` pp
on p.code = pp.code
where 1=1
and pc.code = 'Id 2'
--and pp.RRP < safe_cast(pc1.boundaryValue as float64)

This would evaluate everything as a whole. I can't figure how to evaluate the IDs in a set by set basis.

Upvotes: 1

Views: 1744

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172964

Below is for BigQuery Standard SQL and assumes tables to be set as below (with agreement from the OP in the comments to question)

identification_table

  SELECT 'Id 1' id, 'masterProductId' subject, '=' operator, '1007' value UNION ALL
  SELECT 'Id 1', 'brandName', '=', 'brand p' UNION ALL
  SELECT 'Id 1', 'categoryName', '=', 'category 1' UNION ALL
  SELECT 'Id 2', 'categoryName', '=', 'category 1' UNION ALL
  SELECT 'Id 2', 'price', '<', '130'  UNION ALL
  SELECT 'Id 3', 'categoryName', '=', 'category 3' 

filing_table

  SELECT 11 code, 'category 1' categoryName, 'brand p' brandName, 1001 masterProductId UNION ALL
  SELECT 22, 'category 1', 'brand z', 1002 UNION ALL
  SELECT 33, 'category 2', 'brand c', 1003 UNION ALL
  SELECT 44, 'category 2', 'brand v', 1004 UNION ALL
  SELECT 55, 'category 3', 'brand e', 1005

price

  SELECT 11 code, 3 price UNION ALL
  SELECT 22, 100 UNION ALL
  SELECT 33, 8 UNION ALL
  SELECT 44, 9 UNION ALL
  SELECT 77, 28 

So, below extracts from filing_table those masterProductId's which qualify based on all criteria from identification_table

EXECUTE IMMEDIATE '''
SELECT masterProductId
FROM (
  SELECT f.*, price
  FROM `filing_table` f
  LEFT JOIN `price` p
  USING(code)
)
WHERE ''' || (  
  SELECT STRING_AGG('(' || condition || ')', ' OR ')
  FROM (
    SELECT STRING_AGG(FORMAT('(%s %s %s)', subject, operator, value), ' AND ') condition
    FROM `identification_table`,
    UNNEST([IF(subject IN ('price', 'masterProductId'), value, '"' || value || '"')]) value
    GROUP BY id
  ));

If to apply to sample data as in top of the answer - output is

Row masterProductId  
1   1001     
2   1002     
3   1005     

Upvotes: 1

Related Questions