AI52487963
AI52487963

Reputation: 1273

Joining on a coalesce

I think I'm close to a solution but not quite there yet. I have two tables that I want to join on a given ID if it exists, but to defer to a different column as the join condition if that ID doesn't exist.

Something like:

T1:
session, ID, path
1001, 1, homepage
1001, NULL, about


T2:
ID, path, type
1, homepage, A
2, about, Z

With the desired result of:

session, ID, path, type
1001, 1, homepage, A
1001, 2, about, Z

I've tried this with:

select * from t1 inner join t2 
on (t1.id = t2.id) or (t1.path = t2.path)

But this produces some unexpected duplicates. I want to do something like

select * from t1 inner join t2
on coalesce(t1.id, t1.path) = t2.id

But this won't work right since the path won't map to the ID in the other table.

Any thoughts or suggestions?

Upvotes: 0

Views: 4129

Answers (3)

Bart Hofland
Bart Hofland

Reputation: 3905

The best I can come up with is:

select
    t1.session,
    coalesce(t1.ID, t2.ID) as ID,
    t2.path,
    t2.type
from
    t1
    inner join t2 on
        (t1.ID = t2.ID) or
        (t1.ID is null and t1.path = t2.path)

But this might still give undesired duplicates. I cannot judge your data design, but if the result gives duplicates, the problem might be found in problematic (duplicate and/or ambiguous) data within your tables t1 and/or t2.

It also immediately striked me as an issue that field ID clearly can be NULL in table t1. From my point of view, that would mean that the relation between tables t1 and t2 could be considered "optional". If not, it would be better to define field ID in table t1 as a required value.

Upvotes: 0

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172964

Below example is for BigQuery Standard SQL

#standardSQL
WITH `project.dataset.table1` AS (
  SELECT 1001 session, 1 id, 'homepage' path UNION ALL
  SELECT 1001, NULL, 'about'
), `project.dataset.table2` AS (
  SELECT 1 id, 'homepage' path, 'A' type UNION ALL
  SELECT 2, 'about', 'Z' 
)
SELECT 
  session,
  IFNULL(t1.id, t2.id) id, 
  IFNULL(t1.path, t2.path) path, 
  type
FROM `project.dataset.table1` t1
JOIN `project.dataset.table2` t2
ON (t1.id = t2.id) OR (t1.path = t2.path)

with output

Row session id  path        type     
1   1001    1   homepage    A    
2   1001    2   about       Z      

I realized you might want to join based on path column ONLY if one or both id is/are NULL. In this case your ON clause should be as below

ON t1.id = t2.id 
OR (
  (t1.id IS NULL OR t2.id IS NULL) 
  AND t1.path = t2.path
)

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269503

You want a default value. Here is one method that uses two left joins:

select t2.*, coalesce(t1.path, t1d.path) as path
from t2 left join
     t1 
     on t1.id = t2.id left join
     t1 t1d
     on t1d.id is null;

Upvotes: 1

Related Questions