Reputation: 1273
I think I'm close to a solution but not quite there yet. I have two tables that I want to join on a given ID if it exists, but to defer to a different column as the join condition if that ID doesn't exist.
Something like:
T1:
session, ID, path
1001, 1, homepage
1001, NULL, about
T2:
ID, path, type
1, homepage, A
2, about, Z
With the desired result of:
session, ID, path, type
1001, 1, homepage, A
1001, 2, about, Z
I've tried this with:
select * from t1 inner join t2
on (t1.id = t2.id) or (t1.path = t2.path)
But this produces some unexpected duplicates. I want to do something like
select * from t1 inner join t2
on coalesce(t1.id, t1.path) = t2.id
But this won't work right since the path won't map to the ID in the other table.
Any thoughts or suggestions?
Upvotes: 0
Views: 4129
Reputation: 3905
The best I can come up with is:
select
t1.session,
coalesce(t1.ID, t2.ID) as ID,
t2.path,
t2.type
from
t1
inner join t2 on
(t1.ID = t2.ID) or
(t1.ID is null and t1.path = t2.path)
But this might still give undesired duplicates. I cannot judge your data design, but if the result gives duplicates, the problem might be found in problematic (duplicate and/or ambiguous) data within your tables t1
and/or t2
.
It also immediately striked me as an issue that field ID
clearly can be NULL in table t1
. From my point of view, that would mean that the relation between tables t1
and t2
could be considered "optional". If not, it would be better to define field ID
in table t1
as a required value.
Upvotes: 0
Reputation: 172964
Below example is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table1` AS (
SELECT 1001 session, 1 id, 'homepage' path UNION ALL
SELECT 1001, NULL, 'about'
), `project.dataset.table2` AS (
SELECT 1 id, 'homepage' path, 'A' type UNION ALL
SELECT 2, 'about', 'Z'
)
SELECT
session,
IFNULL(t1.id, t2.id) id,
IFNULL(t1.path, t2.path) path,
type
FROM `project.dataset.table1` t1
JOIN `project.dataset.table2` t2
ON (t1.id = t2.id) OR (t1.path = t2.path)
with output
Row session id path type
1 1001 1 homepage A
2 1001 2 about Z
I realized you might want to join based on path
column ONLY if one or both id
is/are NULL. In this case your ON clause should be as below
ON t1.id = t2.id
OR (
(t1.id IS NULL OR t2.id IS NULL)
AND t1.path = t2.path
)
Upvotes: 0
Reputation: 1269503
You want a default value. Here is one method that uses two left join
s:
select t2.*, coalesce(t1.path, t1d.path) as path
from t2 left join
t1
on t1.id = t2.id left join
t1 t1d
on t1d.id is null;
Upvotes: 1