Blake
Blake

Reputation: 25

Bigquery coalesce join when key is null

Table 1

+---------+-----------+--------+
| user_id | email     | action |
+---------+-----------+--------+
| 1       | [email protected] | open   |
+---------+-----------+--------+
| 2       | null      | click  |
+---------+-----------+--------+
| 3       | [email protected] | click  |
+---------+-----------+--------+
| 4       | [email protected] | open   |
+---------+-----------+--------+

Table 2

+---------+-----------+--------+
| user_id | email     | event  |
+---------+-----------+--------+
| 1       | [email protected] | sent   |
+---------+-----------+--------+
| null    | [email protected] | none   |
+---------+-----------+--------+
| 2       | [email protected] | sent   |
+---------+-----------+--------+
| 4       | [email protected] | sent   |
+---------+-----------+--------+

I want to join based on t1.user_id = t2.user_id, but when there is a null for the key, join on t1.email = t2.email

I tried multiple ways to JOIN in bigquery as:
1.)ON COALESCE(t1.user_id,t1.email) = COALESCE(t2.user_id, t2.email)
2.)ON on case when t2.user_id is not null then t1.user_id = t2.user_id else t1.email = t2.email end

Neither work. How can this be done?

Upvotes: 1

Views: 3487

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

I would split such join to two separate:
First - join by user_id

SELECT *
FROM table1 AS t1
JOIN table2 AS t2
ON t1.user_id = t2.user_id

Second - join by email for those ids missed in first join

SELECT * 
FROM (
  SELECT * FROM table1
  WHERE user_id NOT IN (
    SELECT t1.user_id
    FROM table1 AS t1
    JOIN table2 AS t2
    ON t1.user_id = t2.user_id
  )
) t1
JOIN (
  SELECT * FROM table2
  WHERE user_id NOT IN (
    SELECT t1.user_id
    FROM table1 AS t1
    JOIN table2 AS t2
    ON t1.user_id = t2.user_id
  )
) t2
ON t1.email = t2.email

Upvotes: 1

Related Questions