nexx
nexx

Reputation: 39

BigQuery SQL join on multiple columns, but if both are unmatched, it should add a 0 row

I am trying to do a join like the following:

Given table 1:

+-----------+------+-------+
| date      | name | count |
+-----------+------+-------+
| 1.1.2020. | A    | 5     |
+-----------+------+-------+
| 1.1.2020. | B    | 2     |
+-----------+------+-------+
| 1.1.2020. | C    | 3     |
+-----------+------+-------+

and table 2:

+-----------+------+-------+
| date      | name | count |
+-----------+------+-------+
| 1.1.2020. | B    | 1     |
+-----------+------+-------+
| 1.1.2020. | C    | 4     |
+-----------+------+-------+
| 1.1.2020. | D    | 6     |
+-----------+------+-------+

I want to join them by both date and name, but if name isn't in one of the tables, the final table should still have it, with count 0.

The resulting table would be

+-----------+------+---------+---------+
| date      | name | count_1 | count_2 |
+-----------+------+---------+---------+
| 1.1.2020. | A    | 5       | 0       |
+-----------+------+---------+---------+
| 1.1.2020. | B    | 2       | 1       |
+-----------+------+---------+---------+
| 1.1.2020. | C    | 3       | 4       |
+-----------+------+---------+---------+
| 1.1.2020. | D    | 0       | 6       |
+-----------+------+---------+---------+

Whichever join type I try, it only matches only those rows for which name is in both tables, so I would only get these two rows here

+-----------+------+---------+---------+
| date      | name | count_1 | count_2 |
+-----------+------+---------+---------+
| 1.1.2020. | B    | 2       | 1       |
+-----------+------+---------+---------+
| 1.1.2020. | C    | 3       | 4       |
+-----------+------+---------+---------+

This is one of the queries I tried, but also left join, right join, etc.

select * from 
    table_1 full outer join table_2 using(date, name)

Thanks!

Upvotes: 0

Views: 2246

Answers (3)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173046

Consider also yet another approach - less verbose with use of PIVOT operator

select *
from (
  select *, 'count_1' col from table_1 union all 
  select *, 'count_2' from table_2 
)
pivot (sum(count) for col in ('count_1', 'count_2')) 

if applied to sample data in your question - output is

enter image description here

Upvotes: 1

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173046

Consider below approach

select date, name, 
  sum(if(col = 'count_1', count, 0)) count_1,
  sum(if(col = 'count_2', count, 0)) count_2
from (
  select *, 'count_1' col from table_1 union all 
  select *, 'count_2' from table_2 
)
group by date, name        

if applied to sample data in your question - output is

enter image description here

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1270401

You query should work, but I would phrase it as:

select date, name, coalesce(t1.count, 0) as count_1,
       coalesce(t2.count, 0) as count_2
from table_1 t1 full outer join
     table_2 t2
     using (date, name);

This returns the additional counts as 0 rather than NULL.

Upvotes: 0

Related Questions