Pierre-Alexandre
Pierre-Alexandre

Reputation: 775

BigQuery - Get extra rows between 2 identical tables

I have two table with the exact same structure but one table has extra lines and I would like to figure out what are those lines. This is what I tried to query but seems the minus operator does not work on BQ. What should I do?

SELECT *
FROM `table1`
WHERE (date = '2021-01-31')
MINUS
SELECT *
FROM `table2`
WHERE (date = '2021-01-31')

example of result (around 15 columns). I just want to get the lines that are present in table1 but not in table2.

campaign date conversions ....
a 2021-01-31 8374474
b 2021-01-31 948383
c 2021-01-31 3948484

Upvotes: 1

Views: 581

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173046

You can use EXCEPT DISTINCT to get exactly this - see example below

select * from _temp_b
except distinct 
select * from _temp_a

If to apply it to sample data from Daniel's answer - output is obviously the same but with literally zero coding

enter image description here

Upvotes: 2

Daniel Zagales
Daniel Zagales

Reputation: 3034

Try the below. This would really depend on you being able to define a unique row identifier (or a hash of the row) and join the two tables together on that column(s) where the table with fewer rows returns a null value.

with _temp_a as (
    select 'a' campaign, '2021-01-31' date, 8374474 conversions union all
    select 'b' campaign, '2021-01-31' date, 948383 conversions
),
_temp_b as (
    select 'a' campaign, '2021-01-31' date, 8374474 conversions union all
    select 'b' campaign, '2021-01-31' date, 948383 conversions union all
    select 'c' campaign, '2021-01-31' date, 3948484 conversions
)

select b.*
from _temp_b b 
left join _temp_a a
 on b.campaign =a.campaign 
 and b.date = a.date 
where a.campaign is null

Upvotes: 2

Related Questions