trillion
trillion

Reputation: 1401

Comparison of Columns between two tables with different filters

I have the following code that I am using to compare two columns from different tables, the code almost runs perfectly if I don't have any where clauses/filters in my code below.

If I add the where clause I do get extra rows that I don't want to see.

with source1 as (
select 
b.id, 
b.qty, 
a.price 
from <table> as a
,unnest <details> as b
 where b.status != 'canceled'
),

source2 as (
select id_, qty_, price_  from <table2>
where city != 'delhi'
) 

select *
from source1 s1
full outer join source2 s2
on id = id_
where format('%t', s1) != format('%t', s2)

Below is the sample data:

s1:

id  qty   price   status
1   100 (null)    canceled
2   0     100       done
3   0      80       canceled
4   50     90       done
5   20    100       done
6   20    100       done
7   80     80       done
8   100   100       canceled
9   40     0        done
10  11     22       done
11  40     40       done
12  null   90       done

s2:

id_ qty_    price_  city_
1   100     200     ny
2   0       100     ny
3   0        80     ny
4   50       80     ny
5   40      100     ny
6   40       40     ny
7   200     200  delhi
8   100     100  delhi
9   40      100     ny
10  11       22  delhi
12  11       11     ny
13  90       80     NY

Expected Results:

id  qty    price    status       id_    qty_    price_  city_
4   50        90    done          4       50    80      ny
5   20       100    done          5       40    100     ny
6   20       100    done          6       40    40      ny
9   40         0    done          9       40    100     ny
11  40        40    done       null     null    null    null
12  null      90    done         12       11    11      ny
null null   null    null         13       90    80      ny

Current issues:

where status != 'canceled' -- *this will exclude all the rows with cases = canceled from source 1 but my source2 will still show the rows where the status is actually canceled, and will bring that row up which I don't want

similarly, source2 has a condition: where city != 'delhi' and source1 not, this will again show rows which i don't want

If I pass the column status and city in the select statements of my codes above (code 1 and code 2) it will be passed in the condition: where format('%t', s1) != format('%t', s2) and so each time there will be a mismatch since city column doesn't exist in source1 and status one in the source2. The string/serial generated will not be able to match for example:

s1:

id, qty, price, status
1   10   100   cancelled

s2:

id_ qty_ price_  city
1    10   100    Delhi

where format('%t', s1) != format('%t', s2) would generate:

(1,10,100,cancelled) != (1,10,100,delhi)

in this case, the columns have the same values (qty, price, status) but the row will still show up which I don't want due to the issue explained above.

Questions:

Upvotes: 0

Views: 167

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172994

Is there a way I can pass specific columns to the part format('%t',s2), instead of passing the entire table name, that should solve the problem ? If I can somehow not let the status and city be part of serialization

just use below

where format('%t', (id, qty, price)) != format('%t', (id_, qty_, price_))

Upvotes: 1

Related Questions