Torben Gundtofte-Bruun
Torben Gundtofte-Bruun

Reputation: 2120

SQL EXCEPT across several columns

I'm trying to compare two almost identical select queries; both queries output the same columns, from the same table. Only the conditions are different on a single attribute (where foo = 'BAR' vs. where foo <> 'BAR'). The results are huge and might have only a few subtle differences. My current approach is to execute both queries and then put the results into Excel and use formulas to compare the two. Tedious and poor - there must be a better way.

How can I build a query that performs each select separately and then shows me only those lines that don't exist on both sides?

Here is a simplified version of my queries:

-- first query
select a.foo, a.bar, a.moo from abc.mytable a where a.somedetail = 'BAR'
group by a.foo, a.bar, a.moo
order by a.foo, a.bar, a.moo;

-- second query
select a.foo, a.bar, a.moo from abc.mytable a where a.somedetail <> 'BAR'
group by a.foo, a.bar, a.moo
order by a.foo, a.bar, a.moo;

Looking at this answer, it looks like I need either the EXCEPT operator (but in both directions, is that possible?), or either a UNION or a FULL OUTER JOIN but for those I don't really know how to set the WHERE clauses to match all the columns in the two sets yes having differences in the somedetail column.

This only needs to be executed once per database, but performance is very much a consideration because the data covers many millions of rows.

Upvotes: 1

Views: 94

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269445

You don't need group by, so you can write this as:

select a.foo, a.bar, a.moo
from abc.mytable a
where a.somedetail = 'BAR'
except
select a.foo, a.bar, a.moo
from abc.mytable a
where a.somedetail <> 'BAR';

Given that you are scanning all the data, you might find that group by works better:

select a.foo, a.bar, a.moo
from abc.mytable a
group by a.foo, a.bar, a.moo
having sum(case when a.somedetail = 'BAR' then 1 else 0 end) > 0 and
       sum(case when a.somedetail <> 'BAR' then 1 else 0 end) > 0;

If you have an index on (foo, bar, moo), Oracle should be smart about using it on this query.

Upvotes: 1

Related Questions