user330739
user330739

Reputation: 455

SQL Distinct On Sums

I wanted to ask a quick clarification question that I couldn't find anywhere.

Consider the case:

    proc sql;
    create table tableOne as
    select distinct pol_state, year,
    sum(EPTotal) as epa,
    sum(Veh_BItotloss) as bia
    from someTable
    group by pol_state, year;

My question is does the distinct get interpreted for all the sums as well? Another way to state the question: are we selecting the unique combination of pol_state&year&epa&bia?

Thank you for your help!

Upvotes: 1

Views: 130

Answers (2)

Gerrat
Gerrat

Reputation: 29730

Here is a reference for the conceptual order or evaluation of a select statement.

In your case, the rows are grouped by pol_state, and year, then the summation happens, then distinct rows are selected. The distinct isn't doing anything in your case, since every other column is an aggregate.

So, no. You're only getting distinct pol_state and year.

Upvotes: 0

Twelfth
Twelfth

Reputation: 7190

Distinct is somewhat misc on this query...

select field from table group by field  

will return the same as

select distinct field from table.  

If your table has exact duplicates for pol_state, year,EPTotal,Veh_BItotloss and looks like this:

1,1,5,5
1,1,8,4
1,1,8,4

The query you have will return the sum(eptotal) as 21 (5+8+8). If you regard the 3rd line a duplicate of the second line and you want to return the sum 13 (5+8, ignore the second 8) then you want to do the 'distinct' logic in a subquery prior to the sum:

create table tableOne as
select distinct pol_state, year,
sum(EPTotal) as epa,
sum(Veh_BItotloss) as bia
from (select pol_state, year,EPTotal,Veh_BItotloss from table group by pol_state, year,EPTotal,Veh_BItotloss ) a
 group by pol_state, year

Upvotes: 1

Related Questions