Reputation: 455
I wanted to ask a quick clarification question that I couldn't find anywhere.
Consider the case:
proc sql;
create table tableOne as
select distinct pol_state, year,
sum(EPTotal) as epa,
sum(Veh_BItotloss) as bia
from someTable
group by pol_state, year;
My question is does the distinct get interpreted for all the sums as well? Another way to state the question: are we selecting the unique combination of pol_state&year&epa&bia?
Thank you for your help!
Upvotes: 1
Views: 130
Reputation: 29730
Here is a reference for the conceptual order or evaluation of a select statement.
In your case, the rows are grouped by pol_state, and year, then the summation happens, then distinct rows are selected. The distinct isn't doing anything in your case, since every other column is an aggregate.
So, no. You're only getting distinct pol_state and year.
Upvotes: 0
Reputation: 7190
Distinct is somewhat misc on this query...
select field from table group by field
will return the same as
select distinct field from table.
If your table has exact duplicates for pol_state, year,EPTotal,Veh_BItotloss and looks like this:
1,1,5,5
1,1,8,4
1,1,8,4
The query you have will return the sum(eptotal) as 21 (5+8+8). If you regard the 3rd line a duplicate of the second line and you want to return the sum 13 (5+8, ignore the second 8) then you want to do the 'distinct' logic in a subquery prior to the sum:
create table tableOne as
select distinct pol_state, year,
sum(EPTotal) as epa,
sum(Veh_BItotloss) as bia
from (select pol_state, year,EPTotal,Veh_BItotloss from table group by pol_state, year,EPTotal,Veh_BItotloss ) a
group by pol_state, year
Upvotes: 1