Reputation: 79
I have an issue where some SAS code using PROC SQL isn't working as there is a case statement around one of my calculations.
The code looks like the below:
PROC SQL;
create table example as
select
a.var1,
a.var2,
a.var3,
case when a.NumericVariable is not missing then
Max(0,Sum(Sum(a.Var4), -Sum(a.Var5)))) end as Calculation
from table1 a inner join table2 b
on a.primarykey=b.primarykey
group by var1,var2,var3;
quit;
This code returns duplicated rows for the number of rows in table A. Removing the case statement as below:
PROC SQL;
create table example as
select
a.var1,
a.var2,
a.var3,
Max(0,Sum(Sum(a.Var4), -Sum(a.Var5)))) as Calculation
from table1 a inner join table2 b
on a.primarykey=b.primarykey
group by var1,var2,var3;
quit;
This just returns the group by using the levels of var1,var2 and var3.
How do I adjust the upper code to remove the duplicates?
Upvotes: 0
Views: 993
Reputation: 63424
This can be solved by performing the case
statement inside the summary. This only works if it's possible to do that, though - you might have to rework your query to make it possible.
If you can't put the case
statement inside the summarization, and the case
statement is using a variable that isn't in the group by (and can't be put in the group by), then it's not possible to do what you want here - you have to change something else, maybe preprocess the data.
One example is, if you want to only include a non-zero value if var6
is not missing, but you're okay with zero in that case, then this works:
PROC SQL;
create table example as
select
a.var1,
a.var2,
a.var3,
max(0,sum(sum(case when a.var6 is not missing then a.var4 end),-(sum(case when a.var6 is not missing then a.var5 end)))) as Calculation
from table1 a inner join table2 b
on a.primarykey=b.primarykey
group by var1,var2,var3;
quit;
Here's an example that works using sashelp.class
:
proc sql;
select max(0,sum(sum(case when age > 13 then height else 0 end),
sum(case when age > 13 then weight else 0 end))
)
from sashelp.class
group by sex;
quit;
Upvotes: 0
Reputation: 1269973
You are not using any aggregation functions, so how about just using select distinct
:
create table example as
select distinct a.var1, a.var2, a.var3,
(case when a.NumericVariable is not missing then Calculation
end) as Calculation
from table1 a inner join
table2 b
on a.primarykey = b.primarykey;
Upvotes: 1