Reputation: 1764
I am fairly inexperienced with SQL, but am working to try to condense my code into one query so that it is more efficient. Below is a simplified example of a much more complex problem I have. I am having problems with the syntax of creating the summary groups and variables. In my case, the data are housed in several different table, but the joins are not a problem for me so I have only created one table here.
This is the data I have:
Name Class Wk Score ExCred X
Joe A 1 35 ? 3
Hal A 1 50 5 4
Sal A 1 45 ? 3
Kim B 1 30 5 6
Cal B 1 40 ? 6
Joe A 2 50 ? 2
Hal A 2 40 ? 3
Sal A 2 40 ? 4
Kim B 2 40 5 5
Cal B 2 40 ? 4
The table I am trying to create will look like this:
Class Wk Avg_Score Sum_X
A 1 45 10
B 1 37.5 12
A 2 43.3 9
B 2 42.5 9
So, the data are summarized by class and week. The avg_score is the average of the sum and 'score' and 'ExCred' for each student. Sum_X is simply the sum of X for each class.
I have had success with this in SAS SQL by using multiple proc means statements, but this is clunky and seems to take a really long time. There has to be a more elegant way to do this. I know it probably involves the group by statement..... Help?
Thanks. Pyll
Upvotes: 1
Views: 8872
Reputation: 63424
I see no particular reason not to use proc means
here. It should be significantly faster than proc sql
on datasets of substantial size.
proc means data=have;
class class wk;
types class*wk;
var score x;
output out=want mean(score)= sum(x)=;
run;
Just preprocess the data to include ExCred into the Score variable; if execution time is an issue use a view to do so.
If you did want to do it in sql, you would indeed use a group by.
proc sql;
create table want as
select class, wk, mean(score+ex_cred), sum(x)
from have
group by class, wk;
quit;
Upvotes: 2