Reputation: 375
Suppose i have data like this in my table
match_day name Goals
1 Higuain 4
1 Messi 1
1 Ozil 4
1 Villa 3
1 Xavi 4
2 Benzema 4
2 Messi 4
2 Ronaldo 3
2 Villa 4
2 Xavi 4
Now i want to find out which player scored the maximum goals in each match. I tried using it doing-
proc means data=b nway max;
class match_day name;
var goals;
output out=c(drop=_type_ _freq_) max=goals;
run;
But this does not work. What is the correct way of doing this?
Upvotes: 1
Views: 2897
Reputation: 7602
Just to clear up the PROC MEANS syntax, you could use the following code to show the top goal scorer per match_day.
proc means data=b noprint nway;
class match_day;
output out=c(drop=_:) maxid(goals(name goals))=;
run;
However, you get the issue raised by @Joe that only one record per match_day is returned, which isn't ideal in this situation where there are ties for top scorer.
If you wanted to use a procedure, then PROC RANK can do this for you.
proc rank data=b out=c (where=(goals_rank=1)) ties=low descending;
by match_day;
var goals;
ranks goals_rank;
run;
Upvotes: 2
Reputation: 63424
This isn't something you can easily do in PROC MEANS. It's much easier to do in SQL or the data step. The most direct solution:
proc sort data=b;
by match_day descending goals; *so the highest goal number is at top;
run;
data c;
set b;
by match_day;
if first.match_day; *the first record per match_day;
run;
That will give you the record with the largest number of goals. If there is a tie, you will not get more than one record, but instead arbitrarily the first.
If you want to keep all records with that number, you can do:
data c;
set b;
retain keep;
by match_day descending goals;
if first.match_day then keep=1; *the first record per match_day, flag to keep;
if keep=1 then output; *output records to be kept;
if last.goals then keep=0; *clear the keep flag at the end of the first goals set;
drop keep;
run;
Upvotes: 2