Reputation: 675
I have a table in SAS and I'm trying to find the maximum and the second maximum value of a given column.
For example:
Id Column
1 100
2 50000
3 50
4 4000
5 97
6 7000
My code need to find this:
Id Column
2 50000
6 7000
Is there any way to do that in proc sql
(or even in SAS language)?
Upvotes: 0
Views: 1197
Reputation: 63424
Here's a different proc sql
option - this one selects the two highest distinct values, and all IDs with those values. It's a bit convoluted, and only works because SAS is nice about handling summary statistics.
proc sql;
select age, name
from sashelp.class
where age =
(select max(age) from sashelp.class)
or age =
(select max(age) from
(select case when age=max(age) then . else age end as age from sashelp.class)
)
;
quit;
Upvotes: 0
Reputation: 7602
Just for completeness, here's a proc sql
solution. Not necessarily better than any of the others and doesn't cope with ties.
data have;
input Id Column;
datalines;
1 100
2 50000
3 50
4 4000
5 97
6 7000
;
run;
proc sql outobs=2;
create table want as
select * from have
order by column desc;
quit;
And finally, a solution using proc rank
that does include ties
data have;
input Id Column;
datalines;
1 100
2 50000
3 50000
4 4000
5 97
6 7000
;
run;
proc rank data=have out=want_ties (where=(column_rank<=2)) descending ties=dense;
var column;
ranks column_rank;
run;
Upvotes: 0
Reputation: 63424
Another option: PROC MEANS
using IDGROUP
. This gets you a horizontal/wide dataset, but you can transpose it however you want using PROC TRANSPOSE
or the data step. See the paper Transposing Data Using PROC SUMMARY'S IDGROUP Option for more details.
proc means data=sashelp.class;
var height;
output out=classout idgroup(max(height) out[2] (height name)=height name) /autoname;
run;
Upvotes: 0
Reputation: 63424
PROC UNIVARIATE
will do this for you, as is explained in the documentation for the proc:
proc univariate data=sashelp.class;
var height;
id name;
run;
Look at the Extreme Observations
table.
If you want it in a dataset, you just use the extremeobs
output object:
ods output extremeobs=extremes;
proc univariate data=sashelp.class;
var height;
id name;
run;
ods output close;
Then filter to the high
variable and the 4th and 5th observation. This has the same concern with ties of almost any other solution.
Upvotes: 0
Reputation: 12465
Assuming no ties, sort the table in descending order and then take the top 2 values.
proc sort data=have out=temp;
by descending column;
run;
data want;
set temp(obs=2);
run;
If you have ties and only want the distinct values try the nodupkey
option on PROC SORT:
proc sort data=have out=temp nodupkey;
by descending column;
run;
data want;
set temp(obs=2);
run;
Upvotes: 1