Reputation: 135
I am working with a Data Set in this format:
Column 1 (What I Have), Column 2 (What I need to see)
8 1
8 1
8 1
9 2
9 2
9 2
10 3
10 3
10 3
11 4
11 4
12 5
13 6
14 7
14 7
14 7
Attempting to use Proc Rank to Generate the Second Column Code:
Proc Rank data=Have
out=Want
ties=low;
Var Column1;
ranks Column2;
run;
The Output I get looks like this:
Column1 (What I Have) Column2 (What I get)
8 1
8 1
8 1
9 199
9 199
9 199
10 415
10 415
10 415
11 613
11 613
12 823
13 1015
14 1222
14 1222
14 1222
Based on what I've read and heard the first dataset is what I expect my outcome to be.
Is there an error in the code or expectations that has lead to an output such as the one I am seeing? Additionally is there a better way to generate a sequenced rank for a column of sequential numeric data?
Upvotes: 2
Views: 450
Reputation: 135
Wanted to let you know that I figured out how to use this.
The Key is in the Ties=(Low or High or Mean or Dense) Statement.
Here is the New Code that Works.
Proc Rank data=Have
out=Want
ties=dense;
Var Column1;
ranks Column2;
run;
Now My Dataset Groups all like values and gives them the same count. Column 2 is generated with the proper values. Documentation is here:
Scroll to the bottom for features of Ties
Upvotes: 1
Reputation: 462
Alternatively with an implied retain.
DATA WANT;
SET HAVE;
BY COLUMN1;
IF _N_ = 0 THEN
COLUMN2 = 0;
IF FIRST.COLUMN1 THEN
COLUMN2 + 1;
RUN;
Upvotes: 0
Reputation: 12465
You can do this in a Data Step assuming it is OK to sorted by Column1.
proc sort data=have;
by column1;
run;
data want;
set have;
by column1;
retain column2 ;
if _n_ = 0 then
column2 = 0;
if first.column1 then
column2 = column2 + 1;
run;
Upvotes: 4