Serge Kashlik
Serge Kashlik

Reputation: 413

Convert Proc rank to SQL

I am working on a project where a model that was built in SAS has to implemented into Oracle SQL. My question is simply how can I do this in Oracle SQL:

data have;
infile datalines delimiter=',';
input var $ value;
datalines;                      
var_x,1
var_x,1
var_x,1
var_x,2
var_x,2
var_x,2
var_x,3
var_x,3
var_x,3
var_x,4
var_x,4
var_x,5
;
run;

proc rank data=have out=want groups=10 ties=high;
var value;
ranks rank;
run;

Have: enter image description here

Want: enter image description here

Upvotes: 0

Views: 611

Answers (2)

kfinity
kfinity

Reputation: 9091

This SAS blog post goes into detail about what the proc rank function does.

Specifically:

The formula for calculating group values is as follows:

FLOOR(rank*k/(n+1))

In this formula:

  • rank is the data value's rank order
  • k is the value of the GROUPS= option
  • n is the number of nonmissing values

k = 10, and n = count(*) over (), so those are easy.

We can also replicate SAS' rank behavior in Oracle, but it requires 2 levels of analytical functions, so we need an inline view.

create table have (var varchar2(10), value number);

insert into have values ('var_x',1);
insert into have values ('var_x',1);
insert into have values ('var_x',1);
insert into have values ('var_x',2);
insert into have values ('var_x',2);
insert into have values ('var_x',2);
insert into have values ('var_x',3);
insert into have values ('var_x',3);
insert into have values ('var_x',3);
insert into have values ('var_x',4);
insert into have values ('var_x',4);
insert into have values ('var_x',5);

select var, value, 
    FLOOR( max(r) over (partition by value)*10 / c ) as proc_rank_high
from ( 
  select var, value,
    rank() over (order by value, rowid) r,
    count(1) over ()+1 c
  from have
);

With the "ties=high" option, SAS calculates the rank as the max rank for that unique value, which is what I'm doing with max(r) here.

(It might be more intuitive to use row_number() over (order by value) for r instead of rank() over (order by value, rowid) - your call. They do the same thing.)

Upvotes: 3

Richard
Richard

Reputation: 27508

In SQL pass through to Oracle use Oracle RANK function. You might also need to use CEIL or FLOOR to bound the result to the range 1 .. 10

Upvotes: 0

Related Questions