Reputation: 63
I am trying to create a calculated column called percentile_Idle_Time(I am trying to calculate percentile for every value). The column is percentile value from idle_time% column.
So, the input data is
Total Time Idle Time Idle Time %
5:10:00 0:14:00 4.6%
3:09:00 0:20:00 9.49%
. . .
. . .
So, I am trying to create a new column called percentile_Idle_Time which is nothing but the percentile position of the Idle Time % value
So, the output data should be like
Total Time Idle Time Idle Time % percentile_Idle_Time
5:10:00 0:14:00 4.6% 75.4
3:09:00 0:20:00 9.49% 97.9
. . . .
. . . .
Note: The numbers are pretty rough(not accurate) I tried using
proc univariate data=WORK.QUERY_FOR_PEOPLENET_DATA_00_0000 noprint;
by DriverId;
var 'Short Idle Time %'n;
output pctlpre=P_ ;
run;
But its not working. The other challenge is to take percentile score from the % column
Upvotes: 0
Views: 1019
Reputation: 21264
Do it manually then. Sort the data ascending and use the NOBS to get the number of observations. Use n to divide by NOBS to get the total value.
proc sort data=sashelp.class out=class;
by weight;
run;
data want;
set class Nobs=myobs;
percentile = _n_ / myobs;
run;
NOTE that this does not deal with ties. If you do have ties that need to be dealt with, use PROC RANK instead. I usually do it with a group of 100 and then you'll get the 1 to 100 groups. But its' 96.5 percentile if that's what you're looking for.
proc rank data=sashelp.class out=ranked_class groups=100;
var weight;
ranks weight_percentile;
run;
EDIT: Fixed references in data step and sort to align.
Upvotes: 1