Mark Romano
Mark Romano

Reputation: 711

SAS Function to calculate percentage for row for two stratifications

I have a dataset that looks like this

data test;
    input id1$ id2$ score1 score2 score3 total;
    datalines;
A   D   9   36  6   51
A   D   9   8   6   23
A   E   5   3   2   10
B   D   5   3   3   11
B   E   7   4   7   18
B   E   5   3   3   11
C   D   8   7   9   24
C   E   8   52  6   66
C   D   4   5   3   12 
  ;
 run;

I want to add a column that calculates what percentage of the corresponding total is of the summation within id1 and id2.

What I mean is this; id1 has a value of A. Within the value of A, there are twoid2 values ; D and E. There are two values of D, and one of E. The two total values of D are 51 and 23, and they sum to 74. The one total value of E is 10, and it sums to 10. The column I'd like to create would hold the values of .68 (51/74), .31 (23/74), and 1 (10/10) in row 1 ,row 2, and row 3 respectively.

I need to perform this calculations for the rest of the id1 and their corresponding id2. So when complete, I want a table that would like like this:

id1 id2 score1  score2  score3  total   percent_of_total
A   D   9       36      6       51      0.689189189
A   D   9       8       6       23      0.310810811
A   E   5       3       2       10      1
B   D   5       3       3       11      1
B   E   7       4       7       18      0.620689655
B   E   5       3       3       11      0.379310345
C   D   8       7       9       24      0.666666667
C   E   8       52      6       66      1
C   D   4       5       3       12      0.333333333

I realize a loop might be able to solve the problem I've given, but I'm dealing with EIGHT levels of stratification, with as many as 98 sublevels within those levels. A loop is not practical. I'm thinking something along the lines of PROC SUMMARY but I'm not too familiar with the function.

Thank you.

Upvotes: 0

Views: 601

Answers (1)

Tom
Tom

Reputation: 51621

It is easy to do with a data step. Make sure the records are sorted. You can find the grand total for the ID1*ID2 combination and then use it to calculate the percentage.

proc sort data=test;
  by id1 id2;
run;

data want ;
  do until (last.id2);
    set test ;
    by id1 id2 ;
    grand = sum(grand,total);
  end;
  do until (last.id2);
    set test ;
    by id1 id2 ;
    precent_of_total = total/grand ;
    output;
  end;
run;

Upvotes: 1

Related Questions