Building a table with sub-summary-rows in SAS

Question

Assume I have two clusters of clients and for each client I have a value, how much worth the client is for me. I want to now create a summary table with rows sub-summing the all clients' values for that cluster.

Example input table clients:

client  cluster  value
1       1        3
2       2        1
3       1        4
4       1        1
5       2        5

Expected output wanted:

cluster  client   value    comment
1        1        3        kEuro
1        3        4        kEuro
1        4        1        kEuro
1        .        8        sum for cluster 1
2        2        1        kEuro
2        5        5        kEuro
2        .        6        sum for cluster 2

I am now looking for the most efficient way of achieving this. What I came up with so far is the code below, which does not fully solve it: Instead of sum for cluster x I only see sum f.

proc sql;
  create table part_1 as
  select cluster, client, value
       , "kEuro" as comment
  from clients
  where cluster = 1;
quit;

proc sql;
  create table part_2 as
  select cluster, client, value
       , "kEuro" as comment
  from clients
  where cluster = 2;
quit;

/* The summation over the clusters */
proc means noprint data=part1;
  output out=psum_1 (drop=_type_ _freq_) sum=;
run;

proc means noprint data=part2;
  output out=psum_2 (drop=_type_ _freq_) sum=;
run;

/* The concatenation stage 1 */
data puzzle_1;
  set part_1 psum_1 (in=in2);
  if in2 then do;
    client = .;
    comment = 'sum for cluster 1';
  end;
run;

data puzzle_2;
  set part_2 psum_2 (in=in2);
  if in2 then do;
    client = .;
    comment = 'sum for cluster 2';
  end;
run;

/* The concatenation stage 2 */
data wanted;
  set puzzle_1 puzzle_2;
run;

Is there a better way to approach this problem, ideally something which I could loop over, if I have more than two clusters?

Tom · Accepted Answer

You can use PROC MEANS/SUMMARY to generate the summary rows. Then use a SET with a BY to add the summary rows to the detail rows.

data have;
 input client cluster value ;
cards;
1 1 3
2 2 1
3 1 4
4 1 1
5 2 5
;

proc sort data=have; by cluster client; run;
proc means data=have noprint nway ;
  by cluster;
  var value;
  output out=sum sum= ;
run;

data want;
  set have sum ;
  by cluster;
  length comment $20 ;
  if last.cluster then comment=catx(' ','sum for cluster',cluster);
  else comment='kEuro';
run;

proc print ;
  var cluster client value comment;
run;

Building a table with sub-summary-rows in SAS

Answers (2)

Related Questions