stunt
stunt

Reputation: 49

How to sum values across a group and then assign the result to each individual within the group in SAS

I have unique groups that consist of 2 or 3 people. Each person has a value for a variable X (or missing). For each group, i want to sum the values for all observations in that group AND assign the result to each person in the group.

the have dataset looks like this

group ID X
1 110 2.5 1 111 1.0 1 112 0.0 2 210 . 2 211 5.6 3 212 2.0 3 213 1.5 4 310 . 4 311 7.2 4 312 2.1

And the dataset I want with a new variable Y loos like this:

group ID X Y 1 110 2.5 3.5 1 111 1.0 3.5 1 112 0.0 3.5 2 210 . 5.6 2 211 5.6 5.6 3 212 2.0 3.5 3 213 1.5 3.5 4 310 . 9.3 4 311 7.2 9.3 4 312 2.1 9.3

Thanks for your assistance

Upvotes: 0

Views: 403

Answers (3)

Kiran
Kiran

Reputation: 3315

if you data is already sorted Dow loop solution and mean followed by merge solution as suggested by @Richard in comments.

  /*DOW LOOP solution*/

   data want;
 do until(last.group);
 set have;
 by group;
if first.group then Y=X;
 else Y+X;
  end;
 do until(last.group);
   set have;
 by group;
output;
end;
run;

/* mean followed by merge solution*/
proc means data = have nway noprint;
class group;
var X;
output out=have1 ( keep = Group Y) sum=Y;
 run;



data want;
merge have have1;
by group;
run;

Upvotes: 1

Richard
Richard

Reputation: 27498

@Kiran, your coding of serial do … until with last.group as the terminus condition does not work because the last.group=1 state persists despite the set;by; within the second loop, and thus the second loop only iterates once.

A better solution, as long espoused by Dorfman, is the pattern of a first loop computing group size with 1 by 1, followed by a second loop explicitly looping 1 to _n_

data have; input
group ID     X; datalines;
1     110     2.5
1     111     1.0
1     112     0.0
2     210     .
2     211     5.6
3     212     2.0
3     213     1.5
4     310     .
4     311     7.2
4     312     2.1
run;

data want;
  do _n_ = 1 by 1 until (last.group);
    set have;
    by group;
    groupsum = sum(groupsum,x);
  end;

  do _n_ = 1 to _n_ ;
    set have;
    output;  * group related computation result gets carried into the output data set (want);
  end;
run;

The pattern part do _n_ = 1 to _n_ becomes more understandable when the coder realizes that SAS data step evaluates the start, stop and by values for the loop prior to actually iterating the loop.

Upvotes: 1

G. Belton
G. Belton

Reputation: 133

Assuming your table is called "ds":

PROC SQL;
    SELECT group, ID, x, sum(x) AS y
    FROM DS
    GROUP BY group;
QUIT;

Upvotes: 2

Related Questions