Colleen
Colleen

Reputation: 13

Add 2 values within same column SAS

I have a dataset with results for over 120 different rows. There are many columns and rows, but I just want to add together the values of 2 rows within the same column.

Here is an example of what some of the most important columns look like and the output I'm trying to achieve (**edited after receiving 2 answers that didn't do exactly what I needed and to more accurately reflect what I'm looking for):

data All; 
input Sample$ Matrix$ Var$ Result Units$;
datalines; 
S01 Blood A 10 ng/mL
S01 Blood B 12 ng/mL
S01 Blood C 14 ng/mL
S01 Blood D 4  ng/mL
S01 Urine D 6  ng/mL
; 
run;

I want the output to look like:

Sample  Matrix Var  Result Units
S01     Blood  A+C   24     ng/mL
S01     Blood    B   12     ng/mL
S01     Blood    D    4     ng/mL
S01     Urine    D    6     ng/mL

As mentioned, there are up to 130 observations and more than 10 variables, so I don't want to transpose the whole dataset (something we've done for other manipulations of this dataset). I've seen mention of using the retain statement, but I don't know if that will work for this case.

Upvotes: 1

Views: 678

Answers (2)

Tom
Tom

Reputation: 51611

Just use a format that collapses A and B into A+B.

proc format ;
  value $collapse
    'A','B' ='A+B'
  ;
run;

proc summary data=all nway ;
  class var;
  format var $collapse.;
  var result1-result3;
  output out=want sum= ;
run;

Result

Obs    Var    _TYPE_    _FREQ_    Result1    Result2    Result3

 1     A+B       1         2         22         60        110
 2     C         1         1         14         16        200
 3     D         1         1          4        120        140

For your updated question just use PROC SUMMARY only on the observations you want to SUM and then combine it back with the original data (without those observations).

data All; 
  input Sample $ Matrix $ Var $ Result Units $;
datalines; 
S01 Blood A 10 ng/mL
S01 Blood B 12 ng/mL
S01 Blood C 14 ng/mL
S01 Blood D 4  ng/mL
S01 Urine D 6  ng/mL
; 

proc summary data=all ;
  where var in ('A' 'B');
  var result ;
  output out=ab(drop=_type_ _freq_) sum= 
         idgroup(out[1] (Sample Matrix Units)=)
  ;
run;


data want;
   set all(where=(not (var in ('A' 'B'))))
       ab(in=in1);
   ;
   if in1 then Var='A+B';
run;

Results

Obs    Sample    Matrix    Var    Result    Units

 1      S01      Blood     C        14      ng/mL
 2      S01      Blood     D         4      ng/mL
 3      S01      Urine     D         6      ng/mL
 4      S01      Blood     A+B      22      ng/mL

If you have more variables you want summed add them to the VAR statement. If you have more variables you want copied then add them into the IDGROUP clause of the OUTPUT statement.

In your real data probably some of those variables, like SAMPLE and MATRIX, should be BY variables (or additional CLASS variables) in the SUMMARY step.

If you have multiple sets of values that need to be collapsed you could still use the format idea, only adjust it so it can also be used to help with the WHERE clause.

proc format ;
  value $collapse
    'A','B' = 'A+B'
    'E','F' = 'E+F'
    other = ' '
  ;
run;   

proc summary data=all nway;
  where put(var,$collapse.) ne ' ';
  by Sample Matrix;
  class var ;
  format var $collapse.;
  var result ;
  output out=ab(drop=_type_ _freq_) sum= 
         idgroup(out[1] (Units)=)
  ;
run;

data want;
   set all(where=(put(var,$collapse.) eq ' '))
       ab(in=in1);
   ;
   by Sample Matrix;
   if in1 then Var=put(var,$collapse.);
   format var ;
run;

Upvotes: 1

Stu Sztukowski
Stu Sztukowski

Reputation: 12909

It can be done with a data step using retain. SAS is inherently a looping language when the set statement is used and will reset the value of variables at each row, then populate them with values it reads in. retain prevents a variable from being reset so it is carried forward until it is changed again.

On row C, we'll rename var to A+C then sum the result and the retained (stored) value from A.

data want;
    set all;

    retain a_result;

    /* Store the value of 'result' when var = 'A' */
    if(var = 'A') then a_result = result;

    /* Do not continue or output if var = 'A'. That is,
       if var is not equal to 'A', then run the code below
       and output. */
    if(var NE 'A');

    /* On row C, sum the result + the stored result when var = 'A'*/
    if(var = 'C') then do;
        var = 'A+C'; 
        result = result + a_result;
    end;

    drop i a_result;
run;

proc sort data=want;
    by sample var matrix;
run;
Sample  Matrix  Var     Result  Units
S01     Blood   A+C     24      ng/mL
S01     Blood   B       12      ng/mL
S01     Blood   D       4       ng/mL
S01     Urine   D       6       ng/mL

Upvotes: 0

Related Questions