GabyLP
GabyLP

Reputation: 3781

SAS sum variables using name after a proc transpose

I have a table with postings by category (a number) that I transposed. I got a table with each column name as _number for example _16, _881, _853 etc. (they aren't in order).

I need to do the sum of all of them in a proc sql, but I don't want to create the variable in a data step, and I don't want to write all of the columns names either . I tried this but doesn't work:

proc sql;
select sum(_815-_16) as nnl
from craw.xxxx;
quit;

I tried going to the first number to the last and also from the number corresponding to the first place to the one corresponding to the last place. Gives me a number that it's not correct.

Any ideas?

Thanks!

Upvotes: 3

Views: 3687

Answers (2)

Robert Penridge
Robert Penridge

Reputation: 8513

I have no documentation to support this but from my experience, I believe SAS will assume that any sum() statement in SQL is the sql-aggregate statement, unless it has reason to believe otherwise.

The only way I can see for SAS to differentiate between the two is by the way arguments are passed into it. In the below example you can see that the internal sum() function has 3 arguments being passed in so SAS will treat this as the SAS sum() function (as the sql-aggregate statement only allows for a single argument). The result of the SAS function is then passed in as the single parameter to the sql-aggregate sum function:

proc sql noprint;
  create table test as 
  select sex,
         sum(sum(height,weight,0)) as sum_height_and_weight
  from sashelp.class
  group by 1
  ;
quit;

Result:

proc print data=test;
run;

              sum_height_
Obs    Sex     and_weight

 1      F        1356.3
 2      M        1728.6

Also note a trick I've used in the code by passing in 0 to the SAS function - this is an easy way to add an additional parameter without changing the intended result. Depending on your data, you may want to swap out the 0 for a null value (ie. .).

EDIT: To address the issue of unknown column names, you can create a macro variable that contains the list of column names you want to sum together:

proc sql noprint;
  select name into :varlist separated by ',' 
  from sashelp.vcolumn 
  where libname='SASHELP'
    and memname='CLASS'
    and upcase(name) like '%T'  /* MATCHES HEIGHT AND WEIGHT */
    ;
quit;

%put &varlist;

Result:

Height,Weight

Note that you would need to change the above wildcard to match your scenario - ie. matching fields that begin with an underscore, instead of fields that end with the letter T. So your final SQL statement will look something like this:

proc sql noprint;
  create table test as 
  select sex,
         sum(sum(&varlist,0)) as sum_of_fields_ending_with_t
  from sashelp.class
  group by 1
  ;
quit;

This provides an alternate approach to Joe's answer - though I believe using the view as he suggests is a cleaner way to go.

Upvotes: 1

Joe
Joe

Reputation: 63424

You can't use variable lists in SQL, so _: and var1-var6 and var1--var8 don't work.

The easiest way to do this is a data step view.

proc sort data=sashelp.class out=class;
 by sex;
run;
*Make transposed dataset with similar looking names;
proc transpose data=class out=transposed;
 by sex;
 id height;
 var height;
run;
*Make view;
data transpose_forsql/view=transpose_forsql;
 set transposed;
 sumvar = sum(of _:);  *I confirmed this does not include _N_ for some reason - not sure why!;
run;

proc sql;
 select sum(sumvar) from transpose_Forsql;
quit;

Upvotes: 2

Related Questions