Ryan
Ryan

Reputation: 91

Add within variable by year in SAS

I am working with some state-level data and want to create a collective "USA" level within that state data. I have about 10 years of data for each state and would like to keep that format for the "USA" level of the variable. Is there an easy way I can do this? I will attach some sample data below.

data test;
input State$ year count1 count2;

datalines;
MA 2010 33 34
MA 2011 54 32
MA 2012 45 43
MA 2013 49 50
LA 2010 29 29
LA 2011 32 34
LA 2012 34 44
LA 2013 44 45
WV 2010 33 34
WV 2011 34 37
WV 2012 44 43
WV 2013 33 35
;
run;

Here is what I would like the data to look like:

data test_2;
input State$ year count1 count2;

datalines;
MA 2010 33 34
MA 2011 54 32
MA 2012 45 43
MA 2013 49 50
LA 2010 29 29
LA 2011 32 34
LA 2012 34 44
LA 2013 44 45
WV 2010 33 34
WV 2011 34 37
WV 2012 44 43
WV 2013 33 35
USA 2010 95 97
USA 2011 120 103
USA 2012 121 130
USA 2013 126 130
;
run;

Upvotes: 1

Views: 86

Answers (4)

Richard
Richard

Reputation: 27508

Instead of adding rows that are for a dimensional summary, consider using Proc REPORT and its full feature set that includes using a NOTSORTED format and MULTILABEL format processing.

Example:

proc format ;
  value $geo (multilabel notsorted)
  'MA' = 'MA'
  'LA' = 'LA'
  'WV' = 'WV'
  'MA', 'LA', 'WV' = 'USA'    
  ;
run ;

ods html file='report.html' ;

proc report data=test ;
  column state year count1 count2 ;
  define state / group order=data preloadfmt format=$geo. mlf ;
  define year / group ;
run ;

ods html close ;

enter image description here

Upvotes: 2

Tom
Tom

Reputation: 51611

Since the grouping variable YEAR can only take on a small number of integer values you could easily use a temporary array to store the totals. You could even make it flexible enough to determine the range of years that appear in the data.

data have;
  input State $ year count1 count2;
datalines;
MA 2010 33 34
MA 2011 54 32
MA 2012 45 43
MA 2013 49 50
LA 2010 29 29
LA 2011 32 34
LA 2012 34 44
LA 2013 44 45
WV 2010 33 34
WV 2011 34 37
WV 2012 44 43
WV 2013 33 35
;


data want;
   set have end=eof;
   array total[1960:2099,2] _temporary_ (280*0);
   array years[2] _temporary_ (2099 1960);
   total[year,1]+count1;
   total[year,2]+count2;
   years[1]= min(years[1],year);
   years[2]= max(years[2],year);
   output;
   if eof then do;
     state='USA';
     do year=years[1] to years[2];
       count1=total[year,1];
       count2=total[year,2];
       output;
     end;
   end;
run;

Upvotes: 1

Stu Sztukowski
Stu Sztukowski

Reputation: 12909

Here are two ways to do it: with SQL and PROC MEANS.

SQL

Append the sums by year you with an Outer Union.

proc sql;
    create table want as
        select *
        from test

        OUTER UNION CORR

        select 'USA' as State
             , year   
             , sum(count1) as count1 
             , sum(count2) as count2
        from test
        group by year
    ;
quit;

PROC MEANS

Calculate the totals for State*Year and append it to the main dataset. We include State in there so we can include the original values. It will make the grand total a missing value for State but we can modify that after-the-fact.

proc means data=test noprint;
    class State year;
    types Year State*Year;
    output out=want(drop=_:)
        sum=
    ;
run;

data want;
    modify want;
    where state = ' ';
    state = 'USA';
run;

Upvotes: 3

shaun_m
shaun_m

Reputation: 2776

Using the DATA step:

proc sort data=test out=want;
    by year;
run;

* Calculate USA totals;
data want;  
    set want;
    by year;
    total1 + count1;
    total2 + count2;
    if last.year then do;
        count1 = total1;
        count2 = total2;
        state = 'USA';
        output;
        total1 = 0;
        total2 = 0;
    end;
    drop total:;
run;

* Add USA totals to the end of the data;
data want;
    set test want;
run;


                            Obs    State    year    count1    count2                            
                                                                                                
                              1     MA      2010       33        34                             
                              2     MA      2011       54        32                             
                              3     MA      2012       45        43                             
                              4     MA      2013       49        50                             
                              5     LA      2010       29        29                             
                              6     LA      2011       32        34                             
                              7     LA      2012       34        44                             
                              8     LA      2013       44        45                             
                              9     WV      2010       33        34                             
                             10     WV      2011       34        37                             
                             11     WV      2012       44        43                             
                             12     WV      2013       33        35                             
                             13     USA     2010       95        97                             
                             14     USA     2011      120       103                             
                             15     USA     2012      123       130                             
                             16     USA     2013      126       130                             

Upvotes: 2

Related Questions