AvdotyaC
AvdotyaC

Reputation: 27

Converting wide data to narrow in SAS using substr

enter image description hereI have a wide dataset in SAS with separate columns for each month of a year, structured: "Aug__2015" through "Aug__2016". Some of the months have two underscores and some have one. Basically, I need to create a new Month variable and a Year variable using substr and have an output that looks like the attached output. Usually, I would just use proc transpose, but we have to do it instead with several blocks of code and if-then statements. Here's a truncated version of the code I've tried for August of 2015. I get a lot of errors...

data work.jobs;
set DownL.Tabled1x2016;
    Industry=propcase(Industry);
    if substr(Aug__2015, length(Aug__2015),4)='2015' and not missing(Aug__2015) then do;
    Year=substr(Aug__2015, length(Aug__2015), 4);
    Month=substr(Aug__2015, 1, 3);
end;
run;

Somehow, for each month of the year, I need separate if-then blocks which result in character values for month and year. Thanks in advance for the guidance.

enter image description here

Upvotes: 0

Views: 258

Answers (2)

Dave O
Dave O

Reputation: 19

EDIT: Updated based on the provided sample data.

%let months=Aug__2015*Sept__2015*Oct__2015*Nov__2015*Dec__2015*
  Jan__2016*Feb__2016*Mar__2016*Apr__2016*May_2016*
  June_2016*July_2016*Aug__2016;
%let fmonths=August*September*October*November*December*
  January*February*March*April*May*June*July*August;
%let fyears=2015*2015*2015*2015*2015*
  2016*2016*2016*2016*2016*2016*2016*2016;

DATA dset2;
  SET dset1;
  FORMAT month $9.;
  %do i=1 %to 13;
    %let var=%scan(&months., &i., *);
    %let fmonth=%scan(&fmonths., &i., *);
    %let fyear=%scan(&fyears., &i., *);
    IF NOT MISSING(&var.) THEN DO;
      jobs=&var.;
      month="&fmonth.";
      year=&fyear.;
      OUTPUT;
    END;
  %end;
RUN;

@DomPazz's answer is cleaner. This one is more flexible.

Upvotes: 0

DomPazz
DomPazz

Reputation: 12465

Use and array and the vname() function.

data work.jobs;
set DownL.Tabled1x2016;
    format tmp $32.;
    array mths[*] Aug__2015 -- Aug__2016;
    Industry=propcase(Industry);
    do i=1 to dim(mths);
        tmp = compress(vname(mths[i]),'_');
        Month = substr(tmp,1,3);
        Year = substr(tmp,4);
        jobs = mths[i];
        output;
    end;
    drop tmp i Aug__2015 -- Aug__2016;
run;

array mths[*] Aug__2015 -- Aug__2016; This creates an array of the variables that start on the data set with Aug__2015 and ends with Aug_2016. Alternatively you can list all the variables individually.

The * is a wild card that lets you not define the array size -- let SAS do it for you.

A SAS Data Step array is just a logical construct of pointers to the variables in the table.

Upvotes: 0

Related Questions