Reputation: 27
I have a wide dataset in SAS with separate columns for each month of a year, structured: "Aug__2015" through "Aug__2016". Some of the months have two underscores and some have one. Basically, I need to create a new Month variable and a Year variable using substr and have an output that looks like the attached output. Usually, I would just use proc transpose, but we have to do it instead with several blocks of code and if-then statements. Here's a truncated version of the code I've tried for August of 2015. I get a lot of errors...
data work.jobs;
set DownL.Tabled1x2016;
Industry=propcase(Industry);
if substr(Aug__2015, length(Aug__2015),4)='2015' and not missing(Aug__2015) then do;
Year=substr(Aug__2015, length(Aug__2015), 4);
Month=substr(Aug__2015, 1, 3);
end;
run;
Somehow, for each month of the year, I need separate if-then blocks which result in character values for month and year. Thanks in advance for the guidance.
Upvotes: 0
Views: 258
Reputation: 19
EDIT: Updated based on the provided sample data.
%let months=Aug__2015*Sept__2015*Oct__2015*Nov__2015*Dec__2015*
Jan__2016*Feb__2016*Mar__2016*Apr__2016*May_2016*
June_2016*July_2016*Aug__2016;
%let fmonths=August*September*October*November*December*
January*February*March*April*May*June*July*August;
%let fyears=2015*2015*2015*2015*2015*
2016*2016*2016*2016*2016*2016*2016*2016;
DATA dset2;
SET dset1;
FORMAT month $9.;
%do i=1 %to 13;
%let var=%scan(&months., &i., *);
%let fmonth=%scan(&fmonths., &i., *);
%let fyear=%scan(&fyears., &i., *);
IF NOT MISSING(&var.) THEN DO;
jobs=&var.;
month="&fmonth.";
year=&fyear.;
OUTPUT;
END;
%end;
RUN;
@DomPazz's answer is cleaner. This one is more flexible.
Upvotes: 0
Reputation: 12465
Use and array and the vname() function.
data work.jobs;
set DownL.Tabled1x2016;
format tmp $32.;
array mths[*] Aug__2015 -- Aug__2016;
Industry=propcase(Industry);
do i=1 to dim(mths);
tmp = compress(vname(mths[i]),'_');
Month = substr(tmp,1,3);
Year = substr(tmp,4);
jobs = mths[i];
output;
end;
drop tmp i Aug__2015 -- Aug__2016;
run;
array mths[*] Aug__2015 -- Aug__2016;
This creates an array of the variables that start on the data set with Aug__2015
and ends with Aug_2016
. Alternatively you can list all the variables individually.
The *
is a wild card that lets you not define the array size -- let SAS do it for you.
A SAS Data Step array is just a logical construct of pointers to the variables in the table.
Upvotes: 0