Reputation: 502
I'm trying to find a sensible way to clean dates stored as character strings and then convert them to proper SAS dates with the input function, while keeping sensible variable names (ideally the original variable names) once the character-to-numeric conversion is done.
The dates are being cleaned with an array (replacing '..' with '01', or '....' with '0101'), since there are about 75 variables that hold dates as strings.
Example:
data sample;
    input d1 $ d2 $ d3 $ d4 $ d5 $;
    cards;
200103.. 20070905 20060222 2007.... 199801..
;
run;
data clean;
    set sample;
    array dt_cln(5) d1-d5;
    array fl_dt (5) f1-f5;
    *clean out '..'/'....', replace with '01'/'0101';
    do i=1 to 5;
        if substr(dt_cln(i),5,4) = '....' then do;
            dt_cln(i) = substr(dt_cln(i),1,4) || '0101';
        end;
        else if substr(dt_cln(i),7,2) = '..' then do;
            dt_cln(i) = substr(dt_cln(i),1,6) || '01';
        end;
    end;
    *change to number;
    do i=1 to 5;
        fl_dt(i)=input(dt_cln(i),yymmdd8.);
    end;
    format f: date9.;
    drop i d:;
run;
What would be the best way to approach this?
Upvotes: 0
Views: 632
Reputation: 4554
data want;
    set sample;
    array var1 newd1-newd5;
    array var2 d:;
    do over var2;
        var1=input(ifc(index(var2,'.')^=0,put(prxchange('s/((\.){1,})/0101/',-1,var2),8.),var2),yymmdd8.);
    end;
    format newd1-newd5 yymmddn8.;
    drop d:;
run;
Upvotes: 0
Reputation: 9569
You cannot preserve the original names and convert from character to numeric directly - however, with a bit of macro code you could drop all the old character variables and rename the numeric versions you've created. E.g.
%macro rename_loop();
    %local i;
    %do i = 1 %to 5;
        f&i = d&i
    %end;
%mend;
Then in your data step add a rename statement at the end, after your drop statement (SAS applies drop before rename, so the reused names do not collide):
rename %rename_loop;
Otherwise, your existing approach is already pretty good. You could perhaps simplify the cleaning process a bit, e.g. remove your first do-loop and do the following within the second one:
fl_dt(i)=input(tranwrd(dt_cln(i),'..','01'),yymmdd8.);
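Putting the two suggestions together, the whole thing might look something like the sketch below. It assumes the sample data from the question; the n parameter on rename_loop is a hypothetical addition (not in the macro above) so the same macro can cover more variables, and the array sizes would need to change to match.

```sas
/* Hypothetical variant of rename_loop that takes the variable count. */
%macro rename_loop(n);
    %local i;
    %do i = 1 %to &n;
        f&i = d&i
    %end;
%mend rename_loop;

data sample;
    input d1 $ d2 $ d3 $ d4 $ d5 $;
    cards;
200103.. 20070905 20060222 2007.... 199801..
;
run;

data clean;
    set sample;
    array dt_cln(5) d1-d5;   /* original character dates */
    array fl_dt(5)  f1-f5;   /* new numeric SAS dates    */
    do i = 1 to 5;
        /* each '..' becomes '01', so '....' becomes '0101' */
        fl_dt(i) = input(tranwrd(dt_cln(i), '..', '01'), yymmdd8.);
    end;
    format f1-f5 date9.;
    drop i d1-d5;
    /* give the numeric versions the original names */
    rename %rename_loop(5);
run;
```

The explicit d1-d5 and f1-f5 lists are used instead of the d: and f: prefixes so that unrelated variables starting with the same letter are not dropped or reformatted by accident.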
Upvotes: 1