Reputation: 1
I have a group of data sets where certain variables have been defined as having lengths >2000 characters. What I want to do is create a macro that identifies these variables and then creates a set of new variables to hold the values.
doing this in base code would be something like:
data new_dset;
set old_dset:
length colnam1 colnam2 colnam3 2000.;
colnam1 = substr(long_column,1,2000);
colnam2 = substr(long_column,2001,2000);
run;
I can build up the list of variable names and lengths as a set of macro variables, But I don't know how to create the new variables from the macro variables.
What I was thinking it would look like is:
%macro split;
data new_dset;
set old_dset;
%do i = 1%to &num_cols;
if &&collen&i > 2000 then do;
&&colnam&i 1 = substr(&&colnam&i,1,2000);
end;
%en;
run;
%mend;
I know that doesn't work, but that's the idea I have.
If anyone can help em work out how I can do this I would be very grateful.
Thanks
Bryan
Upvotes: 0
Views: 5089
Reputation: 63434
Your macro doesn't need to be an entire data step. In this case it's helpful to see exactly what you're replicating and then write a macro based on that.
So your code is:
data new_dset;
set old_dset:
length colnam1 colnam2 colnam3 2000.;
colnam1 = substr(long_column,1,2000);
colnam2 = substr(long_column,2001,2000);
run;
Your macro then really needs to be:
length colnam1 colnam2 colnam3 2000.;
colnam1 = substr(long_column,1,2000);
colnam2 = substr(long_column,2001,2000);
So what you can do is put that in a macro:
%macro split(colname=);
length &colname._1 &colname._2 $2000;
&colname._1 = substr(&colname.,1,2000);
&colname._2 = substr(&colname.,2001,4000);
%mend;
Then you generate a list of calls:
proc sql;
select cats('%split(colname=',name,')') into :calllist separated by ' '
from dictionary.columns
where libname = 'WORK' and memname='MYDATASET'
and length > 2000;
quit;
Then you run them:
data new_dset;
set old_dset;
&calllist;
run;
Now you're done :) &calllist contains a list of %split(colname) calls. If you may need more than 2 variables (ie, > 4000 length), you may want to add a new parameter 'length'; or if you're in 9.2 or newer you can just use SUBPAD instead of SUBSTR and generate all three variables for each outer variable.
Upvotes: 1