Reputation: 23
I am using the code below to loop through a directory and find all the .xlsx files in each subfolder. I need to adapt it to take the path to each found file and stack it into a dataset as it runs.
data thisfile;
set _NULL_;
run;
%macro drive(dir,ext);
%local filrf rc did memcnt name i;
/* Assigns a fileref to the directory and opens the directory */
%let rc=%sysfunc(filename(filrf,&dir));
%let did=%sysfunc(dopen(&filrf));
/* Make sure directory can be open */
%if &did eq 0 %then %do;
%put Directory &dir cannot be open or does not exist;
%return;
%end;
/* Loops through entire directory */
%do i = 1 %to %sysfunc(dnum(&did));
/* Retrieve name of each file */
%let name=%qsysfunc(dread(&did,&i));
/* Checks to see if the extension matches the parameter value */
/* If condition is true print the full name to the log */
%if %qupcase(%qscan(&name,-1,.)) = %upcase(&ext) %then %do;
%put &dir\&name;
data thisfile_&i.;
format filename $100.;
set thisfile;
filename = "&dir.\&name.";
run;
%end;
/* If directory name call macro again */
%else %if %qscan(&name,2,.) = %then %do;
%drive(&dir\%unquote(&name),&ext)
%end;
%end;
/* Closes the directory and clear the fileref */
%let rc=%sysfunc(dclose(&did));
%let rc=%sysfunc(filename(filrf));
%mend drive;
/* First parameter is the directory of where your files are stored. */
/* Second parameter is the extension you are looking for. */
%drive(v:\,xlsx)
The code works as intended, but I need the result of the line '%put &dir\&name;' to feed into a dataset instead of just printing to the log, so that when the code is done running I will have a dataset with one observation for each .xlsx file.
This seems straightforward enough, but when I attempt to have it create a dataset and initialize a 'filename' variable using &dir\&name, I get a dataset with zero observations. My log results indicate the macro argument is working properly but it doesn't write an observation:
MPRINT(DRIVE): data thisfile_1;
MPRINT(DRIVE): format filename $100.;
MPRINT(DRIVE): set thisfile;
MPRINT(DRIVE): filename = "v:\\MYFILENAME.xlsx";
MPRINT(DRIVE): run;
NOTE: There were 0 observations read from the data set WORK.THISFILE.
NOTE: The data set WORK.THISFILE_1 has 0 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
I know that the code as written will not 'stack' all my results into one set, but I was just trying to get it up and running.
Upvotes: 1
Views: 2465
Reputation: 51621
If you want to have the results in a data step then just call the functions in the data step.
%macro drive(dir,ext,dsn);
data &dsn ;
length dname $256 fname $256 ;
keep dname fname ;
dname=symget('dir');
length filrf $8 ;
rc=filename(filrf,dname);
did=dopen(filrf);
if did then do i=1 to dnum(did);
fname=dread(did,i);
if index(fname,'.') and upcase(scan(fname,-1,'.'))=%upcase("&ext") then output;
end;
else put 'ERROR: Directory ' dname=:$quote. 'cannot be opened.';
rc=filename(filrf);
run;
%mend drive;
%drive(dir=~/test,ext=xlsx,dsn=out)
Upvotes: 0
Reputation: 21294
Why do you have a `SET' at all. Remove that and try to see if your data is created the way you want. You can add a proc append at the end to append all the datasets or use the shortcut data reference thisfile: to append them.
data thisfile_&i.;
format filename $100.;
filename = "&dir.\&name.";
run;
Outside of macro:
data all_files;
set thisfile:;
run;
Upvotes: 1