Reputation: 55
I am using the below code to import multiple .XLSX files from a folder at once. It works great, except I am not able to distinguish between each file once it has been imported and merged. Each Excel file is data for a specific year and I need to know the year for my analysis.
Is there any way to create a variable on each import that is equal to the file name so I can distinguish between the files once they are merged?
I have done this the long way, by importing each file individually and creating a variable in each data set before the merge, but I looking for a quicker way to accomplish this.
filename indata pipe 'dir X:\"Projects"\"Source Data" /b ';
/* put all the .xlsx file names in dataset file_list */
data file_list;
length fname $200 in_name out_name $200;
infile indata truncover;
input fname $ 200.;
in_name=translate(scan(fname,1,'.'),'_','-');
out_name=cats('_',in_name);
run;
data _null_;
set file_list end=last;
call symputx(cats('dsn',_n_),in_name);
call symputx(cats('outdsn',_n_),out_name);
if last then call symputx('n',_n_);
run;
%macro test;
%do i=1 %to &n;
PROC IMPORT OUT= work.&&outdsn&i.
DATAFILE= "X:\Projects\Source Data\&&dsn&i...xlsx"
DBMS=XLSX REPLACE;
Range="FundingByYear$A3:J5249";
GETNAMES=YES;
MIXED=NO;
RUN;
%end;
%mend;
%test
data final;
set _:;
run;
Upvotes: 0
Views: 281
Reputation: 51611
You can get the name of the source DATASET using the INDSNAME option on the SET statement.
data final;
length dsname temp_dsname $41 ;
set _: indsname=temp_dsname;
dsname=temp_dsname;
run;
You could then use that to merge with your FILE_LIST dataset if your wanted information on the source file.
Upvotes: 1