Reputation: 141
I have many excel files with differnt amounts of columns and rows. I would like to import them into SAS, but I want to delete the first row, the first column and the two last columns of all of them.
I thought doing this by using the "range"-command in the import-step. But there, I have to use a specific range (for example B2:D5).
Any ideas?
Best
Upvotes: 0
Views: 2236
Reputation: 51621
If you are using PROC IMPORT DBMS=XLSX
you can limit which part of the file it reads using the RANGE=
optional statement. But you need to specify the range in the format "sheetname$start:end", so you need to know the name of the sheet within the XLSX file that you want to read.
You can use the XLSX libname engine to get the names of the sheets in the XLSX file.
libname in xlsx "&filename" ;
proc contents data=in._all_ noprint out=contents; run;
This will also allow you to know how many columns there are.
proc sql noprint ;
select memname,max(varnum) into :sheetname trimmed,:ncols trimmed
from contents group by 1 ;
quit;
To find out how many rows there are you will need to read the data.
data _null_;
if eof then call symputx('nrows',_n_);
set in.&sheetname end=eof;
run;
With this information you could then construct the RANGE= option for you PROC IMPORT.
Note that you will need to translate the number of columns into the XLSX column notation (A,...,Z,AA,...,AZ). So if your sheet had 26 columns then last column is column Z and the to eliminate the last two you would read through column X.
Let's just assume you have translated %eval(&ncols-2)
into the appropriate column label and stored the result in the macro variable LASTCOL.
So your final PROC IMPORT now becomes:
proc import dbms=xlsx datafile='&filename" replace out=want ;
range="&sheetname$b2:&lastcol.%eval(&nrows-1)" ;
run;
Upvotes: 0
Reputation: 9109
proc import datafile='~/testxl.xlsx' out=test replace dbms=xlsx;
getnames=no;
range='MYRANGE';
run;
Something like this.
Upvotes: 0