QuantumMechanics
QuantumMechanics

Reputation: 141

Importing excel file with specific range

I have many excel files with differnt amounts of columns and rows. I would like to import them into SAS, but I want to delete the first row, the first column and the two last columns of all of them.

I thought doing this by using the "range"-command in the import-step. But there, I have to use a specific range (for example B2:D5).

Any ideas?

Best

Upvotes: 0

Views: 2236

Answers (2)

Tom
Tom

Reputation: 51621

If you are using PROC IMPORT DBMS=XLSX you can limit which part of the file it reads using the RANGE= optional statement. But you need to specify the range in the format "sheetname$start:end", so you need to know the name of the sheet within the XLSX file that you want to read.

You can use the XLSX libname engine to get the names of the sheets in the XLSX file.

libname in xlsx "&filename" ;
proc contents data=in._all_ noprint out=contents; run;

This will also allow you to know how many columns there are.

proc sql noprint ;
  select memname,max(varnum) into :sheetname trimmed,:ncols trimmed
  from contents group by 1 ;
quit;

To find out how many rows there are you will need to read the data.

data _null_;
  if eof then call symputx('nrows',_n_);
  set in.&sheetname end=eof;
run;

With this information you could then construct the RANGE= option for you PROC IMPORT.

Note that you will need to translate the number of columns into the XLSX column notation (A,...,Z,AA,...,AZ). So if your sheet had 26 columns then last column is column Z and the to eliminate the last two you would read through column X.

Let's just assume you have translated %eval(&ncols-2) into the appropriate column label and stored the result in the macro variable LASTCOL.

So your final PROC IMPORT now becomes:

proc import dbms=xlsx datafile='&filename" replace out=want ;
  range="&sheetname$b2:&lastcol.%eval(&nrows-1)" ;
run;

Upvotes: 0

data _null_
data _null_

Reputation: 9109

proc import datafile='~/testxl.xlsx' out=test replace dbms=xlsx;
   getnames=no;
   range='MYRANGE';
   run;

Something like this.

Upvotes: 0

Related Questions