B--rian
B--rian

Reputation: 5870

Quick way to determine the names of the sheets of an Excel file using SAS?

Is there a quick way to determine the names of Excel sheets in an Excel file without including the Excel-file as library? I just need an array with all the names of the sheets, not the data from the sheets itself, ideally in a form like I could mock as follows:

%let my_sheet_names = Sheet1 Sheet2 Sheet5; /* Assuming that there are no spaces in the tab sheet names */
%let my_sheet_number = 3;

Background

I am having a huge XLSX file from which I have to extract a few cells on a few sheets. So right I converted them to XLS so that the following (historical) structure works:

%do i=1 %to &my_sheet_number.;
   %let my_sheet = %scan(&my_sheet_names.,&i.);
   proc import 
        data= &full_path.;
        dbms= xls   out=work.raw_value replace;
        getnames=NO;
        range="&mysheet.$E1:E1";
   run;
   proc sql noprint;
        select * into :tmp;
        from work.raw_value;
   quit;
   /* Do something with &tmp.; */
     
%end;

I need the overall list of Excel sheets all_sheets upfront, in order to find the intersection of all_sheets with another list chosen_sheets, which then is my_sheet_names from above, containg the sheets I want to loop over. Reading in the complete XLS(X) with all sheets does not sound very practical for me, if I can extract the cell directly.

References

Related, but not solving my problem:

Upvotes: 0

Views: 1670

Answers (2)

Joe
Joe

Reputation: 63424

If you're going to use Excel files this way, you may want to consider the SAS Add-In for Microsoft Office, which is included in some SAS packages (typically the "BI" server type packages, or the Office Analytics package I think). This allows you to run SAS code from excel, which makes it much easier to work with data from individual cells.

If you don't go that route, you may want to consider a "view" excel worksheet or workbook. That would be a workbook with references to the bits you want, structured in a way SAS can more easily read it.

Imagine if you have one workbook with three sheets, SHEET1, SHEET2, and SHEET3, and you want SHEET1!A1, SHEET2!B4, and SHEET3!C3. You could then make a new sheet, SHEET4, where:

A1 = 'Q1 Data'
B1 = SHEET1!A1 
A2 = 'Q2 Data'
B2 = SHEET2!B4
A3 = 'Q3 Data'
B3 = SHEET3!C3

Then you just import SHEET4, and you have all of your data - and even if the data changes it still works.

This would be a good solution for example if you had weekly or whatever data dumps, and always wanted the same cell, but the workbooks changed. You could have a single workbook that used excel formulas to figure out what workbook was right for each week, and then populated a single worksheet with all of the data via references. Then you just import that workbook each week and you get the right data. You might need one manual intervention - refreshing the data connections in the single woorkbook; I'm not sure if SAS will force that on import or not - but it would be relatively easy to do programmatically using a scripting language and/or Excel (VBA) macros.

Upvotes: 1

Richard
Richard

Reputation: 27508

Is there a reason you can't utilize a LIBNAME MyData EXCEL "<filename>"; ?

The LIBNAME only creates a LIBREF that is a nickname pointing to a data source. The members of a library can be obtained in Proc SQL using DICTIONARY.TABLES

Example:

* create a sample workbook that has several sheets (one per name);

ods excel file='sandbox.xlsx' options(sheet_name="#BYVAL1");

options nobyline;
ods noresults; /* prevent automatic opening when destination closed */

proc print data=sashelp.class;
  by name;
run;

ods excel close;

/*
 * Determine sheet names from DICTIONARY.TABLES
 * The metadata for a sheet name is the value in MEMNAME column
 * The memnames have trailing dollar sign ($)
 * The $ should be familiar to Excel users that understand
 *   <workbook>!<sheetname>$<cell-reference> syntax
 */

libname students EXCEL 'sandbox.xlsx';

proc sql;
  reset noprint;
  select compress(memname,'$') into :SHEET_NAMES separated by ' '
  from dictionary.tables
  where libname = "STUDENTS"
  ;

libname students;

%put &=SHEET_NAMES;

Log

SHEET_NAMES=Alfred Alice Barbara Carol Henry James Jane Janet Jeffrey John Joyce Judy Louise Mary
Philip Robert Ronald Thomas William

Upvotes: 2

Related Questions