Dave
Dave

Reputation: 23

SAS: Looping Through Folders to Import and Export Multiple Files

Thanks in advance for any and all suggestions.

I am working in SAS for the first time to complete a (theoretically) simple task. I have a parent folder in a Windows directory which contains several sub-folders. The sub-folders are not systematically named. For example, if the parent folder is called "W:/Documents/ParentFolder/", then the sub-folders might be "W:/Documents/ParentFolder/ABC1D26/" and "W:/Documents/ParentFolder/HG34A/".

Each sub-folder contains several SAS datasets. In any particular sub-folder, some of the SAS datasets have the .sas7bdat extension and others have the .sd2 extension. Furthermore, no two sub-folders necessarily have the same number of datasets, and the datasets are not systematically named either.

I would like to write a program in SAS which looks inside each sub-folder, loads any .sas7bdat or .sd2 datasets it finds, and exports the dataset into a different folder as a .dta file.

There are too many SAS datasets in each sub-folder to do this task manually for each dataset, but there are not so many sub-folders that I cannot feed the sub-folder names to SAS manually. Below is a commented version of my attempt at a program which completes this task. Unfortunately, I encounter many errors, no doubt due to my inexperience with SAS.

For example, SAS gives the following errors: "ERROR: Invalid logical name;" "ERROR: Error in the FILENAME statement;" and "ERROR: Invalid DO loop control information;" among others.

Can anyone offer any advice?

%macro sas_file_converter();

/* List the sub-folders containing SAS files in the parent folder */
%let folder1 = W:\Documents\ParentFolder\ABC1D26;
%let folder2 = W:\Documents\ParentFolder\HG34A;

/* Start loop over the sub-folders. In each sub-folder, identify all the files, extract the file names, import the files, and export the files. */
%do folder_iter = 1 %to 2;

    /* Define the sub-folder that is the focus of this iteration of the loop */
    filename workingFolder "&&folder&folder_iter..";
    
    /* Extract a list of datasets in this sub-folder */
    data datasetlist;
        length Line 8 dataset_name $300;
        List = dopen('workingFolder');
        do Line = 1 to dnum(List);
            dataset_name = tranwrd(tranwrd(lowcase(trim(dread(List,Line))),".sas7bdat",""),".sd2","");
            output;
        end;
        drop List Line;
    run;

    /* Get number of datasets in this sub-folder */
    proc sql nprint;
        select count(*)
        into :datasetCount
        from WORK.datasetlist;
    quit;

    /* Loop over datasets in the sub-folder. In each iteration of the loop, load the dataset and export the dataset. */
    %do dataset_iter = 1 %to &datasetCount.;

        /* Get the name of the dataset which is the focus of this iteration */
        data _NULL_;
            set WORK.DATASETLIST (firstobs=&dataset_iter. obs=&dataset_iter.);
            call symput("inMember",strip(dataset_name));
        end;

        /* Set the libname */
        LIBNAME library '&folder&folder_iter..';

        /* Load the dataset */
        data new;
            set library.&inMember.;
        run;

        /* Export the dataset */
        proc export data=library.&inMember.
            file = "W:\Documents\OutputFolder\&inMember..dta"
            dbms = stata replace;
        run;
    %end;
%end;
%mend;

Upvotes: 1

Views: 2571

Answers (2)

Dave
Dave

Reputation: 23

Thanks very much for your helpful suggestions. I used the following program to perform this task. It is largely based on Richard's example. I'm posting it here for the benefit of future readers; Richard's example includes additional code that may help you understand what this program does.

Additional files/folders can be accommodated by adding them to the "%let folders" line. (I write many file/folder names here.)

Note that I separate the sub-folders with three dashes ("---") because some of the files and sub-sub-folders have spaces in their names. Note also that for the .sd2 files, I was able to simply replace the instances of "sas7bdat" with "sd2" and the program worked fine.

Thanks again.

%let inputfolder = W:\Documents\ParentFolder;
%let folders = ABC1D26---HG34A---Sub Folder\ZH323;
%let exportfolder = W:\Documents\ExportFolder;

data _null_;
    do findex = 1 to countw("&folders.","---");
        folder = scan("&folders", findex, "---");

        path = catx("/", "&dataroot.", folder);
        call execute ('libname user ' || quote(trim(path)) || ';');

        length fileref $8;
        call missing(fileref);
        rc = filename(fileref, path);

        did = dopen(fileref);
        do dindex = 1 to dnum(did);
            filename = lowcase(dread(did,dindex));
            if scan(filename,-1) ne 'sas7bdat' then continue;

            xptfilename = tranwrd(filename, '.sas7bdat', '.dta')
            xptfilepath = catx("\", "&exportpath", folder, xptfilename);
            
            datasetname = tranwrd(filename, '.sas7bdat', '');
       
            sascode = 'PROC EXPORT data=' || trim(datasetname) || " replace file=" || quote(trim(xptfilepath)) || " dbms=stata; run;";

            call execute (trim(sascode));
        end;
        did = dclose(did);
      
        call execute ('libname user clear;');

        rc = filename(fileref);
    end;
run;

Upvotes: 1

Richard
Richard

Reputation: 27508

You can perform all the code generation in a DATA Step and submit via CALL EXECUTE. The only part of the program that would be macro related is specifying the sas data root folder, the names of the sub-folders to search and the export path.

The program could be very similarly macro coded, but could be tougher to debug, and would require %sysfunc wrappers around the function calls.

Example:

/* Create some sample data in some example folders */

%let workpath = %sysfunc(pathname(WORK));

%let name = %sysfunc(dcreate(ABC, &workpath));
%let name = %sysfunc(dcreate(DEF, &workpath));

libname user "&workpath./ABC";
data one two three four five;
  set sashelp.class;
run;

libname user "&workpath./DEF";
data six seven eight nine ten;
  set sashelp.class;
run;

libname user clear;

/* export all data sets in folders to liked named export files */

%let dataroot = &workpath;
%let folders = ABC DEF;
%let exportpath = c:\temp;

data _null_;
  do findex = 1 to countw("&folders");
    folder = scan("&folders", findex);

    path = catx("/", "&dataroot.", folder);
    call execute ('libname user ' || quote(trim(path)) || ';');

    length fileref $8;
    call missing(fileref);
    rc = filename(fileref, path);

    did = dopen(fileref);
    do dindex = 1 to dnum(did);
      filename = dread(did,dindex);
      if scan(filename,-1) ne 'sas7bdat' then continue;

      xptfilename = tranwrd(filename, '.sas7bdat', '.dta');
      xptfilepath = catx('/', "&exportpath", xptfilename);

      datasetname = tranwrd(filename, '.sas7bdat', '');

      sascode = 'PROC EXPORT data=' || trim(datasetname)
      || " replace file=" || quote(trim(xptfilepath))
      || " dbms=stata;"
      ;

      call execute (trim(sascode));      
    end;
    did = dclose(did);

    call execute ('run; libname user clear;');

    rc = filename(fileref);
  end;
run;

Upvotes: 0

Related Questions