Why are my SAV variable names not being exported from SAS Studio correctly?

I am using SAS Studio. I am exporting a df with 57 Observations and 1151 variables. I am using proc export with DBMS = SAV to convert the file into a SAV and move it to my library Folder. I am able to download the file to my machine and open it in SPSS, although only a small portion of the variable names transfer to SPSS.

I ran this code:

proc export data=import outfile='/home/u63943531/Imported Data' dbms=sav; run;

And got "Imported Data.sav" in my library. I then right clicked and selected "Download File." The file downloads correctly, although most of the variable names when opened in SPSS are auto generated. I want the variable names that were in the SAS df.

It is fine if it remains in my work folder.

Here is the code used to generate the dataset

%web_drop_table(WORK.IMPORT);
FILENAME REFFILE '/home/u63943531/Imported Data/CSV_Desktop_Version_2024-07-10-210526449754432448585360030.csv';
PROC IMPORT DATAFILE=REFFILE
    DBMS=CSV
    OUT=WORK.IMPORT;
    GETNAMES=YES;
RUN; 
PROC CONTENTS DATA=WORK.IMPORT; 
RUN;
%web_open_table(WORK.IMPORT);

Upvotes: 0

Views: 59

Answers (1)

Tom
Tom

Reputation: 51581

Your problem does not appear to be related to PROC EXPORT or to the creation of SPSS SAV files.

Instead you have not created the SAS dataset properly. To read a text file into a SAS dataset is very simple. Just write a data step that defines and reads in the variables. The data step can be very simple.

Use INFILE to point to the file skipping the header row, use DSD option to properly handle a delimited file and the TRUNCOVER option to handle any lines with empty last variable.

Use LENGTH (or perhaps ATTRIB with the LENGTH= option) to DEFINE the type and storage length of each variable. Define them in the order they appear in the file and the later INPUT statement will be simpler. If you want to use the header row to build the LENGTH statement then just copy and paste it into your program and remove the commas (and possible convert the header values into actual variable names) and insert the desired storage length.

Attach formats and/or informats to any variables that NEED them. Most variables will NOT need them. The main exception are DATE, TIME or DATETIME values.

Use INPUT statement to read the variables. And if you have defined the variables in the same order as they appear in the text file the INPUT statement can just use a simple positional variable list to specify the variables to be read.

Example:

 data want;
   infile reffile dsd firstobs=2 truncover ;
   length var1 8 var2 $20 ..... varlast 8 ;
   informat datevar date.;
   format datevar date9.;
   input var1 -- varlast ;
 run;

If you must use a guessing procedure to guess how to define the variables then remember that PROC IMPORT can only "see" the first 32K bytes of the header row.

So if you have really long header row instead read the data skipping the header row

 proc import dbms=csv file=reffile out=import replace ;
    getname = NO ;
    datarow = 2;
   guessingrows = max ;
 run;

You can then read the header row separately.

 data headers;
   infile reffile dsd obs=1 lrecl=1000000 ;
   varnum+1;
   input label :$256. @@;
   name = substr(label,1,32);
 run;

Which you can then use to rename the variables if you want.

Or just use another guessing tool that does not have the 32K limit on the length of the header row.

Such as this macro: https://github.com/sasutils/macros/blob/master/csv2ds.sas

filename reffile '/home/u63943531/Imported Data/CSV_Desktop_Version_2024-07-10-210526449754432448585360030.csv'
  lrecl=1000000;
%csv2ds(reffile,out=import,replace=1)

Upvotes: 1

Related Questions