Elise
Elise

Reputation: 13

How to choose row for variable names in SAS?

I want to import a text-file where the variable names are in the fourth row, and the data starts on row 7, i also want to delete the last three rows. I want the variable names to be the year (y2011 etc).

The text file looks like this:




                            ;y2011     ;y2012     ;y2013     ;y2014
                         --------  --------  --------  -------- 

age  ; 25  ;50  ;36  ;54  
wage  ; 30054  ; 34052  ; 50235  ; 143923  
gend  ;  1  ;  0  ;  0  ;  1
empl  ; 0  ; 0  ; 1  ; 1




proc import datafile="windows/user/data_set.txt"
    dbms=dlm
    out=data_set
    replace;
datarow=7
    delimiter=';';
    getnames=no;
run;

Upvotes: 0

Views: 1424

Answers (1)

Tom
Tom

Reputation: 51621

You cannot get PROC IMPORT to look for names on a row that is not the one immediately before the row where the data starts.

You can read the names separately and then use the values read to generate RENAME statement or RENAME= dataset option.

Example:

proc import file=csv dbms=csv out=names replace;
  delimiter=';';
  datarow=4;
  getnames='NO';
run;

proc transpose data=names(obs=1) out=renames ;
  var _all_;
run;

proc sql noprint;
  select catx('=',nliteral(_name_),nliteral(col1)) 
    into :renames separated by ' '
    from renames
    where upcase(_name_) ne upcase(col1)
      and col1 is not null
  ;
quit;

proc import file=csv dbms=csv out=want(rename=(&renames)) replace;
  delimiter=';';
  datarow=7;
  getnames='NO';
run;

Result:

Obs    VAR1           y2011           y2012           y2013           y2014

 1     age               25              50              36              54
 2     wage           30054           34052           50235          143923
 3     gend               1               0               0               1
 4     empl               0               0               1               1

In general there is no need to use PROC IMPORT to read a delimited file. Just write your own data step and it is much less work. For this example file you could do something like below to not only read the values but also transpose it so that AGE and WAGE are variables instead of observations.

data sideways;
  infile csv dsd dlm=';' truncover firstobs=7;
  input _name_ :$32. @;
  do year=2011 to 2014;
    input value @;
    output;
  end;
run;

proc sort;
  by year _name_;
run;

proc transpose data=sideways out=want(drop=_name_);
  by year;
  id _name_;
  var value;
run;

Result:

Obs    year    age    empl    gend     wage

 1     2011     25      0       1      30054
 2     2012     50      0       0      34052
 3     2013     36      1       0      50235
 4     2014     54      1       1     143923

Upvotes: 2

Related Questions