Thomas Henson
Thomas Henson

Reputation: 11

Drop a variable from a SAS dataset based on a condition (IF THEN DO)

So I've done some searching around online but haven't managed to find anything that can solve this problem. Essentially, I have been given a dataset that I've then split into individual dataset's based on name.

However, if the person is a female, the age needs to be omitted from the dataset. Example output:

Males

Name Age Weight Height

Females

Name Weight Height

I have tried the following IF statement, but it just seems to drop the age variable from both the male and female tables:

if sex="F" then do;
    drop age;
end;

I'm fairly new to SAS so any help would be greatly appreciated!

Upvotes: 1

Views: 4827

Answers (3)

ishmo65
ishmo65

Reputation: 1

Not sure whether you're just asking for the Female Age (value) not to be put into field/column but below is my solution to dropping the Age field completed when any F(emale) row is present using macro variables. It's less efficient (two passes of the data) but does what's required.

options source source2 mprint mprintnest mlogicnest mlogicnest;

%let F_count=0;
proc sql;
    select count(*) as count_f
    into :F_count
    from sashelp.class
    where sex = 'F'
    ;
quit;
/* * reallocate to trim the value returned (sql returns a right-justified valued into the macro variable) */
%let F_count=&F_count.;
%put &=F_count.;

data class;
  set sashelp.class;
  %if "&F_count." ne "0" %then
    %do;
        drop age;
    %end;
run;

The logic can be tested to "keep" the Age field by changing

where sex = 'F'

to

where sex = 'X'

(changing the name of the variable F_count to X_count is not necesssary).

Upvotes: 0

Tom
Tom

Reputation: 51566

The DROP statement cannot be run conditionally. You need to conditionally generate the DROP statement (or DROP= dataset option).

To use a trivial example dataset let's start with SASHELP.CLASS and split it into individual datasets. Note that this dataset only has one observation per NAME, but I will add BY group processing to the code generation step so you can see how you could use it in the case where there are multiple observations per name.

First let's generate code for single DATA statement that creates multiple output datasets. Based on the value of the SEX variable it will conditionally add a DROP= dataset option.

filename code temp;
data _null_;
  set sashelp.class end=eof ;
  by name ;
  file code ;
  if _n_=1 then put 'data' ;
  if first.name then do;
    put '  ' name @ ;
    if sex='F' then put '(drop=age)' @ ;
    put ;
  end;
  if eof then put ';' ;
run;

Now let's append the code for the rest of the data step that will read the source dataset and output the records to the appropriate dataset.

data _null_;
  set sashelp.class end=eof ;
  by name ;
  file code mod ;
  if _n_=1 then put '  set sashelp.class; ' ;
  if first.name then put '  if name =' name $quote. 'then output ' name ';' ;
  if eof then put 'run;' ;
run;

Finally run the generated code.

%include code / source2 ;

Upvotes: 0

user667489
user667489

Reputation: 9569

When you run a data step in SAS, some statements are processed during compilation, and others subsequently during execution. In this case, the drop statement is processed before your if-then logic, so you can't use it to conditionally drop a column.

Alternatively, you could output a missing value for age for each affected row, e.g.

if sex = 'F' then call missing(age);

Or you could use a drop clause on one output dataset but not the other:

data boys girls(drop=age);
  set sashelp.class;
  if sex = 'F' then output girls;
  else if sex = 'M' then output boys;
run;

Upvotes: 5

Related Questions