David Lee
David Lee

Reputation: 129

Cannot split/subset data after format numeric to character

In the original dataset, the gender column was coded as 1 or 2. After using a user-defined format, the gender was showed as 'female' and 'male'. I want to split the the dataset to two dataset which contains 'female' or 'male' only. The following code does not work.

data test;
input gender age;
CARDS;
1 43
2 43
1 55
2 56
;run;

proc format;
   value GENDER
      1 = 'Female'  
      2 = 'Male' ;run;

proc datasets;
modify test;
   format gender GENDER.;run;

data female male;set test;
if gender = 'Female' then output female;
if gender = 'male' then output male;run;

I will get an error as

NOTE: Character values have been converted to numeric values at the places given by:
      (Line):(Column).

But if I changed the last part to following code. It will work.

data female male;set test;
if gender = 1 then output female;
if gender = 2 then output male;run;

I know the problem is come from comparing character and numeric value. But I wonder is there anyway that I can use the 'female' and 'male' in the comparison? As in some situation it is really annoying to check every definition of user-defined format.Thanks.

Upvotes: 1

Views: 96

Answers (1)

Reeza
Reeza

Reputation: 21274

Not really. Formats apply to appearance only, so the underlying values are required. You can do a conversion to character using PUT() to use the formatted values if desired.

If Put(gender, gender.) = 'male' then output male;

Also, please note it's very rare to require your SAS datasets be split into subsets. Usually using BY is more efficient.

EDIT: Based on comments below

The correct method is to use VVALUE which will return the formatted value of a variable, but you don't need to know the user defined format name, or the variable type.

if vvalue(gender) = 'male' then output male;

Upvotes: 2

Related Questions