Akos
Akos

Reputation: 870

Performing sas statements on formatted values not true value

I have a dataset with two formatted variables, one which is a raw code (representing the name of a country) and one which is a translation of that code into a standardised code for countries. Both of these use formats which show the country name.

I want to output values where the formatted values are different, regardless of the true value. For example, if I have the following data (formats shown in square brackets):

obs    Country_raw[formatted value]        Country_std[formatted value]  
1      2211[Spain]                         3108[Spain]
2      9122[Zaire]                         9108[Democratic Republic of Congo]

For both records, neither of the true values match, but I want to output only the second record, where the formatted values do not match.

I tried

data diffs;
set countries;
format country_raw $CRAW.;
format country_std $CSACC.;
if country_raw ne country_std THEN OUTPUT;
run;

but this uses the true values.

I've also tried using proc print:

proc print data=countries;
format country_raw $CRAW.;  
format country_std $CSACC.;
where country_raw ne country_std;
run;

But this also operates on the true values.

Upvotes: 1

Views: 478

Answers (2)

Joe
Joe

Reputation: 63424

In the data step, you can use the VVALUE function, like below:

proc format;
  value countryF
  1='Spain'
  2='Zaire';
  value country2F
  4='Spain'
  6='Zaire'
  7='Morocco';
quit;

data have;
  input country_raw country_std;
  format country_raw COUNTRYF.
         country_std COUNTRY2F.;
  datalines;
1 4
2 7
;;;
run;

data want;
  set have;
  if vvalue(country_raw) = vvalue(country_std);
run;

VVALUE is not available in PROC SQL environments (including WHERE statements in procs or the data step), however, and you'd need to use PUT. VVALUE has the advantage that you don't need to know what the format name is on a particular field to see its formatted value.

Upvotes: 2

DomPazz
DomPazz

Reputation: 12465

Equality is tested on the underlying value, not the formatted value.

try

put(country_raw,$CRAW.) ^= put(country_std,$CSACC.)

as your condition.

The put() function converts the value from the stored value to the formatted value.

Upvotes: 2

Related Questions