user294448

Reputation: 5

Changing every occurrence of a character value to a numeric value across an entire data set

Suppose you import some file, and have the variables

name room# test1 test2 test3 

and n observations.

The name is a character string, the room number is a three-digit number, and test1 - test3 are scores from 0 to 100.

name room# test1 test2 test3
bob  123   90    40     33
bob2 123   a     40     90 
bob3 123   88    k      78
..    

Now, suppose I want to find all the instances of the letter k and replace them with the numeric value zero. How can this be done? I am doing this during the initial data step with IF-THEN statements, i.e.

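data temp;
   infile "....";
   /* scores are read as character so that a letter such as k survives the read */
   input name $ room $ test1 $ test2 $ test3 $;
   /* compare against the quoted literal 'k'; an unquoted k would be a variable */
   if test1 = 'k' then test1 = '0';
   if test2 = 'k' then test2 = '0';
   if test3 = 'k' then test3 = '0';
run;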

Is there a better way of doing this ?

Upvotes: 0

Views: 50

Answers (2)

Tom

Reputation: 51621

If you are reading from a text file, use the MISSING statement to tell SAS which letters it should read as special missing values. For example, the letter A will become the special missing value .A.

missing a k ;
data want;
  input name $ room $ test1 test2 test3;
datalines;
bob  123   90    40     33
bob2 123   a     40     90 
bob3 123   88    k      78
;;;;

Now if you want to convert your missing values to zero, there are multiple postings online on how to do that.
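One common way to do that conversion is an array loop in a second data step. This is only a minimal sketch: it assumes the want data set created above, and the names want_zero, tests and i are chosen for illustration. Only .K is set to zero here, since the question asks about k; .A is left as it is.

```sas
data want_zero;                 /* hypothetical output name */
  set want;                     /* data set read with the MISSING statement above */
  array tests test1-test3;      /* group the three score variables */
  do i = 1 to dim(tests);
    /* .K is the special missing value produced by the letter k */
    if tests[i] = .K then tests[i] = 0;
  end;
  drop i;                       /* loop counter is not needed in the output */
run;
```

Adding more variables to the array statement extends the same replacement to them without any further IF-THEN statements.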

Upvotes: 0

Joe

Reputation: 63434

You can use an informat to do this. An informat is how you tell SAS what value a text string should have when it is read into a numeric field. Here I assign .A to A values; .A is a special missing value (it is treated as missing, but displayed as "A" rather than ".").

If you wanted to keep A and K both as "missing", by the way, you wouldn't need an informat; a missing a k; statement would be enough, since it tells SAS to treat those two characters as their special missing values whenever it encounters them in a normal numeric read. But here you need the informat, since you want K to be 0.

I would be tempted to read both in as their special missing values anyway, by the way, and treat K like 0 later on, when you're computing with it; converting to 0 right away loses information.

proc format;
invalue gradei
0-100 = [3.]
'a'   = .A
'k'   = 0
other = .
;
quit;

data want;
informat test1-test3 gradei.;
input name $ room $ test1 test2 test3;
datalines;
bob  123   90    40     33
bob2 123   a     40     90 
bob3 123   88    k      78
;;;;
run;
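As for the alternative of keeping K as a special missing value and treating it like 0 only when computing, here is a rough sketch. It assumes, purely for illustration, that a K should count as a score of 0 in an average while an A should be left out of it; the names scores, total, n and average are made up for this example.

```sas
missing a k;   /* read the letters a and k as special missing values .A and .K */

data scores;
  input name $ room $ test1 test2 test3;
  array tests test1-test3;
  total = 0;
  n = 0;
  do i = 1 to dim(tests);
    /* assumption: .K counts as a test taken with a score of 0 */
    if tests[i] = .K then n = n + 1;
    /* ordinary scores are added; .A and . are skipped entirely */
    else if not missing(tests[i]) then do;
      total = total + tests[i];
      n = n + 1;
    end;
  end;
  if n > 0 then average = total / n;
  drop i;
datalines;
bob  123   90    40     33
bob2 123   a     40     90
bob3 123   88    k      78
;;;;
run;
```

The stored test2 value for bob3 stays .K, so the fact that a K was recorded is still available later, while the average is computed as (88 + 0 + 78) / 3.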

Upvotes: 2
