Reputation: 5
Suppose you import a file that has the variables name room# test1 test2 test3 and n observations.
The name is a name, the room number is a three-digit room number, and test1 - test3 are scored from 0 to 100.
name room# test1 test2 test3
bob 123 90 40 33
bob2 123 a 40 90
bob3 123 88 k 78
..
Now, suppose I want to find all instances of the letter k and replace them with the numeric value zero. How can this be done? I am currently doing this in the initial data step with if-then statements, i.e.
data temp;
infile "....";
if test1 = 'k' then test1 = 0;
if test2 = 'k' then test2 = 0;
if test3 = 'k' then test3 = 0;
run;
Is there a better way of doing this ?
Upvotes: 0
Views: 50
Reputation: 51621
If you are reading from a text file, use the MISSING statement to tell SAS which letters it should read as special missing values. So the letter A will become the special missing value .A.
missing a k ;
data want;
input name $ room $ test1 test2 test3;
datalines;
bob 123 90 40 33
bob2 123 a 40 90
bob3 123 88 k 78
;;;;
Now, if you want to convert your missing values to zero, there are multiple postings online on how to do that.
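As a minimal sketch of that last step, assuming only the k scores should become zero while a stays a special missing value, an array loop avoids one if statement per variable. The array name tests and the loop counter i are illustrative names, not part of the original post.

```sas
missing a k;   /* read the letters a and k as .A and .K */

data want;
  input name $ room $ test1 test2 test3;
  array tests {*} test1-test3;   /* hypothetical array name */
  do i = 1 to dim(tests);
    /* .K is the special missing value produced by the letter k */
    if tests{i} = .K then tests{i} = 0;
  end;
  drop i;
  datalines;
bob 123 90 40 33
bob2 123 a 40 90
bob3 123 88 k 78
;
run;
```

With more test columns, only the variable list in the array statement would need to change.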
Upvotes: 0
Reputation: 63434
You can use an informat to do this. I assign .A to A values, which is a special missing value (it will be treated as missing, but displayed as "A", not "."). An informat is how you tell SAS what value a text string should have when read into a numeric field.
If you wanted to keep A and K both as "missing", by the way, you wouldn't need an informat; just a missing a k; statement, which tells SAS to treat those two characters as their special missing values when it encounters them in a normal numeric read. But here you need the informat to treat them properly, since you want K to be 0.
I would be tempted to suggest reading both in as their special missing values anyway, by the way, and treating the K like 0 later on when you're computing with it; converting to 0 right away loses information.
proc format;
invalue gradei
0-100 = [3.]
'a' = .A
'k' = 0
other = .
;
quit;
data want;
informat test1-test3 gradei.;
input name $ room $ test1 test2 test3;
datalines;
bob 123 90 40 33
bob2 123 a 40 90
bob3 123 88 k 78
;;;;
run;
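For comparison, here is a rough sketch of the alternative mentioned above: read both letters as special missing values and treat K as 0 only at computation time. It assumes, purely for illustration, that the computation is a per-student average in which a k counts as a score of 0 and an a is left out. The names scored, tests, total, n_counted and average are hypothetical.

```sas
missing a k;   /* keep a and k as .A and .K in the data */

data scored;
  input name $ room $ test1 test2 test3;
  array tests {*} test1-test3;
  total = 0;
  n_counted = 0;
  do i = 1 to dim(tests);
    /* k counts as a test taken with a score of 0 */
    if tests{i} = .K then n_counted = n_counted + 1;
    else if not missing(tests{i}) then do;
      total = total + tests{i};
      n_counted = n_counted + 1;
    end;
    /* .A and ordinary missing values are skipped entirely */
  end;
  if n_counted > 0 then average = total / n_counted;
  drop i;
  datalines;
bob 123 90 40 33
bob2 123 a 40 90
bob3 123 88 k 78
;
run;
```

Under these assumptions bob2 is averaged over two tests and bob3 over three, and test1-test3 still show which scores were originally a or k.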
Upvotes: 2