I want to create a new variable in Stata, age_cat, based on age. My code:
gen age_cat = age
replace age_cat = "Adult" if age>18 | "0-5 years" if age >=0 & age<=5 | "6-11 years" if age>=6 & age<=11 | "12-18 years" if age>=12 & age<=18
I'm getting a type mismatch error:
type mismatch
r(109);
How do I fix this? I know encode will work for replacing string with numeric. Is there a way to do this the other way around?
Reputation: 37208
The immediate error is simple. You copied age to age_cat, so age_cat is evidently a numeric variable, just like age. You then tried to replace it with string values. That's a type mismatch, and Stata bailed out.
Alternatively, if age is really a string variable, then comparing it with numeric values such as 18 is also a type mismatch.
The rest of your syntax would have failed anyway. The logical operator | (or) can be used to combine conditions inside a single if qualifier. It can never be used to separate different results, and a command takes only one if qualifier.
Here is a slow way to do what you want, assuming age is numeric:
gen age_cat = "Adult" if inrange(age, 19, .)
replace age_cat = "12-18 years" if inrange(age, 12, 18)
replace age_cat = "6-11 years" if inrange(age, 6, 11)
replace age_cat = "0-5 years" if inrange(age, 0, 5)
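I've written the first line so that missing ages will not get classified as Adult. Stata treats numeric missing as larger than any number, so a plain age > 18 would be true for missing ages, whereas inrange(age, 19, .) is false for them.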
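A quick check of the result can be sketched as follows. This is only an illustration, assuming the four lines above have just been run and that age holds whole years; fractional ages such as 18.5 would fall between the ranges and make the second assert fail.

```stata
* Missing ages should have been left with an empty string
assert age_cat == "" if missing(age)

* Every nonmissing, nonnegative age should have a category
* (assumes whole-year ages; a value like 18.5 would be caught here)
assert age_cat != "" if age >= 0 & age < .

* Inspect the counts, including any empty category
tabulate age_cat, missing
```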
Here's another way to do it:
gen age_cat = cond(age > 18, "Adult", cond(age >= 12, "12-18 years", cond(age >= 6, "6-11 years", "0-5 years"))) if age < .
where I have made a trap for missing values explicit. cond() is the equivalent of ifelse() in various languages. See also this tutorial if you want more detail.
There is also recode, which appeals to many people.
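A minimal sketch of the recode route, assuming age is numeric and holds whole years. The names age_group and age_group_str are made up for illustration, and the numeric codes 1 to 4 are an arbitrary choice. Note that recode produces a numeric variable with value labels rather than a string; decode can then turn those labels into a string variable if one is really needed.

```stata
* Sketch only: age_group and age_group_str are hypothetical names
* (/// continues a line in a do-file; type it on one line interactively)
recode age (0/5 = 1 "0-5 years") (6/11 = 2 "6-11 years") ///
    (12/18 = 3 "12-18 years") (19/max = 4 "Adult"), generate(age_group)

* Optional: make a string variable from the value labels
decode age_group, generate(age_group_str)
```

Missing ages stay missing here, because max refers to the largest nonmissing value. A labelled numeric variable also keeps the categories in age order in tables and graphs, whereas strings sort alphabetically.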