Reputation: 3
I have a file with people's ages, and want to subset by age range (e.g. under 10, 35-44, etc.).
Subsetting ranges of double-digit ages works fine using grep:
X_35_44 <- X[ grep("35|36|37|38|39|40|41|42|43|44", X$Age) , ]
But when I try to subset anything under 10, e.g.:
X_10under <- X[ grep("0|1|2|3|4|5|6|7|8|9|10|", X$Age) , ]
I get back any age containing a 1 (e.g. 31), a 2 or a 3, rather than just the ages under 10.
How do I ensure that this doesn't happen?
Any help would be much appreciated!
Thanks in advance
Upvotes: 0
Views: 119
Reputation: 1706
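A solution with ifelse(). Convert the column to integer first and assign the result back, then nest the conditions (a chained comparison such as 10 < x < 20 is not valid R):
df$age <- as.integer(df$age)
df$age_cat <- ifelse(df$age < 10, "age_0-10", ifelse(df$age < 20, "age_10-20", "age_20-"))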
Choose your own range ...
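As a minimal sketch of how those categories could then be used for the subsetting asked about: the data frame `df`, its `age` column and the sample values below are made up for illustration, and a plain numeric comparison is shown alongside as an alternative to labelling first.

```r
# Hypothetical example data; the column name `age` is an assumption.
df <- data.frame(age = c(3, 9, 10, 15, 20, 31, 44))

# Nested ifelse(): the first condition that is TRUE decides the label.
df$age_cat <- ifelse(df$age < 10, "age_0-10",
              ifelse(df$age < 20, "age_10-20", "age_20-"))

# Subset by label, or skip the labels and compare the numbers directly.
under10  <- df[df$age_cat == "age_0-10", ]
age35_44 <- df[df$age >= 35 & df$age <= 44, ]

print(under10$age)   # [1] 3 9
print(age35_44$age)  # [1] 44
```

Because the comparison is numeric, 31 can never be mistaken for "contains a 1", which is what the grep pattern was doing.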
Upvotes: 1
Reputation: 263499
Using the principle of not accepting failed code, but rather delivering a more effective coding solution, I'm going to disagree with the regex strategy and suggest you instead use cut or findInterval.
X <- data.frame(Ages = sample(1:85, 300, replace=TRUE))
X$age_cat <- cut(X$Ages, c(0, 10, 45, 60, 75, Inf), labels=c("under10",
'10-44','45-59','60-74','75+'), right=FALSE, include.lowest=TRUE)
head(X)
#=========
  Ages age_cat
1   65   60-74
2   34   10-44
3   19   10-44
4   79     75+
5    5 under10
6   51   45-59
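The findInterval route mentioned above is not shown, so here is a rough sketch of what it might look like with the same breakpoints and labels. The vector names `ages`, `breaks` and `labs` and the fixed input values are assumptions chosen to hit the interval boundaries.

```r
# Same breakpoints and labels as the cut() call, with a small fixed input.
ages   <- c(5, 10, 44, 45, 59, 60, 74, 75, 85)
breaks <- c(0, 10, 45, 60, 75, Inf)
labs   <- c("under10", "10-44", "45-59", "60-74", "75+")

# findInterval() returns, for each age, the index i such that
# breaks[i] <= age < breaks[i + 1]; that index then picks the label.
idx     <- findInterval(ages, breaks)
age_cat <- labs[idx]

# cut() with right = FALSE uses the same left-closed intervals,
# so both approaches should agree on this input.
same <- as.character(cut(ages, breaks, labels = labs, right = FALSE))
stopifnot(identical(age_cat, same))

# Subsetting one group directly from the interval index.
under10 <- ages[idx == 1]
```

findInterval returns plain integer codes, which can be convenient when only the subset is needed, whereas cut returns a labelled factor.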
Upvotes: 1