Reputation: 1031
I want to recode a variable to missing if it takes on one of three values in dplyr. Consider the following data frame have
:
id married hrs_workperwk
1 1 40
2 1 55
3 1 70
4 0 -1
5 1 99
6 0 -2
7 0 10
8 0 40
9 1 45
-1, -2, and 99 are illegal values. The new data frame want
should look like this:
id married hrs_workperwk
1 1 40
2 1 55
3 1 70
4 0 NA
5 1 NA
6 0 NA
7 0 10
8 0 40
9 1 45
I could use base R to solve this quickly, but dplyr is often convenient to work in when I'm already using mutate()
. Alas, that means I currently use multiple nested if_else()
functions:
want <- mutate(have,
hrs_workperwk = if_else(hrs_workperwk < 0, as.numeric(NA),
if_else(hrs_workperwk = 99, as.numeric(NA), hrs_workperwk)))
Is there a way to do this with only one if_else() function? Ideally something like this:
want <- mutate(have,
hrs_workperwk = if_else(hrs_workperwk = c(-2, -1, 99), as.numeric(NA), hrs_workperwk))
Upvotes: 1
Views: 4437
Reputation: 47320
In base R:
df1$hrs_workperwk[df1$hrs_workperwk %in% c(-1,-2,99)] <- NA
or
is.na(df1$hrs_workperwk) <- df1$hrs_workperwk %in% c(-1,-2,99)
output for both cases:
# id married hrs_workperwk
# 1 1 1 40
# 2 2 1 55
# 3 3 1 70
# 4 4 0 NA
# 5 5 1 NA
# 6 6 0 NA
# 7 7 0 10
# 8 8 0 40
# 9 9 1 45
data
df1 <- read.table(text="
id married hrs_workperwk
1 1 40
2 1 55
3 1 70
4 0 -1
5 1 99
6 0 -2
7 0 10
8 0 40
9 1 45",h=T,strin=F)
Upvotes: 2
Reputation: 887158
We can use replace
df %>%
mutate(hrs_workperwk = replace(hrs_workperwk, hrs_workperwk %in% c(-1, -2, 99), NA))
# id married hrs_workperwk
#1 1 1 40
#2 2 1 55
#3 3 1 70
#4 4 0 NA
#5 5 1 NA
#6 6 0 NA
#7 7 0 10
#8 8 0 40
#9 9 1 45
Or another option is case_when
df %>%
mutate(hrs_workperwk = case_when(hrs_workperwk %in% c(-1, -2, 99)~ NA_integer_,
TRUE ~ hrs_workperwk))
Upvotes: 5
Reputation: 33782
You can use %in%
:
want <- have %>%
mutate(hrs_workperwk = ifelse(hrs_workperwk %in% c(-1, -2, 99), NA, hrs_workperwk))
Upvotes: 3