J.Q
J.Q

Reputation: 1031

Recode on multiple conditions dplyr

I want to recode a variable to missing if it takes on one of three values in dplyr. Consider the following data frame have:

id  married hrs_workperwk
1   1       40
2   1       55
3   1       70
4   0       -1
5   1       99
6   0       -2
7   0       10
8   0       40
9   1       45

-1, -2, and 99 are illegal values. The new data frame want should look like this:

id  married hrs_workperwk
1   1       40
2   1       55
3   1       70
4   0       NA
5   1       NA
6   0       NA
7   0       10
8   0       40
9   1       45

I could use base R to solve this quickly, but dplyr is often convenient to work in when I'm already using mutate(). Alas, that means I currently use multiple nested if_else() functions:

want <- mutate(have, 
hrs_workperwk = if_else(hrs_workperwk < 0, as.numeric(NA), 
                if_else(hrs_workperwk = 99, as.numeric(NA), hrs_workperwk)))

Is there a way to do this with only one if_else() function? Ideally something like this:

want <- mutate(have, 
hrs_workperwk = if_else(hrs_workperwk = c(-2, -1, 99), as.numeric(NA), hrs_workperwk))

Upvotes: 1

Views: 4437

Answers (3)

moodymudskipper
moodymudskipper

Reputation: 47320

In base R:

df1$hrs_workperwk[df1$hrs_workperwk %in% c(-1,-2,99)] <- NA

or

is.na(df1$hrs_workperwk) <- df1$hrs_workperwk %in% c(-1,-2,99)

output for both cases:

#   id married hrs_workperwk
# 1  1       1            40
# 2  2       1            55
# 3  3       1            70
# 4  4       0            NA
# 5  5       1            NA
# 6  6       0            NA
# 7  7       0            10
# 8  8       0            40
# 9  9       1            45

data

df1 <- read.table(text="
id  married hrs_workperwk
1   1       40
2   1       55
3   1       70
4   0       -1
5   1       99
6   0       -2
7   0       10
8   0       40
9   1       45",h=T,strin=F)

Upvotes: 2

akrun
akrun

Reputation: 887158

We can use replace

df %>%
  mutate(hrs_workperwk = replace(hrs_workperwk, hrs_workperwk %in% c(-1, -2, 99), NA))
#  id married hrs_workperwk
#1  1       1            40
#2  2       1            55
#3  3       1            70
#4  4       0            NA
#5  5       1            NA
#6  6       0            NA
#7  7       0            10
#8  8       0            40
#9  9       1            45

Or another option is case_when

df %>%
   mutate(hrs_workperwk = case_when(hrs_workperwk %in% c(-1, -2, 99)~ NA_integer_,
                      TRUE ~ hrs_workperwk))

Upvotes: 5

neilfws
neilfws

Reputation: 33782

You can use %in%:

want <- have %>% 
  mutate(hrs_workperwk = ifelse(hrs_workperwk %in% c(-1, -2, 99), NA, hrs_workperwk))

Upvotes: 3

Related Questions