SiH

Reputation: 1546

How to use mutate_if to change values

How can I use mutate_if() to change the values of b to NA where a > 25?

I can do it with ifelse(), but I feel mutate_if() was created for this kind of task.

library(tidyverse)

tbl <- tibble(a = c(10, 20, 30, 40, 10, 60),
              b = c(12, 23, 34, 45, 56, 67))

Upvotes: 6

Views: 8090

Answers (4)

Ronak Shah

Reputation: 388972

For completeness, here are base R and data.table variants.

tbl$b[tbl$a > 25] <- NA
tbl

#     a     b
#  <dbl> <dbl>
#1    10    12
#2    20    23
#3    30    NA
#4    40    NA
#5    10    56
#6    60    NA

In data.table -

library(data.table)
setDT(tbl)
tbl[a > 25, b := NA]
tbl

Upvotes: 1

akrun

Reputation: 887088

We can use replace

library(dplyr)
tbl %>%
      mutate(b = replace(b, a > 25, NA))

Output:

# A tibble: 6 x 2
      a     b
  <dbl> <dbl>
1    10    12
2    20    23
3    30    NA
4    40    NA
5    10    56
6    60    NA

Upvotes: 2

M Daaboul

Reputation: 224

The mutate_if() variant applies a predicate function (a function that returns TRUE or FALSE) to each column to determine which columns to transform. The predicate is therefore evaluated against whole columns, not against individual values. The .predicate argument can also be a logical vector with one entry per column, which is what the example below uses. A typical use is performing a mathematical operation on all numeric columns.

https://dplyr.tidyverse.org/reference/mutate_all.html

The signature is:

mutate_if(.tbl, .predicate, .funs, ...)
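
To illustrate the usual role of the predicate, here is a minimal sketch that doubles every numeric column. The extra label column and the name doubled are made up for this illustration and are not part of the question's data.

```r
library(dplyr)

# The question's data plus a hypothetical character column
tbl <- tibble(a = c(10, 20, 30, 40, 10, 60),
              b = c(12, 23, 34, 45, 56, 67),
              label = c("u", "v", "w", "x", "y", "z"))

# is.numeric is evaluated once per column: a and b are selected,
# label is not, so it passes through unchanged
doubled <- tbl %>%
  mutate_if(is.numeric, ~ . * 2)
```

Note that the predicate only decides which columns are transformed. It does not look at individual values, which is why the question's row-wise condition has to go inside the function instead.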

library(dplyr)

# The code below gets the job done, but as hugh-allan explained, it is
# probably not the right approach

tbl %>%
  mutate_if(colnames(tbl) != 'a', ~ifelse(a > 25, NA, .))

# A tibble: 6 x 2
      a     b
  <dbl> <dbl>
1    10    12
2    20    23
3    30    NA
4    40    NA
5    10    56
6    60    NA

Upvotes: 4

hugh-allan

Reputation: 1370

In this small example, I'm not sure that you actually need mutate_if(). The _if part of mutate_if() decides which columns to operate on; it is not an if condition applied to individual values.

Rather, you can use mutate_at() to select your columns to operate on - either based on their exact name or by using vars(contains('your_string')).

See the help page for more info on the mutate_* functions: https://dplyr.tidyverse.org/reference/mutate_all.html

Here are 3 options, using mutate() and mutate_at():

# using mutate()
tbl %>% 
  mutate(
    b = ifelse(a > 25, NA, b)
  )

# mutate_at - we select only column 'b'
tbl %>% 
  mutate_at(vars(c('b')), ~ifelse(a > 25, NA, .))

# select only columns with 'b' in the col name
tbl %>% 
  mutate_at(vars(contains('b')), ~ifelse(a > 25, NA, .))

Which all produce the same output:

# A tibble: 6 x 2
      a     b
  <dbl> <dbl>
1    10    12
2    20    23
3    30    NA
4    40    NA
5    10    56
6    60    NA

I know it's not mutate_if, but I suspect you don't actually need it.
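
As a side note, in dplyr 1.0.0 and later the scoped verbs (mutate_if(), mutate_at(), mutate_all()) are superseded by across(). A minimal sketch of the same replacement with across(), assuming a recent dplyr; the name res is only for illustration:

```r
library(dplyr)

tbl <- tibble(a = c(10, 20, 30, 40, 10, 60),
              b = c(12, 23, 34, 45, 56, 67))

# across() names the column(s) to change; inside the lambda, .x is the
# current column and a can still be referenced directly
res <- tbl %>%
  mutate(across(b, ~ ifelse(a > 25, NA, .x)))
```

Here across(b, ...) plays the role of mutate_at(vars(b), ...), while across(where(is.numeric), ...) would play the role of mutate_if(is.numeric, ...).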

Upvotes: 6
