user7542670

Using replace_na() with an indeterminate number of columns

My data frame looks like this:

df <- tibble(x = c(1, 2, NA),
             y = c(1, NA, 3),
             z = c(NA, 2, 3))

I want to replace NA with 0 using tidyr::replace_na(). As this function's documentation makes clear, it's straightforward to do this once you know which columns you want to perform the operation on.

df <- df %>% replace_na(list(x = 0, y = 0, z = 0))

But what if you have an indeterminate number of columns? (I say 'indeterminate' because I'm trying to create a function that does this on the fly using dplyr tools.) If I'm not mistaken, the base R equivalent to what I'm trying to achieve using the aforementioned tools is:

df[, 1:ncol(df)][is.na(df[, 1:ncol(df)])] <- 0

But I always struggle to get my head around this code. Thanks in advance for your help.

Upvotes: 2

Views: 2013

Answers (1)

akrun

Reputation: 886938

We can do this by creating a list of 0s, one per column of the dataset, and naming its elements with the column names:

library(tidyverse)
df %>% 
   replace_na(set_names(as.list(rep(0, length(.))), names(.)))
# A tibble: 3 x 3
#      x     y     z
#   <dbl> <dbl> <dbl>
#1     1     1     0
#2     2     0     2
#3     0     3     3
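
To see what the nested call builds, it can help to construct the replacement list on its own first. This is only a sketch, assuming the same df as in the question; `zeros` and `out` are illustrative names that do not appear in the original answer, and base `setNames()` stands in for `set_names()`.

```r
library(tidyr)   # replace_na()
library(tibble)  # tibble()

df <- tibble(x = c(1, 2, NA),
             y = c(1, NA, 3),
             z = c(NA, 2, 3))

# One 0 per column, named after the columns: list(x = 0, y = 0, z = 0)
zeros <- setNames(as.list(rep(0, ncol(df))), names(df))

# Same call shape as in the documentation, but the list is built on the fly
out <- replace_na(df, zeros)
```

Because the list is derived from names(df), the call does not need to know the column names or their number in advance.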

Another option is mutate_all() (or mutate_at() for selected columns, or mutate_if() for columns chosen by a condition), applying replace_na() to each column:

df %>%
    mutate_all(replace_na, replace = 0)
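
In dplyr 1.0.0 and later, the scoped verbs (mutate_all(), mutate_at(), mutate_if()) are superseded by across(). Since the question mentions wrapping this in a function, here is a rough sketch of how that might look. `replace_na_numeric` is a hypothetical helper name, and restricting it to numeric columns is an assumption made here so that a 0 is never pushed into a character or factor column.

```r
library(dplyr)   # mutate(), across(), %>%
library(tidyr)   # replace_na()

df <- tibble::tibble(x = c(1, 2, NA),
                     y = c(1, NA, 3),
                     z = c(NA, 2, 3))

# Hypothetical helper: replace NA with `value` in every numeric column,
# however many columns the data frame happens to have
replace_na_numeric <- function(data, value = 0) {
  data %>%
    mutate(across(where(is.numeric), ~ replace_na(.x, value)))
}

out <- replace_na_numeric(df)
```

Swapping where(is.numeric) for everything() would reproduce the mutate_all() behaviour exactly.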

With base R, it is more straightforward:

df[is.na(df)] <- 0
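
Since the question mentions that the base R version is hard to follow, splitting it into two steps may help. This is a sketch using a plain data.frame with the same values; `na_mask` is an illustrative name. The `df[, 1:ncol(df)]` in the question selects every column, so there it is equivalent to df itself.

```r
df <- data.frame(x = c(1, 2, NA),
                 y = c(1, NA, 3),
                 z = c(NA, 2, 3))

# is.na() on a data frame returns a logical matrix of the same shape,
# TRUE wherever a value is missing
na_mask <- is.na(df)

# Indexing with that matrix targets only the TRUE cells,
# so the assignment overwrites just the missing values
df[na_mask] <- 0
```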

Upvotes: 11
