Reputation: 383
I am trying to learn data-cleaning with simple code.
My central question is: what is the use of two single square brackets side by side?
Here is df as an example:
df <- data.frame(x = c(1:3, NA, NA), y = c(6:9, NA))
The following code is one of the many ways to replace NAs with, say, 99. And I think it's quite simple.
messy <- function(df, impute){
  for (i in 1:nrow(df)) {
    df[i, ][is.na(df[i, ])] <- impute
  }
  return(df)
}
clean <- messy(df, 99)
clean
Why can't the assignment simply be written as follows?
is.na(df[i, ]) <- impute
Many thanks for answering.
Upvotes: 1
Views: 145
Reputation: 7116
Here are three more ways to replace NAs, using a tidyverse approach:
library(tidyverse)
df <- data.frame(x = c(1:3, NA, NA), y = c(6:9, NA))
# purrr
map_df(df, ~replace_na(.x, 99))
#> # A tibble: 5 x 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     6
#> 2     2     7
#> 3     3     8
#> 4    99     9
#> 5    99    99
# transmute/across
df %>% transmute(across(everything(), ~replace_na(.x, 99)))
#>    x  y
#> 1  1  6
#> 2  2  7
#> 3  3  8
#> 4 99  9
#> 5 99 99
# transmute_if
df %>% transmute_if(is.numeric, ~replace_na(.x, 99))
#>    x  y
#> 1  1  6
#> 2  2  7
#> 3  3  8
#> 4 99  9
#> 5 99 99
Created on 2021-06-14 by the reprex package (v2.0.0)
Upvotes: 0
Reputation: 389175
That is a very complex way of replacing NAs. You can reduce the function to:
messy <- function(df, impute){
  df[is.na(df)] <- impute
  df
}
clean <- messy(df, 99)
clean
#   x  y
#1  1  6
#2  2  7
#3  3  8
#4 99  9
#5 99 99
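To see why the one-liner works, and how it relates to the two side-by-side brackets in the question, here is a minimal base R sketch. The names mask, filled, row4 and v are made up for illustration. It also shows, as a contrast, that the replacement form of is.na sets values to NA rather than replacing them.

```r
# Same example data as in the question.
df <- data.frame(x = c(1:3, NA, NA), y = c(6:9, NA))

# is.na() on a data frame returns a logical matrix of the same shape.
mask <- is.na(df)
mask

# Indexing with that matrix selects exactly the NA cells,
# so a single assignment fills all of them.
filled <- df
filled[mask] <- 99
filled

# The row-wise form from the question does the same thing one row at a time:
# the first bracket extracts a one-row data frame, and the second bracket
# selects the NA cells inside it.
row4 <- df[4, ]
row4[is.na(row4)] <- 99
row4

# By contrast, `is.na(x) <- idx` goes the other way: it marks the
# positions given on the right-hand side as NA.
v <- c(10, 20, 30)
is.na(v) <- 2
v
```

So the first bracket says where to look, and the second says which cells within that selection to overwrite.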
You can use the apply family of functions as well, but they are not needed here since is.na works on data frames directly.
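For completeness, a column-wise version using lapply might look like the sketch below. The function name fill_na_by_column is hypothetical, and this is only one possible way to write it.

```r
# Same example data as in the question.
df <- data.frame(x = c(1:3, NA, NA), y = c(6:9, NA))

# Hypothetical helper: replaces NAs one column at a time.
fill_na_by_column <- function(df, impute) {
  # Assigning into df[] keeps the data.frame class and the column names.
  df[] <- lapply(df, function(col) {
    col[is.na(col)] <- impute
    col
  })
  df
}

res <- fill_na_by_column(df, 99)
res
```

This gives the same result as df[is.na(df)] <- impute, with more code, which is why the direct form is usually preferable.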
Upvotes: 2