ronzenith

Reputation: 383

Why use two single square brackets side by side in R?

I am trying to learn data-cleaning with simple code.

My central question is: what is the use of two single square brackets side by side?

Here is df as an example.

df <- data.frame(x = c(1:3, NA, NA), y = c(6:9, NA))

The following code is one of the many ways to replace NAs with, say, 99. And I think it's quite simple.

messy <- function(df, impute){
  for (i in 1:nrow(df)) {
    df[i, ][is.na(df[i, ])] <- impute
  }
  return(df)
}
clean <- messy(df, 99)
clean

  1. Why do I need two single square brackets to locate the NAs?
  2. Why isn't it possible to simplify the code to is.na(df[i, ]) <- impute ?
  3. Are there more efficient ways to replace NAs, such as using the apply family?

Many thanks for answering.

Upvotes: 1

Views: 145

Answers (2)

jpdugo17

Reputation: 7116

Here are three more ways to replace NAs, using a tidyverse approach:

library(tidyverse)

df <- data.frame(x = c(1:3, NA, NA), y = c(6:9, NA))

#purrr 
map_df(df, ~replace_na(.x, 99))
#> # A tibble: 5 x 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     6
#> 2     2     7
#> 3     3     8
#> 4    99     9
#> 5    99    99

#transmute/across
df %>% transmute(across(everything(), ~replace_na(.x, 99)))
#>    x  y
#> 1  1  6
#> 2  2  7
#> 3  3  8
#> 4 99  9
#> 5 99 99

#transmute_if
df %>% transmute_if(is.numeric, ~replace_na(.x, 99))
#>    x  y
#> 1  1  6
#> 2  2  7
#> 3  3  8
#> 4 99  9
#> 5 99 99

Created on 2021-06-14 by the reprex package (v2.0.0)

Upvotes: 0

Ronak Shah

Reputation: 389175

That is a needlessly complex way of replacing NAs. You can reduce the function to:

messy <- function(df, impute){
  df[is.na(df)] <- impute
  df
}

clean <- messy(df, 99)
clean

#   x  y
#1  1  6
#2  2  7
#3  3  8
#4 99  9
#5 99 99
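
For what it's worth, here is a rough sketch of why the single assignment works and what the two brackets in the question's loop were doing. is.na(df) returns a logical matrix with the same shape as df, and a single [ accepts that matrix as an index. In the loop version, the first bracket (df[i, ]) selects row i, and the second bracket indexes that row by its own NA mask. The names mask and row4 are only illustrative.

```r
df <- data.frame(x = c(1:3, NA, NA), y = c(6:9, NA))

# is.na() on a data frame returns a logical matrix of the same dimensions
mask <- is.na(df)
dim(mask)

# The loop body from the question, unrolled for row 4:
row4 <- df[4, ]            # first bracket: select one row (a 1-row data frame)
row4[is.na(row4)] <- 99    # second bracket: index that row by its NA mask
row4

# Note: is.na(x) <- value is a different replacement function. It marks the
# elements indexed by value as NA; it does not fill existing NAs with value.

# Indexing the whole data frame by the mask handles every row at once
df[mask] <- 99
df
```

So the second bracket is not special syntax, just ordinary indexing applied to the result of the first one.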

You can use the apply family of functions as well, but they are not needed here because is.na works on data frames directly.
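
If you do want an apply-family version anyway, one possible sketch uses lapply over the columns. The helper name fill_na is made up for illustration and is not part of base R; assigning into clean[] keeps the result a data frame rather than a plain list.

```r
df <- data.frame(x = c(1:3, NA, NA), y = c(6:9, NA))

# fill_na is a hypothetical helper: replace the NAs in one column
fill_na <- function(col, impute) {
  col[is.na(col)] <- impute
  col
}

# clean[] <- ... assigns column by column and preserves the data frame structure
clean <- df
clean[] <- lapply(df, fill_na, impute = 99)
clean
```

This gives the same result as df[is.na(df)] <- 99, only with more code, which is why the direct form is usually preferable.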

Upvotes: 2
