user882670

Replace NA with unknown

I'm trying to replace the NAs in column GENDER_M of the objeto1 data frame.

None of the following works:

replace_na(objeto1$GENDER_M, "unknown")

mutate(GENDER_M = replace_na(GENDER_M, "unknown"))

mutate(objeto1, GENDER_M = ifelse(is.na(GENDER_M), "unknown", GENDER_M))

replace(is.na(GENDER_M), "unknown")

Yes, I've read this page and a dozen others.

Can anyone help?

Thanks!

Upvotes: 0

Views: 5057

Answers (1)

divibisan

Reputation: 12155

All the tidyverse functions return the modified data frame; they don't modify it in place, so you need to assign the result back to a variable. If we make an example data frame:

df <- structure(list(mpg = c(21, 21, 22.8, 21.4, NA, NA), cyl = c(6, 
6, 4, 6, 8, 6)), class = "data.frame", row.names = c(NA, -6L))

   mpg cyl
1 21.0   6
2 21.0   6
3 22.8   4
4 21.4   6
5   NA   8
6   NA   6

We can replace NA in several ways:

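library(dplyr)
library(tidyr)

# newer tidyr versions keep the column type, so convert to character first
df <- df %>%
    mutate(mpg = as.character(mpg)) %>%
    replace_na(list(mpg = 'unknown'))

df <- df %>%
    mutate(mpg = ifelse(is.na(mpg), 'unknown', mpg))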

Both return the same thing:

df
      mpg cyl
1      21   6
2      21   6
3    22.8   4
4    21.4   6
5 unknown   8
6 unknown   6
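
Applied to the names in the question, the same pattern might look like the sketch below. The contents of objeto1 are not shown in the question, so the data frame here is a made-up stand-in (the ID column and the "yes"/"no" values are assumptions), and GENDER_M is assumed to be a character column.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the asker's data; the real objeto1 is not shown.
objeto1 <- data.frame(
  ID = 1:4,
  GENDER_M = c("yes", NA, "no", NA),
  stringsAsFactors = FALSE
)

# Assign the result back; without `objeto1 <-` the change is only printed.
objeto1 <- objeto1 %>%
  mutate(GENDER_M = replace_na(GENDER_M, "unknown"))

objeto1$GENDER_M
```

If GENDER_M turns out to be a factor rather than character, it probably needs as.character() first, since "unknown" is not one of its existing levels.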

You could also use base R:

df[is.na(df)] <- 'unknown'

Beware that there is a risk to this: each variable in a data frame can have only one type (i.e., numeric, logical, character). Adding character values to a numeric variable converts the whole variable to character, which may cause problems when you try to do numeric calculations later. This is why the special value NA is strongly preferred over other values for marking missing data.
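
A small base R sketch of that trade-off, using the same example data. The names mean_with_na, df_chr and mean_after are only illustrative.

```r
# Same example data as above.
df <- data.frame(mpg = c(21, 21, 22.8, 21.4, NA, NA),
                 cyl = c(6, 6, 4, 6, 8, 6))

# Keeping NA: the column stays numeric and summaries can skip missing values.
mean_with_na <- mean(df$mpg, na.rm = TRUE)

# Replacing NA in one column only; this coerces mpg to character.
df_chr <- df
df_chr$mpg[is.na(df_chr$mpg)] <- "unknown"
class(df_chr$mpg)

# Numeric summaries no longer work: mean() warns and returns NA.
mean_after <- suppressWarnings(mean(df_chr$mpg))
```

Indexing a single column, as in df_chr$mpg[...], limits the replacement to that column, whereas df[is.na(df)] touches every column that contains NA.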

Upvotes: 4
