Reputation: 31
My dataframe has few values as 9.969210e+36, I want to replace them with NA. It looks like
# A tibble: 1,308 x 3
IMD CRU dts
<dbl> <dbl> <date>
1 9.97e36 9.97e36 1901-01-01
2 9.97e36 9.97e36 1902-01-01
3 9.97e36 9.97e36 1903-01-01
4 9.97e36 9.97e36 1904-01-01
dput(head(df))
structure(list(IMD = c(9.96920996838687e+36, 9.96920996838687e+36,
9.96920996838687e+36, 9.96920996838687e+36, 9.96920996838687e+36,
9.96920996838687e+36), CRU = c(9.96920996838687e+36, 9.96920996838687e+36,
9.96920996838687e+36, 9.96920996838687e+36, 9.96920996838687e+36,
9.96920996838687e+36), dts = structure(c(-25202, -24837, -24472,
-24107, -23741, -23376), class = "Date")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -6L))
I followed R - Replace specific value contents with NA as
df %>% mutate_at(vars(IMD, CRU), na_if, 9.969210e+36)
df %>% na_if(x=as.vector(df$IMD),y=9.97e36)
None of the above shows NA values and returns same old dataframe. Any help is appreciated
Upvotes: 0
Views: 2319
Reputation: 5766
na_if
works on vectors, not data.frame, thus your first attempt using mutate
would be most correct. Furthermore, it compares exact values to replace with NA
.
However, your very large values are only displayed with 15 digits; I suspect there are a lot, lot more. Therefore, no values are matched exactly to your conditional (y
). This is a common problem when trying to exactly compare to real values.
Also note that you are trying to compare the two values. Which is largest?
9.969210e+36
9.96920996838687e+36
You can do it quickly by:
df %>%> mutate(
IMD=ifelse(IMD > 9e36, NA, IMD),
CRU=ifelse(CRU > 9e36, NA, CRU)
)
or create a function as,
na_when_larger <- function(x, y) {
x[x > y] <- NA
x
}
df %>% mutate_at(vars(IMD, CRU), na_when_larger, 9.96e+36)
(try typing na_if
into the console without parenthesis).
Upvotes: 1