Given a small data set as follows:
df <- structure(list(date = c("2021-09", "2021-10", "2021-11", "2021-12",
"2021-06", "2021-10"), act_direction = c("decrease", "increase",
NA, NA, "unchanged", "unchanged"), pred_direction = c(NA, "decrease",
NA, NA, "decrease", "increase"), direction_acc = c("true", "-",
"-", "true", "false", "false")), class = "data.frame", row.names = c(NA,
-6L))
df:
     date act_direction pred_direction direction_acc
1 2021-09      decrease           <NA>          true
2 2021-10      increase       decrease             -
3 2021-11          <NA>           <NA>             -
4 2021-12          <NA>           <NA>          true
5 2021-06     unchanged       decrease         false
6 2021-10     unchanged       increase         false
I am trying to update direction_acc based on the act_direction column. More specifically, if act_direction == "unchanged", I want to set direction_acc to "true", regardless of its original value.
The expected result:
     date act_direction pred_direction direction_acc
1 2021-09      decrease           <NA>          true
2 2021-10      increase       decrease             -
3 2021-11          <NA>           <NA>             -
4 2021-12          <NA>           <NA>          true
5 2021-06     unchanged       decrease          true
6 2021-10     unchanged       increase          true
I used the code below, which runs without error but returns an unexpected result: some of the other values in direction_acc were changed to NA.
library(dplyr)
df %>%
  # mutate_all(na_if, '') %>% # I will replace empty cell '' to NA as example data
  mutate(direction_acc = ifelse(act_direction == 'unchanged',
                                'true',
                                as.character(direction_acc)))
Result:
     date act_direction pred_direction direction_acc
1 2021-09      decrease           <NA>          true
2 2021-10      increase       decrease             -
3 2021-11          <NA>           <NA>          <NA>
4 2021-12          <NA>           <NA>          <NA>
5 2021-06     unchanged       decrease          true
6 2021-10     unchanged       increase          true
So my question is: why does direction_acc also become NA in the rows where act_direction is NA, and how can I get the expected result?
Upvotes: 0
Views: 106
Reputation: 160637
# Base R: overwrite direction_acc only in the rows where act_direction is "unchanged"
df$direction_acc[df$act_direction == "unchanged"] <- "true"
df
#      date act_direction pred_direction direction_acc
# 1 2021-09                                       true
# 2 2021-10                     decrease             -
# 3 2021-11                                          -
# 4 2021-12                                       true
# 5 2021-06     unchanged       decrease          true
# 6 2021-10     unchanged       increase          true
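As a side note, the base R assignment should also cope with the NA version of the data from the question. In a subscripted assignment, an NA in a logical index selects nothing, and R permits this when the replacement value has length one. A minimal sketch, using a cut-down copy of the data (`df_na` and `idx` are illustrative names, not from the original post):

```r
# Hypothetical cut-down copy of the question's data, with NA in act_direction
df_na <- data.frame(
  act_direction = c("decrease", NA, "unchanged"),
  direction_acc = c("true", "-", "false"),
  stringsAsFactors = FALSE
)

# The comparison yields NA for the row where act_direction is missing
idx <- df_na$act_direction == "unchanged"

# An NA subscript selects nothing when the replacement has length one,
# so only the "unchanged" row is overwritten
df_na$direction_acc[idx] <- "true"
df_na$direction_acc
# [1] "true" "-"    "true"
```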
library(dplyr)
df %>%
  mutate(
    direction_acc = if_else(act_direction == "unchanged", "true", direction_acc)
  )
If I add the NAs as you have (I'll use across, since scoped verbs such as mutate_all are superseded), then we can change == to %in% for the desired effect.
df %>%
  mutate(across(where(is.character), ~ na_if(., ""))) %>%
  mutate(
    direction_acc = if_else(act_direction %in% "unchanged", "true", direction_acc)
  )
#      date act_direction pred_direction direction_acc
# 1 2021-09          <NA>           <NA>          true
# 2 2021-10          <NA>       decrease             -
# 3 2021-11          <NA>           <NA>             -
# 4 2021-12          <NA>           <NA>          true
# 5 2021-06     unchanged       decrease          true
# 6 2021-10     unchanged       increase          true
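As for why the NAs appeared in the first place: any comparison with NA using == is itself NA, and ifelse() returns NA wherever its test is NA, regardless of the "no" value. %in% always returns TRUE or FALSE. A small sketch with a made-up vector (`x` is a hypothetical stand-in for act_direction, and "keep" stands in for the original value):

```r
x <- c("decrease", NA, "unchanged")  # illustrative stand-in for act_direction

# == propagates the missing value
x == "unchanged"
# [1] FALSE    NA  TRUE

# %in% never returns NA
x %in% "unchanged"
# [1] FALSE FALSE  TRUE

# ifelse() returns NA wherever the test is NA
ifelse(x == "unchanged", "true", "keep")
# [1] "keep" NA     "true"

# With %in% the NA row falls through to the 'no' branch instead
ifelse(x %in% "unchanged", "true", "keep")
# [1] "keep" "keep" "true"
```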
Upvotes: 1