Reputation: 4784
I have been using the dplyr::recode() function to recode some variables. I have one character variable with some empty strings that I would also like to recode. However, if I refer to the empty string in the arguments to the function, I get an error.
# input
x <- c("a", "b", "", "x", "y", "z")
# desired output
c("Apple", "Banana", "Missing", "x", "y", "z")
dplyr::recode(x, "a"="Apple", "b"="Banana", ""="Missing")
Error: attempt to use zero-length variable name
If I treat the empty string as a missing value, the function leaves it as an empty string.
dplyr::recode(x, "a"="Apple", "b"="Banana", .missing="Missing")
[1] "Apple" "Banana" "" "x" "y" "z"
How can I recode the values to get the desired output?
Upvotes: 2
Views: 1840
Reputation: 16277
You can use na_if to convert the empty strings to NA first, so that .missing works properly:
x <- c("a", "b", "", "x", "y", "z")
dplyr::recode(dplyr::na_if(x, ""), "a"="Apple", "b"="Banana", .missing="Missing")
[1] "Apple" "Banana" "Missing" "x" "y" "z"
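To see why this works, it may help to look at the intermediate step: .missing is only applied to NA values, so an empty string is never touched by it until it has been converted. A minimal sketch, assuming dplyr is installed; the names x_na and result are just illustrative:

```r
library(dplyr)

x <- c("a", "b", "", "x", "y", "z")

# Step 1: turn empty strings into NA; every other value is left alone.
x_na <- na_if(x, "")
is.na(x_na)
# [1] FALSE FALSE  TRUE FALSE FALSE FALSE

# Step 2: .missing now has an NA to replace. Unmatched character
# values ("x", "y", "z") are kept as they are.
result <- recode(x_na, "a" = "Apple", "b" = "Banana", .missing = "Missing")
result
# [1] "Apple"   "Banana"  "Missing" "x"       "y"       "z"
```

Note that any genuine NA values already present in x would also be recoded to "Missing" with this approach.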
Upvotes: 7
Reputation: 534
In these cases, I use ifelse. For your example, that would be:
x <- ifelse(x == "", "Missing", x)
In a data.frame context, you can use it inside mutate:
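library(dplyr)  # provides %>% and mutate()
# stringsAsFactors = FALSE keeps col1 as character on R versions before 4.0
df_x <- data.frame(col1 = c("a", "b", "", "x", "y", "z"), stringsAsFactors = FALSE)
df_new <- df_x %>%
  mutate(col1 = ifelse(col1 == "", "Missing", col1))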
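The ifelse call above only handles the empty string. If you want all three recodes from the question in one step without dplyr, one possible sketch is a lookup with match(), which (unlike named arguments or name-based indexing) can match the empty string. The names old, new, idx and recoded are hypothetical, chosen for illustration:

```r
x <- c("a", "b", "", "x", "y", "z")

# Parallel vectors: old[i] is replaced by new[i].
old <- c("a", "b", "")
new <- c("Apple", "Banana", "Missing")

# match() returns the position of each value of x in old,
# or NA when the value has no replacement.
idx <- match(x, old)

# Keep the original value where there is no match.
recoded <- ifelse(is.na(idx), x, new[idx])
recoded
# [1] "Apple"   "Banana"  "Missing" "x"       "y"       "z"
```

Keeping the old and new values in two plain vectors avoids having to use "" as a name, which is what triggers the error in the question.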
Upvotes: 0
Reputation: 38510
Why not use base R's factor?
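# levels = x works here only because x has no repeated values; factor() rejects duplicated levels
myFac <- factor(x, levels = x, labels = c("Apple", "Banana", "Missing", "x", "y", "z"))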
myFac
[1] Apple Banana Missing x y z
Levels: Apple Banana Missing x y z
If desired, you can convert this to a character vector:
as.character(myFac)
[1] "Apple" "Banana" "Missing" "x" "y" "z"
Upvotes: 2