Reputation: 555
I am trying to use the package hash in conjunction with dplyr to modify a column of a table.
Specifically, I have a hashed key-value dictionary whose keys are the column elements I want to replace and whose values are their replacements.
Below is a minimal reproducible example:
# Load packages.
pacman::p_load(dplyr, hash)
# Create tibble.
id <- c("0001", "0002", "0003", "0004", "0005", "0006")
colour <- c("blue", "green", "red", "purple", "purple", "pink")
tib <- as_tibble(cbind(id, colour))
# Create hashed dictionary.
k <- c("0005", "0006")
v <- c("0007", "0008")
dictionary <- hash(keys = k, values = v)
The following calls work as expected:
> id[1] %in% keys(dictionary)
# [1] FALSE
> values(dictionary, keys = "0005")[[1]]
# "0007"
However, when I try to incorporate them into a mutate call...
# Use dictionary to replace values.
tib %>%
  mutate(id = if_else(id %in% keys(dictionary),
                      values(dictionary, keys = id)[[1]],
                      id))
The following error is thrown:
Error in FUN(X[[i]], ...) : object '0001' not found
Is the condition being checked for every value in the id column at once, rather than for each element individually? If so, how do I get it to work as intended? If not, what exactly is going on?
Upvotes: 0
Views: 678
Reputation: 8318
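The problem is with if_else(): it evaluates both its true and false arguments for the whole id column, regardless of the condition. The lookup therefore also runs on ids that are not keys in the dictionary, and this raises the error: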
values(dictionary[id])
Error in get(k, x) : object '0001' not found
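To make the eager evaluation concrete, here is a minimal sketch. It assumes, as the error in the question suggests, that values() fails when it is given a key that is not in the dictionary. The shortened id vector and the names lookup_error, is_key and replacements are only for illustration. Note also that the [[1]] in the question keeps just the first looked-up value rather than one value per row.

```r
library(dplyr)
library(hash)

dictionary <- hash(keys = c("0005", "0006"), values = c("0007", "0008"))
id <- c("0001", "0005")

# The `true` argument of if_else() is computed for the whole vector first,
# so the lookup also receives "0001", which is not a key.
lookup_error <- tryCatch(
  values(dictionary, keys = id),
  error = function(e) conditionMessage(e)
)
print(lookup_error)

# Restricting the lookup to ids that really are keys avoids the error.
is_key <- id %in% keys(dictionary)
replacements <- values(dictionary, keys = id[is_key])
print(replacements)
```

The second half shows the general idea behind any fix: only ask the dictionary about ids it actually contains.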
I would suggest a different approach using lapply(), which gives the expected output:
tib$id = unlist(lapply(tib['id'], FUN = function(i) {if_else(i %in% keys(dictionary), values(dictionary)[i], i)}))
Result
> tib$id = unlist(lapply(tib['id'], FUN = function(i) {if_else(i %in% keys(dictionary), values(dictionary)[i], i)}))
> tib
# A tibble: 6 x 2
id colour
<chr> <chr>
1 0001 blue
2 0002 green
3 0003 red
4 0004 purple
5 0007 purple
6 0008 pink
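If you would rather stay inside mutate(), one possible alternative is to treat the dictionary as a named lookup vector and fall back to the original id wherever there is no match. This is only a sketch: it assumes values(dictionary) returns a character vector named by its keys (which the indexing above also relies on), and the name lookup is made up for the example.

```r
library(dplyr)
library(hash)

tib <- tibble(
  id = c("0001", "0002", "0003", "0004", "0005", "0006"),
  colour = c("blue", "green", "red", "purple", "purple", "pink")
)
dictionary <- hash(keys = c("0005", "0006"), values = c("0007", "0008"))

# Named character vector: names are the old ids, values are the replacements.
lookup <- values(dictionary)

# Indexing by name yields NA for ids that are not keys;
# coalesce() then falls back to the original id for those rows.
tib <- tib %>%
  mutate(id = coalesce(unname(lookup[id]), id))

print(tib)
```

Because the lookup never fails on unknown ids (it just returns NA), there is no need for if_else() at all.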
Upvotes: 2