Reputation: 301
I have a data frame (x) with a factor variable whose values are separated by commas. I have another data frame (y) with descriptions for the same values. I want to replace the values in data frame (x) with the descriptions from data frame (y). Any help would be highly appreciated.
For example, the two data frames look like this:
data frame (x)
s.no x
1 2,5,45
2 35,5
3 45
data frame (y)
s.no x description
1 2 a
2 5 b
3 45 c
4 35 d
I need the output as below
s.no x
1 a,b,c
2 d,b
3 c
Upvotes: 2
Views: 170
Reputation: 31161
With splitstackshape:
library(splitstackshape)
cSplit(x, 'x', ',', 'long')[setDT(y), on='x'][,.(x=paste(description, collapse=',')), s.no]
# s.no x
#1: 1 a,b,c
#2: 2 b,d
#3: 3 c
Upvotes: 5
Reputation: 4187
A solution using dplyr and tidyr:
library(dplyr)
library(tidyr)
x %>%
  separate(x, paste0('x', 1:3), ',', convert = TRUE) %>%
  gather(var, x, -1, na.rm = TRUE) %>%
  left_join(., y, by = 'x') %>%
  group_by(s.no = s.no.x) %>%
  summarise(x = paste(description, collapse = ','))
The result:
s.no x
(int) (chr)
1 1 a,b,c
2 2 d,b
3 3 c
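The separate() call above assumes at most three values per row. A variation that avoids fixing the number of columns is sketched below using tidyr's separate_rows(), which is a different technique from the one in this answer. The sample data is reconstructed from the question, and the name res is only illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data reconstructed from the question (assumed column types)
x <- data.frame(s.no = 1:3,
                x = c("2,5,45", "35,5", "45"),
                stringsAsFactors = FALSE)
y <- data.frame(s.no = 1:4,
                x = c(2L, 5L, 45L, 35L),
                description = c("a", "b", "c", "d"),
                stringsAsFactors = FALSE)

res <- x %>%
  # one row per comma-separated value; convert = TRUE turns the codes into integers
  separate_rows(x, sep = ",", convert = TRUE) %>%
  # keep only the lookup columns of y so that s.no is not duplicated by the join
  left_join(select(y, x, description), by = "x") %>%
  group_by(s.no) %>%
  summarise(x = paste(description, collapse = ","))

res
```

Because each value becomes its own row, this should work for any number of comma-separated values per entry.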
Upvotes: 4
Reputation: 886938
We can split the 'x' column in the 'x' dataset by ',', loop over the resulting list, match each value against the 'x' column in 'y' to get the numeric index, get the corresponding 'description' values from 'y', and paste them together.
x$x <- sapply(strsplit(x$x, ","), function(z)
toString(y$description[match(as.numeric(z), y$x)]))
x
# s.no x
#1 1 a, b, c
#2 2 d, b
#3 3 c
NOTE: If the 'x' column in 'x' is of factor class, use strsplit(as.character(x$x), ",")
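For a self-contained check, here is a minimal sketch of the same split/match/paste idea with the factor case handled. The sample data is reconstructed from the question, and the intermediate name parts is only illustrative. It uses paste(collapse = ",") rather than toString() so that the result has no space after each comma, matching the output requested in the question.

```r
# Sample data reconstructed from the question; x$x is deliberately a factor
x <- data.frame(s.no = 1:3,
                x = factor(c("2,5,45", "35,5", "45")))
y <- data.frame(s.no = 1:4,
                x = c(2, 5, 45, 35),
                description = c("a", "b", "c", "d"),
                stringsAsFactors = FALSE)

# Convert the factor to character first, then split each entry on commas
parts <- strsplit(as.character(x$x), ",")

# For each vector of codes, find its positions in y$x and
# collapse the matching descriptions into one comma-separated string
x$x <- vapply(parts, function(z) {
  paste(y$description[match(as.numeric(z), y$x)], collapse = ",")
}, character(1))

x
```

Codes that have no match in y$x would appear as "NA" in the pasted string, so it may be worth checking for unmatched values first.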
Upvotes: 3