Reputation: 5154
I have a data.frame with two columns of strings, as follows.
nos <- c("JM1", "JM2", "JM3", "JM1", "JM5", "JM45", "JM3", "JM45")
ren <- c("book, vend, spent", "marigold, fortune", "smoke, parchment, smell, book", "mental, past, create", "key, fortune, mask, federal", "tell, warn, slip", "wire, dg333, uv12", "tell, warn, slip, furniture")
d <- data.frame(nos, ren, stringsAsFactors=FALSE)
d
nos ren
1 JM1 book, vend, spent
2 JM2 marigold, fortune
3 JM3 smoke, parchment, smell, book
4 JM1 mental, past, create
5 JM5 key, fortune, mask, federal
6 JM45 tell, warn, slip
7 JM3 wire, dg333, uv12
8 JM45 tell, warn, slip, furniture
I want to concatenate the elements of the ren column according to the strings in the nos column.
For example, in the sample data, the elements associated with JM1, which occurs twice, should be merged ("book, vend, spent, mental, past, create").
Also, the elements associated with JM45 should be merged, keeping only unique words ("tell, warn, slip, furniture").
The output that I am trying to get is shown below.
nos1 <- c("JM1", "JM2", "JM3", "JM5", "JM45")
ren1 <- c("book, vend, spent, mental, past, create", "marigold, fortune", "smoke, parchment, smell, book, wire, dg333, uv12", "key, fortune, mask, federal", "tell, warn, slip, furniture")
out <- data.frame(nos1, ren1, stringsAsFactors=FALSE)
out
nos1 ren1
1 JM1 book, vend, spent, mental, past, create
2 JM2 marigold, fortune
3 JM3 smoke, parchment, smell, book, wire, dg333, uv12
4 JM5 key, fortune, mask, federal
5 JM45 tell, warn, slip, furniture
How can I do this in R? My original data set has thousands of such rows in a data.frame.
Upvotes: 3
Views: 2487
Reputation: 605
Using the plyr package, you could do it like this:
library(plyr)
ddply(d, .(nos), summarise, ren1=paste0(ren, collapse=", "))
or, if you want only unique values in ren1, like this:
ddply(d, .(nos), summarise,
      ren1=paste0(unique(unlist(strsplit(ren, split=", "))), collapse=", "))
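If you would rather not depend on plyr, the same split, de-duplicate, and paste steps can be sketched in base R. This is only a sketch under some assumptions: merge_unique is a hypothetical helper name, not something from the question, and the words are assumed to be separated by exactly ", ". Note also that, as far as I know, ddply returns the groups sorted by nos, so JM45 would come before JM5. Building the grouping factor from unique(d$nos) keeps the order of first appearance, as in the desired output.

```r
# Sample data from the question
nos <- c("JM1", "JM2", "JM3", "JM1", "JM5", "JM45", "JM3", "JM45")
ren <- c("book, vend, spent", "marigold, fortune",
         "smoke, parchment, smell, book", "mental, past, create",
         "key, fortune, mask, federal", "tell, warn, slip",
         "wire, dg333, uv12", "tell, warn, slip, furniture")
d <- data.frame(nos, ren, stringsAsFactors = FALSE)

# Hypothetical helper: split every string of a group into words,
# drop repeated words, and join the rest back into one string.
merge_unique <- function(x) {
  words <- unlist(strsplit(x, split = ", ", fixed = TRUE))
  paste(unique(words), collapse = ", ")
}

# Factor levels follow the order of first appearance in nos,
# so the result is not re-sorted alphabetically.
grp <- factor(d$nos, levels = unique(d$nos))

# Apply the helper to the ren values of each group.
merged <- tapply(d$ren, grp, merge_unique)

out <- data.frame(nos1 = names(merged),
                  ren1 = as.vector(merged),
                  stringsAsFactors = FALSE)
out
```

With thousands of rows this should still be quick, since the only work done per group is one strsplit and one unique.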
Upvotes: 3