Reputation: 752
A while ago I asked for help with the reverse of what I want to do now (that discussion can be found here). I now need to join my data back into its original format, that is, to join the separate rows (each containing one word) into one sentence per ID. For example:
Input:
id word
1 Lorem
1 ipsum
1 dolor
1 sit
1 amet
2 consectetur
2 adipiscing
2 elit
3 Donec
...
Output:
id text
1 Lorem ipsum dolor sit amet
2 consectetur adipiscing elit
3 Donec euismod enim quis
4 nunc fringilla sodales
5 Etiam tempor ligula vitae
6 pellentesque dictum
At first, I tried to do this with the reshape package and its melt() and cast() functions. I also tried the tidyr package. However, these functions rely on a variable-name column that specifies the column name for each of the new columns. That is not my case, and each sentence can be of a different length.
How can I do this task in R?
Upvotes: 1
Views: 628
Reputation: 886948
We can use data.table. We convert the 'data.frame' to a 'data.table' with setDT(df1) and then, grouped by 'id', paste the 'word' values together.
library(data.table)
setDT(df1)[, list(text= paste(word, collapse=' ')), by = id]
# id text
#1: 1 Lorem ipsum dolor sit amet
#2: 2 consectetur adipiscing elit
#3: 3 Donec
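The grouping step relies on paste() with the collapse argument, which is what turns several words into one string. A minimal base R sketch of that single step, using a hypothetical vector named words (not part of the original data):

```r
# Hypothetical example vector: the words belonging to one id
words <- c("Lorem", "ipsum", "dolor", "sit", "amet")

# collapse joins the elements of ONE vector into a single string
sentence <- paste(words, collapse = " ")

# sep only separates different arguments, so the vector stays at length 5
not_joined <- paste(words, sep = " ")

length(sentence)    # 1
length(not_joined)  # 5
```

Because collapse works on a vector of any length, sentences with different numbers of words need no special handling.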
Or using dplyr, we can similarly group by 'id' and paste the 'word' column.
library(dplyr)
df1 %>%
  group_by(id) %>%
  summarise(text = paste(word, collapse = ' '))
Or a base R option is
aggregate(word~id, df1, FUN = paste, collapse=' ')
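Note that aggregate() with a formula names the result column after the left-hand side, so the pasted text ends up in a column called 'word' rather than 'text'. A minimal sketch, assuming the same example data as df1 in this answer (rebuilt here so the snippet runs on its own), that renames the column to match the requested output:

```r
# Same example data as df1 in this answer, rebuilt so the snippet is self-contained
df1 <- data.frame(
  id   = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L),
  word = c("Lorem", "ipsum", "dolor", "sit", "amet",
           "consectetur", "adipiscing", "elit", "Donec"),
  stringsAsFactors = FALSE
)

# Paste the words of each id into one string, keeping row order within each id
res <- aggregate(word ~ id, data = df1, FUN = paste, collapse = " ")

# The pasted column is still called 'word'; rename it to 'text'
names(res)[names(res) == "word"] <- "text"
res
```

The extra collapse = " " argument is passed through aggregate() to paste(), so each group is reduced to exactly one string.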
Data:
df1 <- structure(list(id = c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L),
    word = c("Lorem", "ipsum", "dolor", "sit", "amet", "consectetur",
        "adipiscing", "elit", "Donec")),
    .Names = c("id", "word"), class = "data.frame",
    row.names = c(NA, -9L))
Upvotes: 2