Alex

Reputation: 2077

Split data.frame row into multiple rows based on commas

I am attempting to split a row in a data.frame based on the character sequence ", ". Here's an example:

mydat <- data.frame(v1 = c("name, name2", "name3", "name4, name5"),
                v2 = c("1, 2", "3", "4, 5"), 
                v3 = c(1, 2, 3))

What I would like to end up with is a data.frame like so:

v1     v2  v3
name   1   1
name2  2   1
name3  3   2
name4  4   3
name5  5   3

Any suggestions?

Upvotes: 4

Views: 1289

Answers (3)

David Arenburg

Reputation: 92282

Here's another way, using the data.table package and its new tstrsplit function:

library(data.table) # v >= 1.9.5
setDT(mydat)[, lapply(.SD, tstrsplit, ", "), by = v3]
#    v3    v1 v2
# 1:  1  name  1
# 2:  1 name2  2
# 3:  2 name3  3
# 4:  3 name4  4
# 5:  3 name5  5

Upvotes: 6

Abdou

Reputation: 13274

For posterity, users with an inclination for tidyverse packages can use tidyr's separate_rows function along with select from dplyr (to maintain the order of the columns) to get this done:

library(tidyverse)

mydat %>%
  separate_rows(v1, v2, sep = ", ") %>%
  select(v1, v2, v3)

#     v1 v2 v3
#1  name  1  1
#2 name2  2  1
#3 name3  3  2
#4 name4  4  3
#5 name5  5  3
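
For readers who would rather avoid extra packages, the same split-and-repeat idea can be sketched in base R. This is only an illustration, not part of the answer above: it assumes that v1 and v2 contain the same number of ", "-separated pieces in every row, and the helper names s1, s2 and out are made up for the example.

```r
# Same example data as in the question (strings kept as character)
mydat <- data.frame(v1 = c("name, name2", "name3", "name4, name5"),
                    v2 = c("1, 2", "3", "4, 5"),
                    v3 = c(1, 2, 3),
                    stringsAsFactors = FALSE)

# Split each delimited column into a list of character vectors
s1 <- strsplit(mydat$v1, ", ", fixed = TRUE)
s2 <- strsplit(mydat$v2, ", ", fixed = TRUE)

# Assumption: both columns split into the same number of pieces per row
stopifnot(identical(lengths(s1), lengths(s2)))

# Flatten the pieces and repeat v3 once per piece of its original row
out <- data.frame(v1 = unlist(s1),
                  v2 = unlist(s2),
                  v3 = rep(mydat$v3, lengths(s1)),
                  stringsAsFactors = FALSE)
out
```

Note that v2 stays a character column here; wrap it in as.numeric() if numbers are needed.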

Upvotes: 1

Justin Klevs

Reputation: 651

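This should work, using concat.split.multiple from the splitstackshape package to reshape the split values into long format: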

install.packages("splitstackshape")
library(splitstackshape)
out <- concat.split.multiple(mydat, c("v1","v2"), seps=",", "long")


out
#       v1 v2 v3
# 1:  name  1  1
# 2: name2  2  1
# 3: name3  3  2
# 4: name4  4  3
# 5: name5  5  3

Upvotes: 6
