Reputation: 2077
I am attempting to split the rows of a data.frame on the character sequence ", ". Here's an example:
mydat <- data.frame(v1 = c("name, name2", "name3", "name4, name5"),
                    v2 = c("1, 2", "3", "4, 5"),
                    v3 = c(1, 2, 3))
What I would like to end up with is a data.frame like so:
v1    v2 v3
name  1  1
name2 2  1
name3 3  2
name4 4  3
name5 5  3
Any suggestions?
Upvotes: 4
Views: 1289
Reputation: 92282
Here's another way, using the data.table package and its new tstrsplit function:
library(data.table) # v >= 1.9.5
setDT(mydat)[, lapply(.SD, tstrsplit, ", "), by = v3]
#    v3    v1 v2
# 1:  1  name  1
# 2:  1 name2  2
# 3:  2 name3  3
# 4:  3 name4  4
# 5:  3 name5  5
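For intuition about what the grouped split is doing, here is a rough base R sketch of the same idea. It is not how data.table implements it: each cell is split with strsplit, and the remaining columns are repeated once per piece. The helper name split_rows is hypothetical, and the sketch assumes every split column yields the same number of pieces in a given row.

```r
# Example data from the question; stringsAsFactors = FALSE keeps v1 and v2
# as character on R versions before 4.0
mydat <- data.frame(v1 = c("name, name2", "name3", "name4, name5"),
                    v2 = c("1, 2", "3", "4, 5"),
                    v3 = c(1, 2, 3),
                    stringsAsFactors = FALSE)

# Hypothetical helper: split the columns in `cols` on `sep` and repeat the
# other columns once per resulting piece
split_rows <- function(df, cols, sep = ", ") {
  pieces <- lapply(df[cols], strsplit, split = sep, fixed = TRUE)
  n <- lengths(pieces[[1]])  # number of pieces per original row
  # Assumption: all split columns produce the same piece counts row by row
  stopifnot(all(vapply(pieces, function(p) identical(lengths(p), n), logical(1))))
  out <- df[rep(seq_len(nrow(df)), n), , drop = FALSE]
  out[cols] <- lapply(pieces, unlist)
  rownames(out) <- NULL
  out
}

res <- split_rows(mydat, c("v1", "v2"))
res
```

Note that v2 stays character in this sketch; wrap it in as.numeric() if numbers are needed.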
Upvotes: 6
Reputation: 13274
For posterity, users with an inclination for tidyverse packages can use tidyr's separate_rows function along with select from dplyr (to maintain the order of the columns) to get this done:
library(tidyverse)
mydat %>%
  separate_rows(v1, v2, sep = ", ") %>%
  select(v1, v2, v3)
#      v1 v2 v3
# 1  name  1  1
# 2 name2  2  1
# 3 name3  3  2
# 4 name4  4  3
# 5 name5  5  3
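As a hedged alternative, newer tidyr releases also provide separate_longer_delim, which splits several columns in parallel on a fixed delimiter. The sketch below assumes tidyr >= 1.3.0 is installed; the data is the example from the question.

```r
library(tidyr)  # assumption: tidyr >= 1.3.0, which provides separate_longer_delim()

# Example data from the question
mydat <- data.frame(v1 = c("name, name2", "name3", "name4, name5"),
                    v2 = c("1, 2", "3", "4, 5"),
                    v3 = c(1, 2, 3),
                    stringsAsFactors = FALSE)

# Split v1 and v2 on ", " together, producing one output row per piece
out <- separate_longer_delim(mydat, c(v1, v2), delim = ", ")
out
```

Here v2 remains a character column, so convert it afterwards if a numeric column is wanted.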
Upvotes: 1
Reputation: 651
This should work, using the splitstackshape package:
install.packages("splitstackshape")  # only needed once
library(splitstackshape)
out <- concat.split.multiple(mydat, c("v1", "v2"), seps = ",", direction = "long")
out
#       v1 v2 v3
# 1:  name  1  1
# 2: name2  2  1
# 3: name3  3  2
# 4:  name4  4  3
# 5: name5  5  3
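More recent versions of splitstackshape also offer cSplit, which covers the same use case. A minimal sketch, assuming an installed version that exports cSplit and relying on its default whitespace stripping to handle the space after each comma:

```r
library(splitstackshape)  # assumption: a version that provides cSplit()

# Example data from the question
mydat <- data.frame(v1 = c("name, name2", "name3", "name4, name5"),
                    v2 = c("1, 2", "3", "4, 5"),
                    v3 = c(1, 2, 3),
                    stringsAsFactors = FALSE)

# Split v1 and v2 on "," and stack the pieces into additional rows
out <- cSplit(mydat, c("v1", "v2"), sep = ",", direction = "long")
out
```

The result is a data.table, so wrap it in as.data.frame() if a plain data.frame is needed.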
Upvotes: 6