Reputation: 1080
I have a data frame as such:
> df <- data.frame(var1 = c("1 Merge 2","3"),
+                  var2 = c("1","2 Merge 3"),
+                  var3 = "1")
> df
       var1      var2 var3
1 1 Merge 2         1    1
2         3 2 Merge 3    1
When ' Merge ' appears in an element, I'd like to split the row into one row per string before and after ' Merge '. I would like this to be generalised, so that an element such as 1 Merge 2 Merge 3 produces 3 rows.
Here is an example of my desired output for above:
> desired_df <- data.frame(var1 = c("1","2","3","3"),
+ var2 = c("1","1","2","3"),
+ var3 = c("1","1","1","1"))
> desired_df
  var1 var2 var3
1    1    1    1
2    2    1    1
3    3    2    1
4    3    3    1
Update:
Another case is when we have more than one column with ' Merge ':
> df
               var1      var2      var3
1 1 Merge 2 Merge 4         1 1 Merge 2
2                 3 2 Merge 3         1
> desired_df <- data.frame(var1 = c(1,1,2,2,4,4,3,3),
+ var2 = c(1,1,1,1,1,1,2,3),
+ var3 = c(1,2,1,2,1,2,1,1))
> desired_df
  var1 var2 var3
1    1    1    1
2    1    1    2
3    2    1    1
4    2    1    2
5    4    1    1
6    4    1    2
7    3    2    1
8    3    3    1
Solution:
We can use pipes and separate_rows() from tidyr, applied to each variable in turn:
library(tidyr)
df %>%
  separate_rows(var1, sep = " Merge ") %>%
  separate_rows(var2, sep = " Merge ") %>%
  separate_rows(var3, sep = " Merge ")
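As a quick check, the chained calls can be run against the data from the update and compared with the desired output. This is only a sketch: the data frames are rebuilt from the question so the snippet stands alone, and `result_num` is a name introduced here for the comparison. The split values come back as character strings, so they are converted to numbers before comparing.

```r
library(tidyr)

# Data from the question's update, rebuilt so the snippet is self-contained.
df <- data.frame(var1 = c("1 Merge 2 Merge 4", "3"),
                 var2 = c("1", "2 Merge 3"),
                 var3 = c("1 Merge 2", "1"))

desired_df <- data.frame(var1 = c(1, 1, 2, 2, 4, 4, 3, 3),
                         var2 = c(1, 1, 1, 1, 1, 1, 2, 3),
                         var3 = c(1, 2, 1, 2, 1, 2, 1, 1))

# Split one column at a time, so each split multiplies the rows
# produced by the previous one.
result <- df %>%
  separate_rows(var1, sep = " Merge ") %>%
  separate_rows(var2, sep = " Merge ") %>%
  separate_rows(var3, sep = " Merge ")

# separate_rows() leaves the pieces as character, so convert before comparing.
result_num <- as.data.frame(lapply(result, as.numeric))
isTRUE(all.equal(result_num, desired_df))
```

Splitting the columns one after another is what gives every combination of the pieces within a row.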
Upvotes: 1
Views: 92
Reputation: 34621
You can use tidyr::separate_rows().
library(tidyr)
df %>%
separate_rows(var1:var3, sep = " Merge ")
  var1 var2 var3
1    1    1    1
2    2    1    1
3    3    2    1
4    3    3    1
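One caveat, as far as I understand the behaviour: when several columns are passed to a single separate_rows() call, the pieces are lined up side by side within each row, so a row whose columns split into different numbers of pieces (other than 1) cannot be handled. The sketch below tries this on the data from the question's update and catches the error rather than assuming its exact message. `one_call` and `failed` are names made up for this illustration.

```r
library(tidyr)

# Expanded example from the question's update.
df <- data.frame(var1 = c("1 Merge 2 Merge 4", "3"),
                 var2 = c("1", "2 Merge 3"),
                 var3 = c("1 Merge 2", "1"))

# Row 1 has 3 pieces in var1 and 2 pieces in var3. A single call would have
# to pair them up, which I expect tidyr to refuse, so the call is wrapped
# in tryCatch() and the outcome is inspected afterwards.
one_call <- tryCatch(
  separate_rows(df, var1:var3, sep = " Merge "),
  error = function(e) e
)

# TRUE if the single call raised an error on this data.
failed <- inherits(one_call, "error")
failed
```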
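You can get your desired result with the expanded example by splitting one column at a time in a loop, although I'm sure there are better approaches:
for (i in seq_along(df)) {
  # all_of() makes it explicit that i is a column position, not a column name
  df <- separate_rows(df, all_of(i), sep = " Merge ")
}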
df
  var1 var2 var3
1    1    1    1
2    1    1    2
3    2    1    1
4    2    1    2
5    3    2    1
6    3    3    1
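One possible alternative to the explicit loop is to fold separate_rows() over the column names with base R's Reduce(). This is a sketch under a few assumptions: `split_all()` is a hypothetical helper name (not part of tidyr), every column is treated as a candidate for splitting, and the data is the expanded example from the question.

```r
library(tidyr)

# Expanded example from the question's update.
df <- data.frame(var1 = c("1 Merge 2 Merge 4", "3"),
                 var2 = c("1", "2 Merge 3"),
                 var3 = c("1 Merge 2", "1"))

# split_all() is a hypothetical helper, not a tidyr function.
# It applies separate_rows() to one column at a time, feeding the result
# of each step into the next, starting from the original data frame.
split_all <- function(data, sep = " Merge ") {
  Reduce(
    function(acc, col) separate_rows(acc, all_of(col), sep = sep),
    names(data),
    data
  )
}

result <- split_all(df)
result
```

Unlike the loop, this leaves `df` untouched and returns a new data frame. The values are still character strings; passing `convert = TRUE` to separate_rows() would be one way to get numbers instead.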
Upvotes: 1