Ali

Reputation: 1080

Loop through elements in data frame and add a row based on condition

I have a data frame as such:

> df <- data.frame(var1 = c("1 Merge 2 ","3"), 
+                  var2 = c("1","2 Merge 3"), 
+                  var3 = "1")
> df
        var1      var2 var3
1 1 Merge 2          1    1
2          3 2 Merge 3    1

When an element contains ' Merge ', I'd like to split its row into one row per string on either side of ' Merge '. This should generalise: an element such as 1 Merge 2 Merge 3 should produce 3 rows.

Here is an example of my desired output for above:

> desired_df <- data.frame(var1 = c("1","2","3","3"),
+                          var2 = c("1","1","2","3"),
+                          var3 = c("1","1","1","1"))
> desired_df
  var1 var2 var3
1    1    1    1
2    2    1    1
3    3    2    1
4    3    3    1

Update:

Another case is when we have more than one column with ' Merge ':

> df
               var1      var2      var3
1 1 Merge 2 Merge 4         1 1 Merge 2
2                 3 2 Merge 3         1
> desired_df <- data.frame(var1 = c(1,1,2,2,4,4,3,3),
+                          var2 = c(1,1,1,1,1,1,2,3),
+                          var3 = c(1,2,1,2,1,2,1,1))
> desired_df
  var1 var2 var3
1    1    1    1
2    1    1    2
3    2    1    1
4    2    1    2
5    4    1    1
6    4    1    2
7    3    2    1
8    3    3    1

Solution:

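Piping the data through tidyr's separate_rows() once per variable gives the desired output, because each call multiplies the rows produced by the previous one:

library(tidyr)

df %>% 
  separate_rows(var1, sep = " Merge ") %>% 
  separate_rows(var2, sep = " Merge ") %>% 
  separate_rows(var3, sep = " Merge ")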
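The chain above needs one line per column. As a minimal sketch of the same idea for any number of columns, the calls can be folded over the column names with Reduce(). The helper name split_all is hypothetical and not part of tidyr; the data frame is the expanded example from the update.

```r
library(tidyr)

# Expanded example from the question's update
df <- data.frame(var1 = c("1 Merge 2 Merge 4", "3"),
                 var2 = c("1", "2 Merge 3"),
                 var3 = c("1 Merge 2", "1"))

# Hypothetical helper: split one column at a time, feeding each result
# into the next call, so the rows multiply across columns
split_all <- function(data, sep = " Merge ") {
  Reduce(function(acc, col) separate_rows(acc, all_of(col), sep = sep),
         names(data), data)
}

result <- split_all(df)
result
```

The split values stay as character strings. separate_rows() also has a convert argument, which should turn them into numbers if that is what you need.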

Upvotes: 1

Views: 92

Answers (1)

lroha

Reputation: 34621

You can use tidyr::separate_rows().

library(tidyr)

df %>%
  separate_rows(var1:var3, sep = " Merge ")

  var1 var2 var3
1    1    1    1
2    2    1    1
3    3    2    1
4    3    3    1

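Passing several columns in one call splits them in parallel rather than producing every combination, so for the expanded example the columns have to be split one at a time. A loop does this, although I'm sure there are better approaches:

for (i in seq_along(df)) {
  # all_of() marks i as a column position rather than a column name
  df <- separate_rows(df, all_of(i), sep = " Merge ")
}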

df

  var1 var2 var3
1    1    1    1
2    1    1    2
3    2    1    1
4    2    1    2
5    4    1    1
6    4    1    2
7    3    2    1
8    3    3    1
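For comparison, here is a rough base R sketch that needs no packages. It swaps separate_rows() for strsplit() plus expand.grid(), so it is a different technique from the one in this answer. The function name expand_merge is hypothetical, and the sketch assumes every cell is non-empty.

```r
# Expanded example from the question's update
df <- data.frame(var1 = c("1 Merge 2 Merge 4", "3"),
                 var2 = c("1", "2 Merge 3"),
                 var3 = c("1 Merge 2", "1"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: expand each row into all combinations of its pieces
expand_merge <- function(data, sep = " Merge ") {
  rows <- lapply(seq_len(nrow(data)), function(r) {
    # Split every cell of row r on the literal separator and trim stray spaces
    pieces <- lapply(data[r, , drop = FALSE], function(cell) {
      trimws(strsplit(as.character(cell), sep, fixed = TRUE)[[1]])
    })
    # expand.grid() varies its first argument fastest, so reverse the list
    # to make the first column vary slowest, then restore the column order
    grid <- expand.grid(rev(pieces), stringsAsFactors = FALSE)
    grid[, names(data), drop = FALSE]
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

result <- expand_merge(df)
result
```

The trimws() call also tidies values such as "1 Merge 2 " in the first example, where a trailing space would otherwise survive the split.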

Upvotes: 1
