R: splitting dataframe into distinct subgroups containing sequence of groups

Question

This question is similar to one already answered: R: Splitting dataframe into subgroups consisting of every consecutive 2 groups

However, rather than splitting into subgroups that share a type, I need non-overlapping subgroups, each containing two consecutive types. The groups in my actual data also have differing numbers of rows.

df <- data.frame(ID=c('1','1','1','1','1','1','1'), Type=c('a','a','b','c','c','d','d'), value=c(10,2,5,3,7,3,9))

   ID Type value
1  1    a    10
2  1    a     2
3  1    b     5
4  1    c     3
5  1    c     7
6  1    d     3
7  1    d     9

So subgroup 1 would be Type a and b:

   ID Type value
1  1    a    10
2  1    a     2
3  1    b     5

And subgroup 2 would be Type c and d:

   ID Type value
4  1    c     3
5  1    c     7
6  1    d     3
7  1    d     9

I have tried manipulating the code from this previous example, but I can't figure out how to make this happen without having overlapping Types in each group. Any help would be greatly appreciated - thanks!

EDIT: thanks for pointing out I didn't actually include the correct link.

Rui Barradas · Accepted Answer

Here is an rle-based approach, written as a function. Pass the data frame and the name of the column to split on as a character string.
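
To make the idea concrete before the function itself, here is a minimal walkthrough on the Type column from the question. The variable names types, r and grp are only illustrative and are not part of the answer's function.

```r
# The Type column from the question
types <- c('a','a','b','c','c','d','d')

# Run-length encoding: one entry per run of identical values
r <- rle(types)
r$lengths  # 2 1 2 2
r$values   # "a" "b" "c" "d"

# Overwrite every second run label with the label of the run before it,
# so runs 1-2 share a label and runs 3-4 share a label
r$values[c(FALSE, TRUE)] <- r$values[c(TRUE, FALSE)]

# Expand back to one label per row; this is the grouping vector for split()
grp <- inverse.rle(r)
grp        # "a" "a" "a" "c" "c" "c" "c"
```

Rows 1 to 3 end up labelled "a" and rows 4 to 7 labelled "c", which matches the two subgroups asked for.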

df <- data.frame(ID=c('1','1','1','1','1','1','1'), 
                 Type=c('a','a','b','c','c','d','d'), 
                 value=c(10,2,5,3,7,3,9))

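split_two <- function(x, col) {
  # Run-length encode the column: one entry per run of identical values
  r <- rle(x[[col]])
  # Relabel each even-numbered run with the label of the run before it
  r$values[c(FALSE, TRUE)] <- r$values[c(TRUE, FALSE)]
  # Expand back to one label per row and split on it
  split(x, inverse.rle(r))
}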
split_two(df, "Type")
#> $a
#>   ID Type value
#> 1  1    a    10
#> 2  1    a     2
#> 3  1    b     5
#> 
#> $c
#>   ID Type value
#> 4  1    c     3
#> 5  1    c     7
#> 6  1    d     3
#> 7  1    d     9

Created on 2023-02-09 with reprex v2.0.2
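
Two cases go beyond the example in the question and may or may not matter for the real data. If a type shows up again later in the column, split_two() puts both occurrences into the same list element, because it groups by label. And with an odd number of runs, the replacement lengths no longer match, which R reports as a warning. The following is a sketch of a variant that groups runs by position instead; split_pairs, df2 and res are hypothetical names used only for illustration, and the repeated-type data is made up for the demonstration.

```r
# Hypothetical variant: pair up consecutive runs by position, not by label
split_pairs <- function(x, col) {
  # as.character() is an added assumption so factor columns also work
  r <- rle(as.character(x[[col]]))
  # Run i belongs to pair ceiling(i / 2): runs 1-2 -> 1, runs 3-4 -> 2, ...
  pair_id <- rep(ceiling(seq_along(r$lengths) / 2), times = r$lengths)
  split(x, pair_id)
}

# Illustrative data: types 'a' and 'b' appear a second time at the end
df2 <- data.frame(Type = c('a','a','b','c','c','d','d','a','b'),
                  value = c(10, 2, 5, 3, 7, 3, 9, 4, 6))

res <- split_pairs(df2, "Type")
names(res)            # "1" "2" "3"
sapply(res, nrow)     # 3 4 2
```

The list elements are named by pair number rather than by the first type in the pair, so the second a/b pair stays separate from the first. An unpaired final run simply becomes its own last element.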
