Split a dataframe into multiple dataframes based on specific row value in R

Question

I need to split a dataframe into 17,872 dataframes based on a header row that recurs throughout the dataframe, and store the resulting dataframes in a list.

My dataframe looks like:

        0        1             2      
.      .         .             .         
.      .         .             .          
.      .         .             .          
.      .         .             .         
32   Alert     Type      Response       
33     w1        x1            y1       
34     w2        x2            y2       
.      .         .             .        
.      .         .             .        
.      .         .             .        
.      .         .             .        
144 Alert     Type      Response        
145   a1        b1            c1         
146   a2        b2            c2

I want to create a new dataframe every time that the row containing "Alert, Type, Response" appears.

I have a hacky way of getting the outcome: I create vectors containing the start and end row indices that determine where each dataframe should start and stop, and then use lapply.

list_data <- lapply(seq_along(start_rows),
                    function(i) data[start_rows[i]:end_rows[i], ])

This works, but I am looking for a way to do it without having to build the row-index vectors by hand, as I have 50 other dataframes that also need to be split into 17,000+ smaller dataframes.

Bart · Accepted Answer

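You are probably looking for the split function. I made a small example in which a new group starts every time the b column equals "a": cumsum over that comparison gives every row a group number, and split then cuts the dataframe on that number.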

(d <- data.frame(a = 1:10, b = sample(letters[1:3], replace = TRUE, size = 10)))
#>     a b
#> 1   1 a
#> 2   2 a
#> 3   3 c
#> 4   4 b
#> 5   5 c
#> 6   6 b
#> 7   7 c
#> 8   8 b
#> 9   9 c
#> 10 10 a
d$f <- cumsum(d$b == 'a')  # group id goes up by 1 at every row where b is "a"
lst <- split(d, d$f)       # one dataframe per group id
lst
#> $`1`
#>   a b f
#> 1 1 a 1
#> 
#> $`2`
#>   a b f
#> 2 2 a 2
#> 3 3 c 2
#> 4 4 b 2
#> 5 5 c 2
#> 6 6 b 2
#> 7 7 c 2
#> 8 8 b 2
#> 9 9 c 2
#> 
#> $`3`
#>     a b f
#> 10 10 a 3

Created on 2021-10-05 by the reprex package (v2.0.1)
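
The same idea can be carried over to the layout in the question, where the marker is a whole row ("Alert", "Type", "Response") rather than a single value. What follows is only a sketch under assumptions: the toy dataframe, its column names X0, X1 and X2, and the helper name split_on_header are all made up for illustration, and the header is assumed to sit in the first three columns with no NA values in them.

```r
# Hypothetical stand-in for the data in the question: the header row
# ("Alert", "Type", "Response") recurs inside the dataframe.
data <- data.frame(
  X0 = c("junk", "Alert", "w1", "w2", "Alert", "a1", "a2"),
  X1 = c("junk", "Type", "x1", "x2", "Type", "b1", "b2"),
  X2 = c("junk", "Response", "y1", "y2", "Response", "c1", "c2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split df at every row whose first three columns
# match `header`. Assumes those columns contain no NA values.
split_on_header <- function(df, header = c("Alert", "Type", "Response")) {
  # TRUE on every row that matches the header in all three columns
  is_header <- df[[1]] == header[1] &
    df[[2]] == header[2] &
    df[[3]] == header[3]

  # Running count of header rows; rows before the first header get 0
  group <- cumsum(is_header)

  # Drop anything before the first header, then split on the group id
  keep <- group > 0
  split(df[keep, , drop = FALSE], group[keep])
}

list_data <- split_on_header(data)
length(list_data)
```

Each element of list_data starts with its own header row. Because the logic lives in a function, it could be applied to a list of the other dataframes with lapply, with no start and end row vectors to maintain.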
