BadAtCoding

Reputation: 339

Split a data frame into multiple parts based on indices in R

I have this data frame:

a <- rep(c("Like", "James", "2 weeks ago", "Jenni", "a month ago", "Max", "Max", "2 reviews · 2 photos", "3 months ago"), 
each=3)
b <- data.frame(a)
b

I want to split it into separate data frames based on whether or not a row contains "* ago", so that I end up with several data frames where the "* ago" row is the last line in each one, like this:

d <- c("Like", "James", "2 weeks ago")
e <- data.frame(d)
f <- c("Jenni", "a month ago")
g <- data.frame(f)
h <- c("Max", "Max", "2 reviews · 2 photos", "3 months ago")
i <- data.frame(h)

EXPECTED OUTPUT:

    d
Like
James
2 weeks ago


   f
Jenni
a month ago


    h
Max
Max
2 reviews · 2 photos
3 months ago

I have created an integer vector that contains the indices of the rows that contain "* ago":

c <- grep(" ago", b$a)
c

which can be used as an input into a function to split the data frame. I have been looking at the split function from base R but can't work out how to input my indices. If there is a better function than using split, I am happy to try it.

Upvotes: 0

Views: 590

Answers (1)

markus

Reputation: 26373

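Since your data is repetitive, we can first call unique and then do the splitting based on idx. Note that unique also drops the second "Max", so the last group below has three rows rather than the four in your expected output.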

(idx <- cumsum(grepl(" ago$", unique(b)$a)) - grepl(" ago$", unique(b)$a))
#[1] 0 0 0 1 1 2 2 2


split(unique(b), idx)
#$`0`
#            a
#1        Like
#4       James
#7 2 weeks ago
#
#$`1`
#             a
#10       Jenni
#13 a month ago
#
#$`2`
#                      a
#16                  Max
#22 2 reviews · 2 photos
#25         3 months ago

The idea for how to create idx was taken from @joran's comment on this answer.
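
If the duplicate "Max" needs to survive, one option is to skip unique and apply the same idx trick directly. Below is a minimal sketch, assuming the data is the question's vector without the each = 3 repetition (an assumption about what the expected output was built from). The names is_end, parts, ends and idx2 are introduced here purely for illustration. The second half derives the same grouping from the integer positions that grep returns, using base R's findInterval.

```r
# Assumption: the question's values without the each = 3 repetition
a <- c("Like", "James", "2 weeks ago", "Jenni", "a month ago",
       "Max", "Max", "2 reviews · 2 photos", "3 months ago")
b <- data.frame(a)

# TRUE for every row that closes a group
is_end <- grepl(" ago$", b$a)

# Group id = number of group-closing rows strictly before the current row
idx <- cumsum(is_end) - is_end

# List of data frames, one per group, each ending with its "* ago" row
parts <- split(b, idx)

# Same grouping starting from the integer indices, as in the question
ends <- grep(" ago$", b$a)
idx2 <- findInterval(seq_len(nrow(b)) - 1, ends)
parts2 <- split(b, idx2)
```

Subtracting is_end shifts each closing row back into the group it ends, so every "* ago" row lands last in its piece. If separate objects are wanted rather than a list, elements can be pulled out with parts[["0"]], parts[["1"]] and so on.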

Upvotes: 2
