Reputation: 339
I have this data frame:
a <- rep(c("Like", "James", "2 weeks ago", "Jenni", "a month ago", "Max", "Max", "2 reviews · 2 photos", "3 months ago"),
each=3)
b <- data.frame(a)
b
I want to split it into separate data frames based on whether or not a row contains "* ago", so that I end up with several data frames where the "* ago" row is the last line in each one, like:
d <- c("Like", "James", "2 weeks ago")
e <- data.frame(d)
f <- c("Jenni", "a month ago")
g <- data.frame(f)
h <- c("Max", "Max", "2 reviews · 2 photos", "3 months ago")
i <- data.frame(h)
EXPECTED OUTPUT:
d
Like
James
2 weeks ago
f
Jenni
a month ago
h
Max
Max
2 reviews · 2 photos
3 months ago
I have created an integer vector that contains the indices of the rows that contain "* ago":
c <- grep(" ago", b$a)
c
which can be used as an input into a function to split the data frame. I have been looking at the split function from base R but can't work out how to input my indices. If there is a better function than using split, I am happy to try it.
Upvotes: 0
Views: 590
Reputation: 26373
Since your data is repetitive, we can first call unique and then do the splitting based on idx.
(idx <- cumsum(grepl(" ago$", unique(b)$a)) - grepl(" ago$", unique(b)$a))
#[1] 0 0 0 1 1 2 2 2
split(unique(b), idx)
#$`0`
# a
#1 Like
#4 James
#7 2 weeks ago
#
#$`1`
# a
#10 Jenni
#13 a month ago
#
#$`2`
# a
#16 Max
#22 2 reviews · 2 photos
#25 3 months ago
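To see why the idx one-liner works, it may help to build it in steps. The following is only a sketch on the question's data; the intermediate names u, is_end, ends_so_far and pieces are made up here for illustration and are not part of the original answer.

```r
# Rebuild the example data from the question
a <- rep(c("Like", "James", "2 weeks ago", "Jenni", "a month ago",
           "Max", "Max", "2 reviews · 2 photos", "3 months ago"), each = 3)
b <- data.frame(a)

u <- unique(b)                   # drop repeated rows (this also drops the second "Max")
is_end <- grepl(" ago$", u$a)    # TRUE on rows that close a group
ends_so_far <- cumsum(is_end)    # closing rows seen up to and including each row
idx <- ends_so_far - is_end      # shift by one so a closing row stays in the group it ends

pieces <- split(u, idx)          # one data frame per group, named "0", "1", "2"
```

Without the subtraction, each "ago" row would already carry the next group's number and would end up as the first row of the following piece instead of the last row of its own.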
The idea as to how to create the idx was taken from @joran's comment on this answer.
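Note that unique also removes the second "Max", which the expected output in the question keeps. If that matters, one possible variant is sketched below. It rests on two assumptions that hold for the example data but may not hold in general: every value is repeated exactly 3 times in a row (because of each = 3), and the last row is itself an "ago" row. The names b1, ends, grp and pieces2 are made up for this sketch. It also uses the integer indices from grep directly, as the question asked.

```r
# Rebuild the example data from the question
a <- rep(c("Like", "James", "2 weeks ago", "Jenni", "a month ago",
           "Max", "Max", "2 reviews · 2 photos", "3 months ago"), each = 3)
b <- data.frame(a)

# Assumption: each value occurs exactly 3 times in a row, so keeping every
# third row undoes the repetition without losing the second "Max"
b1 <- b[seq(1, nrow(b), by = 3), , drop = FALSE]

# Integer positions of the rows that end a group
ends <- grep(" ago$", b1$a)

# Assumption: the last row is an "ago" row, so the group sizes sum to nrow(b1)
grp <- rep(seq_along(ends), times = diff(c(0, ends)))

pieces2 <- split(b1, grp)        # list of data frames named "1", "2", "3"
```

Here diff(c(0, ends)) gives the size of each group, and rep turns those sizes into one group label per row, which is the form split expects.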
Upvotes: 2