Reputation: 6874
I have some data as follows
Team Date Text
1 13/12/01 mytext1
2 12/01/13 mytext2
2 05/02/13 mytext3
3 14/02/14 mytext4
3 16/02/14 mytext5
3 01/03/14 mytext6
4 29/03/12 mytext7
I would like to merge rows where the difference between the Date column between two rows within a Team is <29 which should result in:
Team Date Text
1 13/12/01 mytext1
2,2 12/01/13,05/02/13 mytext2,mytext3
3,3,3 14/02/14,16/02/14,16/02/14 mytext4,mytext5,mytext6
4 29/03/12 mytext7
I am stuck on how to merge according to this condition. So far I have:
df%>%
mutate(diff_days = difftime(Date , lag(Date ),units="days"))%>%
group_by(Team,diff_days)%>%
summarize_all(paste,collapse=", ")
but how to implement my condition of merging rows?
Upvotes: 2
Views: 372
Reputation: 886948
We can also use
library(dplyr)
df %>%
mutate(Date = as.Date(Date, '%d/%m/%y')) %>%
group_by(Team) %>%
group_by(grp = cumsum(Date -
lag(Date, default = first(Date)) > 29), .add = TRUE) %>%
summarise(across(c(Date, Text), toString), .groups = 'drop') %>%
select(-grp)
Upvotes: 0
Reputation: 388817
Within each Team
you can create a new group whenever a difference between two consecutive dates is greater than 29. For each group collapse the Date
and Text
column in one comma-separated string.
library(dplyr)
df %>%
mutate(Date = lubridate::dmy(Date)) %>%
group_by(Team) %>%
group_by(grp = cumsum(Date -
lag(Date, default = first(Date)) > 29), .add = TRUE) %>%
summarise(across(c(Date, Text), toString)) %>%
select(-grp) %>%
ungroup
# Team Date Text
# <int> <chr> <chr>
#1 1 2001-12-13 mytext1
#2 2 2013-01-12, 2013-02-05 mytext2, mytext3
#3 3 2014-02-14, 2014-02-16, 2014-03-01 mytext4, mytext5, mytext6
#4 4 2012-03-29 mytext7
Upvotes: 3