Reputation: 1878
Is there any way to find gaps between multiple timelines. For example my data looks like the following:
library(plyr);library(dplyr)
library(googleVis)
df <- data.frame(Language = structure(c(rep("English",7), rep("German",5), rep("French", 10)), class = "character"),
Students = c(LETTERS[1:7], LETTERS[1:5], LETTERS[1:10]),
Start = structure(c(16713,16713,16713,16744,16713,16714,16754,16729,16729,16729,16750,16769,
16724,16724,16745,16724,16759,16766,16723,16722,16736,16796), class = "Date"),
End = structure(c(16762,16720,16762,16755,16720,16764,16762,16765,16765,16749,16761,16770,16758,
16744,16758,16764,16765,16766,16726,16723,16758,16806), class = "Date"))
ddply(df, .(Language), summarise,
FirstDay = min(Start),
LastDay = max(End),
Duration = LastDay - FirstDay)
plot(gvisTimeline(data=df, rowlabel = "Class", start = "Start", end = "End", options=list(width=600, height=1000) ))
I am after calculating gaps when none students are taking a class. Gaps are highlighted in red in the following chart.
Upvotes: 3
Views: 374
Reputation: 215117
This is a fairly classic problem. The solution about this is to filter rows based on if the start date is larger than previous maximum end date assuming rows are sorted by start date before hand. lag
function and cummax()
can be used to find out previous max end date, and since cummax()
is not defined for Date class, we can convert it to integer, apply cummax
and then convert it back:
library(dplyr)
df %>%
arrange(Start) %>% group_by(Language) %>%
mutate(End_Max = lag(as.Date(cummax(as.integer(End)), "1970-01-01"))) %>%
filter(Start > End_Max + 1) %>% select(Language, End_Max, Start)
# Source: local data frame [2 x 3]
# Groups: Language [2]
# Language End_Max Start
# <fctr> <date> <date>
#1 German 2015-11-26 2015-11-30
#2 French 2015-11-27 2015-12-27
Upvotes: 5