Reputation: 51
I have a list of Lectures for a university course stored in a data-frame. This is a large complex table with over 1000 rows. I have used simple time in the example, but this is actually date time in the format %d %b %Y %H:%M. I think I should be able to extrapolate to the more complex usage.
essentially:
ModuleCode1 ModuleName Lecturer StartTime EndTime Course
11A Hist1 Bob 10:30 12:30 Hist
13A Hist2 Bob 14:30 15:30 Hist
13C Hist3 Steve 11:45 12:45 Hist
15B Hist4 Bob 09:40 10:40 Hist
17B Hist5 Bob 14:00 15:00 Hist
I am trying to create an output data frame which determines which modules clash in the timetable and at which times. For example:
ModuleCode1 StartTime EndTime ModuleCode2 StartTime EndTime
11A 10:30 12:30 15B 09:40 10:40
11A 10:30 12:30 13C 11:45 12:45
13A 10:30 12:30 17B 14:00 15:00
There are a multitude of questions on date time overlaps, but the ones that I can find seem to either work with 2 dataframes, or I can't understand them. I have come across the lubridate and IRanges packages, but cannot work out this specific implementation with date time in a single data frame. It seems as though something which would be generally useful, and most likely would have a simple implementation I am missing. Grateful for any help.
Upvotes: 0
Views: 65
Reputation: 270170
Here is an sqldf solution. The intervals do NOT overlap iff a.StartTime > b.EndTime or a.EndTime < b.StartTime so they do overlap exactly when the negation of this statement is true, hence:
library(sqldf)
sqldf("select a.ModuleCode1, a.StartTime, a.EndTime, b.ModuleCode1, b.StartTime, b.EndTime
from DF a join DF b on a.ModuleCode1 < b.ModuleCode1 and
a.StartTime <= b.EndTime and
a.EndTime >= b.StartTime")
giving:
ModuleCode1 StartTime EndTime ModuleCode1 StartTime EndTime
1 11A 10:30 12:30 13C 11:45 12:45
2 11A 10:30 12:30 15B 09:40 10:40
3 13A 14:30 15:30 17B 14:00 15:00
Note: The input in reproducible form is:
Lines <- "ModuleCode1 ModuleName Lecturer StartTime EndTime Course
11A Hist1 Bob 10:30 12:30 Hist
13A Hist2 Bob 14:30 15:30 Hist
13C Hist3 Steve 11:45 12:45 Hist
15B Hist4 Bob 09:40 10:40 Hist
17B Hist5 Bob 14:00 15:00 Hist"
DF <- read.table(text = Lines, header = TRUE)
Upvotes: 1