Reputation: 13
I'm a little out of my depth with mapply() and do.call...
So I have two lists like so:
ID START END
a1 1/1/2000 1/30/2000
a2 5/4/2000 3/1/2002
a3 5/8/2004 8/7/2005
a4 1/3/2012 5/7/2015
ID START END
b1 5/1/2000 1/30/2020
b2 6/4/2007 3/1/2008
b3 5/8/2014 8/7/2015
b4 1/3/1999 5/7/2019
Many of the dates overlap with each other, and that's what I'm trying to identify. I'm trying to create a column for each entry on the second list onto the first that says whether or not the date ranges overlap...
ID START END b1 b2 b3 b4
a1 1/1/2000 1/30/2000 0 0 0 1
a2 5/4/2000 3/1/2002 1 0 0 1
a3 5/8/2004 8/7/2005 1 0 0 1
a4 1/3/2012 5/7/2015 1 0 1 1
where a 0 represents un-overlapping events, and 1 represents overlap.
My effort so far has been to use dplyr mutate in a function with multiple variables. Then I'm trying to use mapply to feed the whole lists in as those variables...
builder <- function(id,start,finish){
resource_const_all <- resource_const %>%
mutate(id = ifelse(start > START & start < END,"1",
ifelse(finish > START & finish < END, "1",
ifelse(start < START & finish > END, "1", "0"))))
}
###if the start date falls in the date range, it returns 1.
###if the end date falls in the date range, it returns 1.
###if the start date is before the date range and the end date is after, it
###returns 1.
###Else the dates don't overlap, returns 0.
builder_output <- mapply(builder,id_list,start_list,end_list))
Thanks for any help!
Upvotes: 0
Views: 87
Reputation: 269654
Assume the data shown reproducibly in the Note at the end where we ensure that the START
and END
columns are of Date
class. Then use outer
as shown.
Note that overlap
is a generic test and overlapAB
makes it specific to A
and B
.
No packages are used.
overlap <- function(start1, end1, start2, end2) {
(start1 >= start2 & start1 <= end2) | (start2 >= start1 & start2 <= end1)
}
overlapAB <- function(idA, idB) {
i <- match(idA, A$ID)
j <- match(idB, B$ID)
overlap(A$START[i], A$END[i], B$START[j], B$END[j])
}
cbind(A, +outer(A$ID, B$ID, overlapAB))
giving:
ID START END b1 b2 b3 b4
1 a1 2000-01-01 2000-01-30 0 0 0 1
2 a2 2000-05-04 2002-03-01 1 0 0 1
3 a3 2004-05-08 2005-08-07 1 0 0 1
4 a4 2012-01-03 2015-05-07 1 0 1 1
LinesA <- "ID START END
a1 1/1/2000 1/30/2000
a2 5/4/2000 3/1/2002
a3 5/8/2004 8/7/2005
a4 1/3/2012 5/7/2015"
LinesB <- "ID START END
b1 5/1/2000 1/30/2020
b2 6/4/2007 3/1/2008
b3 5/8/2014 8/7/2015
b4 1/3/1999 5/7/2019"
fmt <- "%m/%d/%Y"
A <- read.table(text = LinesA, header = TRUE, as.is = TRUE)
A$START <- as.Date(A$START, fmt)
A$END <- as.Date(A$END, fmt)
B <- read.table(text = LinesB, header = TRUE, as.is = TRUE)
B$START <- as.Date(B$START, fmt)
B$END <- as.Date(B$END, fmt)
Upvotes: 1