Sergei Lukyanov

Reputation: 31

How to get non-overlapping dates?

How can I vectorize this code?

dates <- list(as.Date(c("2000-02-08", "2000-02-11")),
              as.Date(c("2000-03-02", "2000-03-07")),
              as.Date(c("2000-03-02", "2000-03-07")),
              as.Date(c("2000-03-03", "2000-03-07")),
              as.Date(c("2000-03-16", "2000-03-30")),
              as.Date(c("2000-03-16")))

i <- 2
while (i <= length(dates)) {
  if (dates[[i]][1] < dates[[i - 1]][2]) {
    dates[[i]] <- NULL
    i <- i - 1
  }
  i <- i + 1
}

I would like to keep only those pairs of dates that don't intersect.

Date1 = as.Date(c("2000-03-02", "2000-03-07"))
Date2 = as.Date(c("2000-03-03", "2000-03-07"))

For example, if Date2 falls within the range of Date1, then Date2 is removed.

Upvotes: 1

Views: 165

Answers (2)

Tensibai

Reputation: 15784

With foverlaps from the data.table package:

dates = list(as.Date(c("2000-02-08", "2000-02-11")),
            as.Date(c("2000-03-02", "2000-03-07")),
            as.Date(c("2000-03-02", "2000-03-05")),
            as.Date(c("2000-03-09", "2000-03-15")),
            as.Date(c("2000-03-16", "2000-03-30")),
            as.Date(c("2000-03-16")))

library(data.table)

# rbind drops the Date class, so the columns V1 and V2 hold day counts
dt <- as.data.table(do.call(rbind, dates))
setkey(dt)
# Get the ids of the ranges that lie within others
tmp <- foverlaps(dt, dt, which = TRUE, type = "within")[, xid]
# Summarize: count how often each id appears
counts <- table(tmp)

# Keep the ranges that appear only once, i.e. are not included in another one.
res <- dt[as.integer(names(counts[counts == 1])), ]
# Not absolutely necessary, but this restores the Date objects that the rbind call converted to numbers.
res[, `:=`(V1 = as.Date(V1, origin = "1970-01-01"), V2 = as.Date(V2, origin = "1970-01-01"))][]

Output (slightly different as I added cases):

           V1         V2
1: 2000-02-08 2000-02-11
2: 2000-03-02 2000-03-07
3: 2000-03-09 2000-03-15
4: 2000-03-16 2000-03-30

To exclude every range that intersects another one, set type="any" in the foverlaps call, which gives this output:

           V1         V2
1: 2000-02-08 2000-02-11
2: 2000-03-09 2000-03-15
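
A minimal sketch of how the two variants above could be wrapped in one function. keep_isolated is a hypothetical helper name, not part of data.table, and the unique() call is an assumption of mine: two identical ranges each lie within the other, so without it both would be counted twice and dropped. The single-date entry is left out here so that rbind does not have to recycle a value.

```r
library(data.table)

# Hypothetical helper (not part of data.table): keeps the ranges that match
# no range other than themselves under the given foverlaps() overlap type.
keep_isolated <- function(dt, type = c("within", "any")) {
  type <- match.arg(type)
  dt <- unique(copy(dt))  # assumption: exact duplicates count as one range
  setkey(dt)              # key on both columns, as in the code above
  hits <- foverlaps(dt, dt, which = TRUE, type = type)
  counts <- hits[, .N, by = xid]  # every range matches itself at least once
  dt[counts[N == 1, xid]]
}

dates <- list(as.Date(c("2000-02-08", "2000-02-11")),
              as.Date(c("2000-03-02", "2000-03-07")),
              as.Date(c("2000-03-02", "2000-03-05")),
              as.Date(c("2000-03-09", "2000-03-15")),
              as.Date(c("2000-03-16", "2000-03-30")))
dt <- as.data.table(do.call(rbind, dates))

within_res <- keep_isolated(dt, "within")  # drops only 2000-03-02 to 2000-03-05
any_res    <- keep_isolated(dt, "any")     # drops both ranges starting 2000-03-02
```

The result columns are still numeric day counts; they can be converted back with as.Date(V1, origin="1970-01-01") as shown earlier.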

Upvotes: 1

Wietze314

Reputation: 6020

It depends on which direction you look in. In my example, a row is removed if its start date falls inside the range of any later row (I only check the start date for now, but you can extend that).

dates = list(as.Date(c("2000-02-08", "2000-02-11")),
             as.Date(c("2000-03-02", "2000-03-07")),
             as.Date(c("2000-03-02", "2000-03-07")),
             as.Date(c("2000-03-03", "2000-03-07")),
             as.Date(c("2000-03-16", "2000-03-30")),
             as.Date(c("2000-03-16")))

m <- do.call(rbind, dates)

# A row is removed when its start date falls inside the range of a later row.
rem <- sapply(seq_len(nrow(m)), function(x) {
  any(which(m[x, 1] < m[, 2] & m[x, 1] >= m[, 1]) > x)
})

m[!rem, ]
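
One possible way to extend the check from the start date to the whole range, sketched in base R with outer() in place of the sapply loop. This is only an illustration under my own assumptions: ranges are treated as closed (sharing a single day counts as overlapping), and the single-date entry is left out so that rbind does not have to recycle a value.

```r
dates <- list(as.Date(c("2000-02-08", "2000-02-11")),
              as.Date(c("2000-03-02", "2000-03-07")),
              as.Date(c("2000-03-02", "2000-03-07")),
              as.Date(c("2000-03-03", "2000-03-07")),
              as.Date(c("2000-03-16", "2000-03-30")))

m <- do.call(rbind, dates)  # numeric matrix: column 1 = start, column 2 = end
start <- m[, 1]
end   <- m[, 2]

# overlaps[i, j] is TRUE when range i and range j share at least one day.
overlaps <- outer(start, end, "<=") & outer(end, start, ">=")

# Only compare each row with the rows that come after it.
later <- upper.tri(overlaps)

# A row is removed when it overlaps any later row.
rem <- rowSums(overlaps & later) > 0

kept <- m[!rem, , drop = FALSE]
# Convert the day counts back to dates for display.
data.frame(start = as.Date(kept[, 1], origin = "1970-01-01"),
           end   = as.Date(kept[, 2], origin = "1970-01-01"))
```

With this data, rows 1, 4 and 5 are kept. Using lower.tri() instead of upper.tri() would compare each row with the earlier rows, so the first of several overlapping ranges would survive instead of the last.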

Upvotes: 0
