Number of overlapping datetimes inside the same table (R)

Question

I have a table of about 50 000 rows, with four columns.

ID     Arrival             Departure             Gender

1   10/04/2015 23:14    11/04/2015 00:21           F
1   11/04/2015 07:59    11/04/2015 08:08           F
3   10/04/2017 21:53    30/03/2017 23:37           M
3   31/03/2017 07:09    31/03/2017 07:57           M
3   01/04/2017 01:32    01/04/2017 01:35           M
3   01/04/2017 13:09    01/04/2017 14:23           M
6   10/04/2015 21:31    10/04/2015 23:17           F
6   10/04/2015 23:48    11/04/2015 00:05           F
6   01/04/2016 21:45    01/04/2016 22:48           F
6   02/04/2016 04:54    02/04/2016 07:38           F
6   04/04/2016 18:41    04/04/2016 22:48           F
10  10/04/2015 22:39    11/04/2015 00:42           M
10  13/04/2015 02:57    13/04/2015 03:07           M
10  31/03/2016 22:29    01/04/2016 08:39           M
10  01/04/2016 18:49    01/04/2016 19:44           M
10  01/04/2016 22:28    02/04/2016 00:31           M
10  05/04/2017 09:27    05/04/2017 09:28           M 
10  06/04/2017 15:12    06/04/2017 15:43           M

This is a very small sample of the table. For each entry, I want to find out how many other people were present at the same time, and then split that count by gender. For example, suppose that during the first visit of the person with ID 1, the person with ID 6 was present, and the person with ID 10 was present twice within the same interval. That would mean 2 other people overlapped with that visit, so the person with ID 1 overlapped with 1 male and 1 female.

So the result should look like:

ID           Arrival            Departure         Males encountered        Females encountered
1       10/04/2015 23:14    11/04/2015 00:21             1                          1

How can I calculate this? I have tried to work with foverlaps without success. I managed to solve it in Excel, but I would like to do it in R.

PavoDive · Accepted Answer

Here is a data.table solution using foverlaps.

First, notice that there's an error in your data:

ID           Arrival           Departure      Gender
3   10/04/2017 21:53    30/03/2017 23:37           M

The recorded arrival is about eleven days later than the departure. I had to drop that row for foverlaps to run, because it requires every interval to start no later than it ends.

library(data.table)

dt <- data.table(df)
dt <- dt[Departure > Arrival, ]  # filter wrong cases

setkey(dt, "Arrival", "Departure")  # prepare for foverlaps
dt2 <- copy(dt)  # use a different dt, inherits the key
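The filter above only behaves as intended if Arrival and Departure are real date-time values. If they were read in as text, the comparison is alphabetical, and the faulty row would slip through because "30/..." sorts after "10/...". A minimal sketch of the conversion, assuming the day/month/year layout shown in the question; the two-row `df`, the `fmt` string, the UTC time zone and the name `valid` are illustrative assumptions, not part of the original data:

```r
library(data.table)

# Hypothetical two-row sample in the question's day/month/year layout
df <- data.frame(
  ID        = c(1, 3),
  Arrival   = c("10/04/2015 23:14", "10/04/2017 21:53"),
  Departure = c("11/04/2015 00:21", "30/03/2017 23:37"),
  Gender    = c("F", "M"),
  stringsAsFactors = FALSE
)

dt <- data.table(df)

# Assumed format string and time zone; adjust to the real data
fmt <- "%d/%m/%Y %H:%M"
dt[, Arrival := as.POSIXct(Arrival, format = fmt, tz = "UTC")]
dt[, Departure := as.POSIXct(Departure, format = fmt, tz = "UTC")]

# With proper date-times, the reversed interval (ID 3) is filtered out
valid <- dt[Departure > Arrival]
```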

Run foverlaps, then keep only the pairs where the second person arrived no later than the ID in question and where the two IDs differ (so a user is never matched with themselves). For those pairs, count the simultaneous male guests in one variable and the simultaneous female guests in another, all grouped by ID and arrival:

simultaneous <- foverlaps(dt, dt2)[i.Arrival <= Arrival & ID != i.ID,
                                       .(malesEncountered = sum(i.Gender == "M"),
                                         femalesEncountered = sum(i.Gender == "F")), 
                                       by = .(ID, Arrival)]
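To see what this step produces, here is a small self-contained sketch that runs the same call on four visits taken from the question's rows for 10/04/2015. The sample table, the `fmt` string, the UTC time zone and the intermediate name `pairs` are assumptions made for illustration:

```r
library(data.table)

# Hypothetical four-visit sample (rows from the question, 10/04/2015 only)
fmt <- "%d/%m/%Y %H:%M"
dt <- data.table(
  ID        = c(1, 6, 6, 10),
  Arrival   = as.POSIXct(c("10/04/2015 23:14", "10/04/2015 21:31",
                           "10/04/2015 23:48", "10/04/2015 22:39"),
                         format = fmt, tz = "UTC"),
  Departure = as.POSIXct(c("11/04/2015 00:21", "10/04/2015 23:17",
                           "11/04/2015 00:05", "11/04/2015 00:42"),
                         format = fmt, tz = "UTC"),
  Gender    = c("F", "F", "F", "M")
)
setkey(dt, Arrival, Departure)
dt2 <- copy(dt)

# Every overlapping pair of visits; columns from dt get the "i." prefix
pairs <- foverlaps(dt, dt2)

# Keep other users who arrived no later, then count them by gender
simultaneous <- pairs[i.Arrival <= Arrival & ID != i.ID,
                      .(malesEncountered = sum(i.Gender == "M"),
                        femalesEncountered = sum(i.Gender == "F")),
                      by = .(ID, Arrival)]
```

On this sample, the visit of ID 1 gets one male and one female, which matches the expected row in the question. The first visit of ID 6 does not appear at all, because nobody else had arrived before it; that is why the later join leaves NAs to fill in.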

Join the findings of the previous command with our original table on ID and arrival:

result <- simultaneous[dt, on = .(ID, Arrival)]

EDIT: Convert the NAs in `malesEncountered` and `femalesEncountered` to zero:

result[is.na(malesEncountered), malesEncountered := 0L][
       is.na(femalesEncountered), femalesEncountered := 0L]
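If more count columns are added later, the same replacement can be written as a loop over column names with data.table's set(). This is only a sketch of an alternative, not what the answer above uses; the two-row `result` table and the `countCols` vector are made up for illustration:

```r
library(data.table)

# Hypothetical join result: the second visit had no overlaps, hence the NAs
result <- data.table(
  ID                 = c(1, 6),
  malesEncountered   = c(1L, NA),
  femalesEncountered = c(1L, NA)
)

# Assumed list of count columns to clean up
countCols <- c("malesEncountered", "femalesEncountered")

# Replace NA with 0 by reference, one column at a time
for (col in countCols) {
  set(result, i = which(is.na(result[[col]])), j = col, value = 0L)
}
```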

Set the column order to something nicer:

setcolorder(result, c(1, 2, 5, 6, 3, 4))[]
