Reputation: 21432
I have a dataframe with measurements taken at different intervals:
df <- data.frame(
A_aoi = c("C", "C", "C", "B"),
starttime_ms = c(49, 1981, 6847, 7180),
endtime_ms = c(1981, 6115, 7048, 10080)
)
Sometimes the intervals are completely contiguous, i.e., the starttime_ms for the next measurement is the endtime_ms of the prior measurement. More often, however, there are gaps between the intervals. I need to insert a row into df whenever there is such a gap; the row should state when that gap starts and when it ends. The closest I have come to a solution so far is detecting and measuring the duration of the gap:
library(dplyr)
# Compare each row's end with the NEXT row's start (lead, not lag)
df$gap <- ifelse(lead(df$starttime_ms, 1) == df$endtime_ms,
                 NA,
                 lead(df$starttime_ms, 1) - df$endtime_ms)
However, that's still far from the desired output:
A_aoi starttime_ms endtime_ms
1 C 49 1981
2 C 1981 6115
3 NA 6115 6847
4 C 6847 7048
5 NA 7048 7180
6 B 7180 10080
Upvotes: 0
Views: 161
Reputation: 6509
You could use the data.table package as follows:
library(data.table)
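# setDT() converts df to a data.table by reference; collect all distinct time points
unq <- sort(unique(setDT(df)[, c(starttime_ms, endtime_ms)]))
# Join each pair of consecutive time points against df; unmatched pairs are gaps (A_aoi = NA)
df[.(unq[-length(unq)], unq[-1]), on=c("starttime_ms", "endtime_ms")]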
# A_aoi starttime_ms endtime_ms
# C 49 1981
# C 1981 6115
# <NA> 6115 6847
# C 6847 7048
# <NA> 7048 7180
# B 7180 10080
Upvotes: 1
Reputation: 8880
df <- data.frame(
A_aoi = c("C", "C", "C", "B"),
starttime_ms = c(49, 1981, 6847, 7180),
endtime_ms = c(1981, 6115, 7048, 10080)
)
df
#> A_aoi starttime_ms endtime_ms
#> 1 C 49 1981
#> 2 C 1981 6115
#> 3 C 6847 7048
#> 4 B 7180 10080
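# All distinct time points, sorted; consecutive pairs form the candidate intervals
x <- sort(unique(unlist(df[-1])))
df_int <- data.frame(starttime_ms = x[-length(x)], endtime_ms = x[-1])
library(tidyverse)
# Candidate intervals with no match in df are the gaps and get A_aoi = NA
left_join(df_int, df, by = c("starttime_ms", "endtime_ms")) %>%
  relocate(A_aoi, everything())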
#> A_aoi starttime_ms endtime_ms
#> 1 C 49 1981
#> 2 C 1981 6115
#> 3 <NA> 6115 6847
#> 4 C 6847 7048
#> 5 <NA> 7048 7180
#> 6 B 7180 10080
Created on 2021-03-03 by the reprex package (v1.0.0)
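For comparison, here is a minimal base R sketch that stays closer to the lead()-style attempt in the question: it compares each row's end with the next row's start and builds the gap rows directly. It assumes the intervals do not overlap; the helper names (prev_end, next_start, has_gap, gaps, out) are made up for illustration and are not part of either answer above.

```r
df <- data.frame(
  A_aoi = c("C", "C", "C", "B"),
  starttime_ms = c(49, 1981, 6847, 7180),
  endtime_ms = c(1981, 6115, 7048, 10080)
)

# Make sure rows are in chronological order (assumption: no overlapping intervals)
df <- df[order(df$starttime_ms), ]
n <- nrow(df)

# Pair each row's end with the following row's start
prev_end   <- df$endtime_ms[-n]
next_start <- df$starttime_ms[-1]
has_gap    <- next_start > prev_end

# One new row per gap, with A_aoi left as NA
gaps <- data.frame(
  A_aoi = rep(NA_character_, sum(has_gap)),
  starttime_ms = prev_end[has_gap],
  endtime_ms = next_start[has_gap]
)

# Append the gap rows and restore chronological order
out <- rbind(df, gaps)
out <- out[order(out$starttime_ms), ]
rownames(out) <- NULL
out
```

Unlike the join on consecutive time points, this only ever adds rows where the next start is strictly later than the previous end, so the original rows are carried over unchanged.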
Upvotes: 1