Reputation: 39
Consider a tibble A with 2 columns, the first of which contains time stamps (POSIXct class), and an Interval object b, created with lubridate::int_diff, that holds 9 individual time intervals.
Using dplyr, I would like to add 9 new columns to the tibble A, each indicating whether the time stamp of a row falls within the corresponding interval. Put differently, I would like to apply the function %within% to every row and distribute its vector output of length 9 across the 9 new columns.
What is the most effective way to do this using the dplyr package?
Example:
library(lubridate)
library(dplyr)
A <- tibble(Ts = ymd_hms(c("2018-01-01 15:12:04",
"2018-01-02 00:14:06","2018-01-05 12:00:00")),
P = c(1:3))
ts.start <- ymd_hms("2018-01-01 15:00:00")
ts.end <- ymd_hms("2018-01-02 15:30:00")
ts <- c(ts.start,sort(ts.end -
minutes(cumsum(c(15,15,30,30,60,60,60,60)))),ts.end)
b <- int_diff(ts)
# Applying %within% to the first element works
(A[[1,1]] %within% b) + 0
# The line with error.
mutate(A,New = Ts %within% b )
The last line produces an error, as expected. I would like to know how I can define new variables by applying a function with vector output to a column.
Upvotes: 1
Views: 576
Reputation: 70653
How about iterating through each element of Ts, checking which intervals it falls within, and appending the result to A?
# iterate through each element and output a list of matches for each element which
# corresponds to a row
out <- sapply(A$Ts, FUN = function(x, y) x %within% y, y = b, simplify = FALSE)
# append result to original data
cbind(A, do.call(rbind, out))
                   Ts P     1     2     3     4     5     6     7     8     9
1 2018-01-01 15:12:04 1  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
2 2018-01-02 00:14:06 2  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
3 2018-01-05 12:00:00 3 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
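Since the question asks for dplyr, here is a minimal sketch of the same idea turned around: instead of testing one time stamp against all intervals, it tests the whole Ts column against one interval at a time and binds the results with dplyr::bind_cols. The names flags, A2 and the int_1 ... int_9 column names are made up for illustration and are not from the original post. It relies on %within% recycling a single interval over a vector of time stamps.

```r
library(lubridate)
library(dplyr)

# Rebuild the question's example data so the sketch is self-contained
A <- tibble(Ts = ymd_hms(c("2018-01-01 15:12:04",
                           "2018-01-02 00:14:06",
                           "2018-01-05 12:00:00")),
            P = 1:3)
ts.start <- ymd_hms("2018-01-01 15:00:00")
ts.end <- ymd_hms("2018-01-02 15:30:00")
ts <- c(ts.start,
        sort(ts.end - minutes(cumsum(c(15, 15, 30, 30, 60, 60, 60, 60)))),
        ts.end)
b <- int_diff(ts)

# Test the whole Ts column against one interval at a time; each call
# returns a logical vector with one value per row of A
flags <- lapply(seq_along(b), function(i) A$Ts %within% b[i])
names(flags) <- paste0("int_", seq_along(b))  # hypothetical column names

# Attach the nine logical columns to the original tibble
A2 <- bind_cols(A, as_tibble(flags))
A2
```

Looping over the 9 intervals rather than over the rows keeps each %within% call vectorised over Ts, which should scale better when A has many more rows than b has intervals.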
You could use a zucchini plot (just made that up) to visualize which interval each point belongs to.
library(ggplot2)
xy <- data.frame(id = seq_along(b), start = int_start(b), end = int_end(b))
head(xy)
ggplot(xy) +
theme_bw() +
scale_fill_gradient(low = "#324706", high = "#aeb776") +
geom_rect(aes(xmin = start, xmax = end, ymin = 0, ymax = nrow(A) + 0.5, fill = id),
color = "white") +
geom_hline(yintercept = A$P + 0.5, color = "grey") +
geom_point(data = A, aes(x = Ts, y = P), color = "white", size = 2) +
geom_point(data = A, aes(x = Ts, y = P), color = "black", size = 2, shape = 1)
Upvotes: 2