Reputation: 43289
I have a data set of different events that occurred over different periods of time.
I would like to count the number of days per month each event spanned.
Here is the data frame.
dat = structure(list(event = structure(c(2L, 1L, 2L, 1L, 3L, 1L, 3L,
1L, 2L, 1L, 3L, 1L, 2L, 1L, 1L, 1L, 3L, 1L, 1L, 2L), .Label = c("Event1",
"Event2", "Event3"), class = "factor"), startDateTime = structure(c(1370995200,
1370649600, 1370476800, 1370304000, 1370131200, 1370131200, 1370044800,
1368316800, 1366848000, 1363824000, 1363737600, 1363046400, 1363046400,
1362873600, 1362009600, 1360627200, 1357776000, 1357689600, 1357689600,
1356739200), tzone = "UTC", class = c("POSIXct", "POSIXt")),
endDateTime = structure(c(1371686400, 1371686400, 1370908800,
1370476800, 1370649600, 1370131200, 1370476800, 1368489600,
1366934400, 1364083200, 1366502400, 1363219200, 1365897600,
1363219200, 1362182400, 1363132800, 1360454400, 1357776000,
1357862400, 1356998400), tzone = "UTC", class = c("POSIXct",
"POSIXt"))), .Names = c("event", "startDateTime", "endDateTime"
), row.names = c(NA, -20L), class = "data.frame")
I figured out from searching that I could use the zoo package to count the number of days in each month an event spanned, like so:
library(zoo)
table(as.yearmon(seq(dat$startDateTime[20], dat$endDateTime[20], "day")))
Dec 2012 Jan 2013
       3        1
I would like to extend and generalise this so that I can apply it to the entire data frame and count the number of days per month that each event spanned. Is this something that could be achieved using lubridate?
Any pointers on this would be much appreciated.
Upvotes: 0
Views: 73
Reputation: 887621
You can try
library(data.table)
library(lubridate)
library(zoo)
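# For each event, build one daily sequence from its earliest start to its
# latest end, convert each day to a year-month, then count rows per
# event and month. Note: this also counts days that fall between separate
# occurrences of the same event.
setDT(dat)[, list(as.yearmon(seq(min(startDateTime), max(endDateTime),
    by = 'day'))), event][, .N, list(event, V1)]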
Upvotes: 1
Reputation: 269905
Try lapply over a row index, using a function whose body is almost your code. It will produce a list with one component per row:
library(zoo)
nr <- nrow(dat)
# One table of days per month for each row of dat
result <- lapply(seq_len(nr), function(i)
  table(as.yearmon(seq(dat$startDateTime[i], dat$endDateTime[i], "day")))
)
or to produce data.frame output:
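nr <- nrow(dat)
L <- lapply(seq_len(nr), function(i) {
  tab <- table(as.yearmon(seq(dat$startDateTime[i], dat$endDateTime[i], "day")))
  # Converting the table yields columns Var1 (month) and Freq (days)
  data.frame(Row = i, tab)
})
# Stack the per-row data frames into one
do.call("rbind", L)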
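If you want the counts labelled by event rather than by row number, the same row-wise idea can be followed by an aggregation step. The following is only a sketch under some assumptions: it uses base R alone, swaps as.yearmon for format(d, "%Y-%m") so that zoo is not needed, and runs on a three-row stand-in (dat2, a hypothetical name) that mirrors rows 18 to 20 of the question's data. The names days and counts are also just illustrative.

```r
# Stand-in for rows 18-20 of the question's data (UTC date-times)
dat2 <- data.frame(
  event = c("Event1", "Event1", "Event2"),
  startDateTime = as.POSIXct(c("2013-01-09", "2013-01-09", "2012-12-29"), tz = "UTC"),
  endDateTime   = as.POSIXct(c("2013-01-10", "2013-01-11", "2013-01-01"), tz = "UTC")
)

# One record per (row, calendar day), labelled with the event and the month
days <- do.call(rbind, lapply(seq_len(nrow(dat2)), function(i) {
  d <- seq(dat2$startDateTime[i], dat2$endDateTime[i], by = "day")
  data.frame(event = dat2$event[i], month = format(d, "%Y-%m"))
}))

# Count the records per event and month
counts <- aggregate(n ~ event + month, data = transform(days, n = 1L), FUN = sum)
counts
```

Note that when two periods of the same event overlap, as the two Event1 rows do here, the shared days are counted once per row. If each calendar day should count only once per event, calling unique(days) before the aggregation would be one way to handle it.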
Upvotes: 1