notacodr

Reputation: 129

Subset time series by groups based on cutoff date data frame

I have a data frame with time series data for several different groups. I want to apply different start and end cutoff dates to each group within the original data frame.

Here's a sample data frame:

date <- seq(as.POSIXct("2014-07-21 17:00:00", tz= "GMT"), as.POSIXct("2014-09-11 24:00:00", tz= "GMT"), by="hour") 
group <- letters[1:4]                           
datereps <- rep(date, length(group))                  
attr(datereps, "tzone") <- "GMT"
sitereps <- rep(group, each = length(date))    
value  <- rnorm(length(datereps))
df <- data.frame(DateTime = datereps, Group = sitereps, Value = value)  # use sitereps so each group gets the full date range

and here's the data frame 'cut' of cutoff dates to use:

start <- c("2014-08-01 00:00:00 GMT", "2014-07-26 00:00:00 GMT", "2014-07-21 17:00:00 GMT", "2014-08-03 24:00:00 GMT")
end <- c("2014-09-11 24:00:00 GMT", "2014-09-01 24:00:00 GMT", "2014-09-07 24:00:00 GMT", "2014-09-11 24:00:00 GMT")
cut <- data.frame(Group = group, Start = as.POSIXct(start, tz = "GMT"), End = as.POSIXct(end, tz = "GMT"))  # parse in GMT, not the local time zone

I can do it manually for each group, dropping the data I don't want on both ends of the time series with negated logical indexing, df[!(...), ]:

df2 <- df[!(df$Group == "a" & (df$DateTime <= as.POSIXct("2014-08-01 00:00:00", tz = "GMT") | df$DateTime >= as.POSIXct("2014-09-11 24:00:00", tz = "GMT"))), ]  # drop group "a" rows outside the window

But how can I automate this across all groups?

Upvotes: 2

Views: 1272

Answers (1)

TARehman

Reputation: 6749

Merge the cutoff dates into the data frame by Group, then subset on the new Start and End columns, as below. df3 contains the removed records and df4 contains the retained ones.

df2 <- merge(x = df,y = cut,by = "Group")
df3 <- df2[df2$DateTime <= df2$Start | df2$DateTime >= df2$End,]
df4 <- df2[!(df2$DateTime <= df2$Start | df2$DateTime >= df2$End),]
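The same idea can also be sketched without merge(), which sorts by the key and appends the Start and End columns. This is a minimal illustration on made-up toy data, not the original data set: the names toy, cuts, idx, keep, kept and removed are hypothetical, and it assumes every group in the data has exactly one row of cutoffs. It uses the same strict comparison as above, so rows exactly at Start or End are removed.

```r
# Toy data: two groups observed hourly over the same six hours (illustrative only).
times <- seq(as.POSIXct("2014-08-01 00:00:00", tz = "GMT"), by = "hour", length.out = 6)
toy <- data.frame(
  DateTime = rep(times, times = 2),
  Group    = rep(c("a", "b"), each = length(times)),
  Value    = seq_len(2 * length(times))
)

# One row of cutoffs per group, parsed in the same time zone as the data.
cuts <- data.frame(
  Group = c("a", "b"),
  Start = as.POSIXct(c("2014-08-01 01:00:00", "2014-08-01 00:00:00"), tz = "GMT"),
  End   = as.POSIXct(c("2014-08-01 04:00:00", "2014-08-01 02:00:00"), tz = "GMT")
)

# For each row of toy, find the matching row of cuts by group.
idx <- match(toy$Group, cuts$Group)

# Keep rows strictly inside their own group's (Start, End) window.
keep <- toy$DateTime > cuts$Start[idx] & toy$DateTime < cuts$End[idx]

kept    <- toy[keep, ]   # retained records, in the original row order
removed <- toy[!keep, ]  # records outside the window (boundaries included)
```

If the boundary observations should be retained instead, switching to >= and <= in the keep line would do that. A group with no row in cuts would give NA in idx, so that case needs handling before subsetting.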

Upvotes: 1
