Reputation: 161
i am new to R and I have a dataframe like this:
EndDate EndId EndTime
1 02/01/2013 251 13:32
2 03/01/2013 251 17:19
3 04/01/2013 251 16:26
4 04/01/2013 368 08:36
5 05/01/2013 386 16:58
6 01/01/2013 329 11:51
How to select rows of the dataframe which lies between 03/01/2013 and 05/01/2013? (or for a certain time period?) Thanks
Upvotes: 2
Views: 9485
Reputation: 35324
First of all, please please don't use ambiguous date formats. All 6 of your date strings are ambiguous; they could be dd/mm/yyyy or mm/dd/yyyy. If you have control over the generation of the input data, please generate it in yyyy-mm-dd (ISO 8601) form. It hurts my eyes to see ambiguous date formats.
Also, R has a builtin Date class, which is perfect for your purposes. Your EndDate column is not of Date type, which is discernible because R always prints Dates in ISO 8601 format. You can convert the column to Date type directly with this:
df$EndDate <- as.Date(df$EndDate, '%m/%d/%Y' );
(I've assumed mm/dd/yyyy format.)
Then you can use vectorized comparison, logical, and indexing operations to extract the subset of the data.frame that matches your criteria:
df[df$EndDate>=as.Date('2013-03-01') & df$EndDate<=as.Date('2013-05-01'),];
Upvotes: 3
Reputation: 59425
My inclination would be to turn this into an xts
(eXtensible Time Series) object and use time-series indexing on that.
library(xts)
df.xts <-xts(df["EndId"],order.by=as.POSIXct(paste(df$EndDate,df$EndTime),format="%m/%d/%Y %H:%M"))
df.xts["2013-03-01::2013-05-01"]
# EndId
# 2013-03-01 17:19:00 251
# 2013-04-01 08:36:00 368
# 2013-04-01 16:26:00 251
# 2013-05-01 16:58:00 386
Upvotes: 5