Reputation: 51
I have a data frame with two columns, time and flow. The time series has a 15-minute interval, and I want to subset it so that the output has a consistent one-hour interval, keeping the flow value recorded at each hourly timestamp in the original data. How do I extract the hourly data?
Input:
structure(list(t = structure(c(1104555600, 1104556500, 1104557400,
1104558300, 1104559200, 1104560100, 1104561000, 1104561900, 1104562800
), class = c("POSIXct", "POSIXt"), tzone = "EST"), flow = c(18,
18, 18, 18.125, 18.125, 18.125, 18.125, 18.125, 18.125)), .Names = c("t", "flow"), row.names = c(NA, 9L), class = "data.frame")
And for output I would want something like
time flow
2005-01-01 00:00:00 18.000
2005-01-01 01:00:00 18.125
2005-01-01 02:00:00 18.125
Upvotes: 1
Views: 2107
Reputation: 21425
You can use cut to get the hour in which each t value falls, and then just take the first element of every cut group. If df is your data frame:
aggregate(df, list(cut(df$t,breaks="hour")), FUN=head, 1)[,-2]
# Group.1 flow
# 2005-01-01 00:00:00 18.000
# 2005-01-01 01:00:00 18.125
# 2005-01-01 02:00:00 18.125
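If you only want readings that fall exactly on the hour, a possible alternative is to filter on the minute and second of each timestamp. This is a minimal sketch, not part of the approach above: it rebuilds the question's data from its epoch values, and the names on_the_hour and hourly are made up for illustration.

```r
# Rebuild the question's data: nine 15-minute readings starting at midnight EST.
df <- data.frame(
  t = as.POSIXct(1104555600 + 900 * 0:8, origin = "1970-01-01", tz = "EST"),
  flow = c(18, 18, 18, 18.125, 18.125, 18.125, 18.125, 18.125, 18.125)
)

# TRUE for rows whose timestamp has zero minutes and zero seconds.
on_the_hour <- format(df$t, "%M:%S") == "00:00"

# Keep only the on-the-hour rows.
hourly <- df[on_the_hour, ]
hourly
```

The two approaches differ when the on-the-hour reading is missing: cut with head still returns the first reading available in that hour, while this filter drops the hour entirely.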
Upvotes: 3
Reputation: 2989
You don't give any example, but from what I understand you simply want to keep every fourth row.
In a data set with
time <- c(10, 11, 12, 13, 14, 15, 16, 17, 18, 19)
flow <- c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
d <- data.frame(time, flow)
1 10 3
2 11 4
3 12 5
4 13 6
5 14 7
6 15 8
7 16 9
8 17 10
9 18 11
10 19 12
with
> d[seq(1, NROW(d), by = 4),]
you only keep every fourth row.
time flow
1 10 3
5 14 7
9 18 11
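Applied to the data from the question, the same indexing might look like the sketch below. It assumes the series starts exactly on the hour and has no missing or duplicated 15-minute readings; if either assumption fails, every fourth row will no longer line up with the hourly timestamps. The name hourly is made up for illustration.

```r
# Rebuild the question's data: nine 15-minute readings starting at midnight EST.
df <- data.frame(
  t = as.POSIXct(1104555600 + 900 * 0:8, origin = "1970-01-01", tz = "EST"),
  flow = c(18, 18, 18, 18.125, 18.125, 18.125, 18.125, 18.125, 18.125)
)

# Rows 1, 5, 9, ... are one hour apart when the spacing is a regular 15 minutes.
hourly <- df[seq(1, nrow(df), by = 4), ]
hourly
```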
Upvotes: 0