Reputation: 4582
I have a dataset with lots of observations that has 3 different fields:
Consumption, Date, Hour
100, 17/12/2010, 00
120, 17/12/2010, 01
105, 17/12/2010, 02
110, 17/12/2010, 03
...
140, 17/12/2010, 23
130, 18/12/2010, 00
145, 18/12/2010, 01
...
What I want to achieve is create a new dataset with these 3 fields:
Consumption, Date, Period
103.5, 17/12/2010, Night
104.9, 17/12/2010, Morning
104.9, 17/12/2010, Afternoon
so that all the observation of each day between hours 00-08 belong to Night, 08-16 to Morning and 16-23 to Afternoon. The field Consumption of my new dataset will store the mean of the values of observations between the corresponding hours. So finally, I want to have only 3 observations per day instead of 24 that I got now. I think I can achieve this using the aggregate() function but I don't know to use it correctly to achieve it. Am I looking at the right direction? How can this be done?
Upvotes: 0
Views: 98
Reputation: 206177
First, it would be helpful to have more complete sample data
set.seed(15)
dd<-data.frame(
Consumption=rpois(10*12,100),
Date=rep(seq(as.Date("2012-01-01"), as.Date("2012-01-10"), by="1 day"), each=12),
Hour=rep(seq(1,24, by=2), 10)
)
the first step would be to create a grouping vairable for the hours. You can do that easily with cut()
dd$Part <- cut(dd$Hour,
breaks=c(0,8, 16, 23),
labels=c("Night","Morning","Afternoon"),
include.lowest=T)
Then you could use aggregate()
aggregate(Consumption~Part+Date,dd, mean)
which returns
part Date Consumption
1 Night 2012-01-01 94.75
2 Morning 2012-01-01 103.00
3 Afternoon 2012-01-01 101.75
4 Night 2012-01-02 105.50
5 Morning 2012-01-02 95.00
...
Upvotes: 3