Reputation: 148
i have a dataframe which has 10 columns. First column is belongs to date in POSIXIt format and it has also date and time. Last nine columns belong to the variables. Data can be seen at the below.
str(combined)
'data.frame': 60118 obs. of 10 variables:
$ date: POSIXlt, format: "2014-02-19 23:50:00" "2014-02-20 00:00:00" "2014-02-20 00:10:00" "2014-02-20 00:20:00" ...
$ t1 : num 3895 3742 3900 3893 3900 ...
$ t2 : num 3900 3829 3900 3900 3900 ...
$ t3 : num 3900 3823 3900 3900 3900 ...
$ t4 : num 3900 3838 3900 3900 3900 ...
$ t5 : num 3888 3665 3899 3875 3900 ...
$ t6 : num 3894 3732 3900 3892 3900 ...
$ t7 : num 3900 3833 3900 3900 3900 ...
$ t8 : num 3865 3600 3898 3857 3900 ...
$ t9 : num 3785 3400 3876 3769 3888 ...
head(dput(combined))
date t1 t2 t3 t4 t5 t6 t7 t8 t9
1 2014-02-19 23:50:00 3895.3 3900.0 3900.0 3900.0 3887.7 3893.7 3900.0 3864.9 3785.4
2 2014-02-20 00:00:00 3742.4 3829.3 3823.0 3838.5 3664.6 3731.7 3833.4 3600.5 3400.1
3 2014-02-20 00:10:00 3900.0 3900.0 3900.0 3900.0 3899.4 3900.0 3900.0 3897.6 3876.3
4 2014-02-20 00:20:00 3892.7 3900.0 3899.6 3900.0 3875.0 3892.5 3900.0 3857.3 3769.3
5 2014-02-20 00:30:00 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3899.8 3887.7
6 2014-02-20 00:40:00 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0
Period for the time is 10 minutes as you can see above. I just want to take average of 10 minutes data for obtaining hourly mean and i have used the command at the below.
hourly = aggregate(cbind(t1,t2,t3,t4,t5,t6,t7,t8,t9)~hour(date)+day(date), FUN=mean, na.rm=TRUE, data=combined)
But the resulted averages are strangely not true. You can see hourly data frame at the below.
hour(date) day(date) t1 t2 t3 t4 t5 t6 t7 t8 t9
1 0 1 1268.661 1366.227 1332.414 1430.629 1312.232 1321.931 1365.395 1310.339 1268.267
2 1 1 1362.630 1477.823 1425.480 1540.456 1411.321 1434.990 1491.545 1447.811 1391.001
3 2 1 1267.275 1383.586 1349.843 1440.108 1321.554 1353.461 1416.698 1351.580 1294.068
4 3 1 1107.732 1231.535 1226.436 1294.794 1172.119 1253.421 1349.420 1245.668 1191.140
5 4 1 1069.490 1207.657 1206.873 1271.574 1163.368 1248.183 1330.854 1231.273 1182.910
6 5 1 1229.902 1340.989 1345.615 1400.487 1300.392 1382.335 1497.015 1324.315 1347.073
Besides, it is not identified which date these data belong. I just want to get time and date columns together as a result.
Any help will be appreciated!
Thanks in advance.
Upvotes: 0
Views: 200
Reputation: 148
I have my need with this command.
hourly= aggregate(list(turbine=combined[2:10]),by=list(date=cut(as.POSIXct(combined$date),"hour")),mean)
Thanks a lot.
Upvotes: 0
Reputation: 3045
Here's a quick solution with dplyr
and lubridate::floor_date()
library(dplyr)
library(lubridate)
r1 %>%
group_by(floor_date(date, "hour")) %>%
summarize_if(is.double, mean)
# A tibble: 2 x 10
`floor_date(as.POSIXct(date), "hour")` t1 t2 t3 t4 t5 t6 t7 t8 t9
<dttm> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2014-02-19 23:00:00 3895.30 3900.00 3900.00 3900.0 3887.7 3893.70 3900.00 3864.90 3785.40
2 2014-02-20 00:00:00 3867.02 3885.86 3884.52 3887.7 3847.8 3864.84 3886.68 3831.04 3766.68
Upvotes: 1
Reputation: 1932
You might try data.table
.
require("data.table")
r1<-read.table(text=" date time t1 t2 t3 t4 t5 t6 t7 t8 t9
1 2014-02-19 23:50:00 3895.3 3900.0 3900.0 3900.0 3887.7 3893.7 3900.0 3864.9 3785.4
2 2014-02-20 00:00:00 3742.4 3829.3 3823.0 3838.5 3664.6 3731.7 3833.4 3600.5 3400.1
3 2014-02-20 00:10:00 3900.0 3900.0 3900.0 3900.0 3899.4 3900.0 3900.0 3897.6 3876.3
4 2014-02-20 00:20:00 3892.7 3900.0 3899.6 3900.0 3875.0 3892.5 3900.0 3857.3 3769.3
5 2014-02-20 00:30:00 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3899.8 3887.7
6 2014-02-20 00:40:00 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0",header=T,row.names=1)
r1[,1]<-as.POSIXct(strptime(r1[,1], "%Y-%m-%d"))
r1[,2]<-hour(as.POSIXct(strptime(r1[,2], "%H:%M:%S")))
setDT(r1)
r1[, lapply(.SD, mean), by=.(date,time)]
date time t1 t2 t3 t4 t5 t6
1: 2014-02-19 23 3895.30 3900.00 3900.00 3900.0 3887.7 3893.70
2: 2014-02-20 0 3867.02 3885.86 3884.52 3887.7 3847.8 3864.84
t7 t8 t9
1: 3900.00 3864.90 3785.40
2: 3886.68 3831.04 3766.68
Upvotes: 1