Cem
Cem

Reputation: 148

Obtaining hourly means of time series with 10 minutes period

i have a dataframe which has 10 columns. First column is belongs to date in POSIXIt format and it has also date and time. Last nine columns belong to the variables. Data can be seen at the below.

str(combined) 
       'data.frame':    60118 obs. of  10 variables:
     $ date: POSIXlt, format: "2014-02-19 23:50:00" "2014-02-20 00:00:00" "2014-02-20 00:10:00" "2014-02-20 00:20:00" ...
     $ t1  : num  3895 3742 3900 3893 3900 ...
     $ t2  : num  3900 3829 3900 3900 3900 ...
     $ t3  : num  3900 3823 3900 3900 3900 ...
     $ t4  : num  3900 3838 3900 3900 3900 ...
     $ t5  : num  3888 3665 3899 3875 3900 ...
     $ t6  : num  3894 3732 3900 3892 3900 ...
     $ t7  : num  3900 3833 3900 3900 3900 ...
     $ t8  : num  3865 3600 3898 3857 3900 ...
     $ t9  : num  3785 3400 3876 3769 3888 ...

head(dput(combined))
                     date     t1     t2     t3     t4     t5     t6     t7     t8     t9
    1 2014-02-19 23:50:00 3895.3 3900.0 3900.0 3900.0 3887.7 3893.7 3900.0 3864.9 3785.4
    2 2014-02-20 00:00:00 3742.4 3829.3 3823.0 3838.5 3664.6 3731.7 3833.4 3600.5 3400.1
    3 2014-02-20 00:10:00 3900.0 3900.0 3900.0 3900.0 3899.4 3900.0 3900.0 3897.6 3876.3
    4 2014-02-20 00:20:00 3892.7 3900.0 3899.6 3900.0 3875.0 3892.5 3900.0 3857.3 3769.3
    5 2014-02-20 00:30:00 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3899.8 3887.7
    6 2014-02-20 00:40:00 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0

Period for the time is 10 minutes as you can see above. I just want to take average of 10 minutes data for obtaining hourly mean and i have used the command at the below.

hourly = aggregate(cbind(t1,t2,t3,t4,t5,t6,t7,t8,t9)~hour(date)+day(date), FUN=mean, na.rm=TRUE, data=combined)

But the resulted averages are strangely not true. You can see hourly data frame at the below.

      hour(date) day(date)       t1       t2       t3       t4       t5       t6       t7       t8       t9
    1          0         1 1268.661 1366.227 1332.414 1430.629 1312.232 1321.931 1365.395 1310.339 1268.267
    2          1         1 1362.630 1477.823 1425.480 1540.456 1411.321 1434.990 1491.545 1447.811 1391.001
    3          2         1 1267.275 1383.586 1349.843 1440.108 1321.554 1353.461 1416.698 1351.580 1294.068
    4          3         1 1107.732 1231.535 1226.436 1294.794 1172.119 1253.421 1349.420 1245.668 1191.140
    5          4         1 1069.490 1207.657 1206.873 1271.574 1163.368 1248.183 1330.854 1231.273 1182.910
    6          5         1 1229.902 1340.989 1345.615 1400.487 1300.392 1382.335 1497.015 1324.315 1347.073

Besides, it is not identified which date these data belong. I just want to get time and date columns together as a result.

Any help will be appreciated!

Thanks in advance.

Upvotes: 0

Views: 200

Answers (3)

Cem
Cem

Reputation: 148

I have my need with this command.

hourly= aggregate(list(turbine=combined[2:10]),by=list(date=cut(as.POSIXct(combined$date),"hour")),mean)

Thanks a lot.

Upvotes: 0

dmi3kno
dmi3kno

Reputation: 3045

Here's a quick solution with dplyr and lubridate::floor_date()

library(dplyr)
library(lubridate)

r1 %>% 
group_by(floor_date(date, "hour")) %>% 
summarize_if(is.double, mean)

# A tibble: 2 x 10
  `floor_date(as.POSIXct(date), "hour")`      t1      t2      t3     t4     t5      t6      t7      t8      t9
                                  <dttm>   <dbl>   <dbl>   <dbl>  <dbl>  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1                    2014-02-19 23:00:00 3895.30 3900.00 3900.00 3900.0 3887.7 3893.70 3900.00 3864.90 3785.40
2                    2014-02-20 00:00:00 3867.02 3885.86 3884.52 3887.7 3847.8 3864.84 3886.68 3831.04 3766.68

Upvotes: 1

CCurtis
CCurtis

Reputation: 1932

You might try data.table.

require("data.table")
r1<-read.table(text="                     date     time t1     t2     t3     t4     t5     t6     t7     t8     t9
    1 2014-02-19 23:50:00 3895.3 3900.0 3900.0 3900.0 3887.7 3893.7 3900.0 3864.9 3785.4
           2 2014-02-20 00:00:00 3742.4 3829.3 3823.0 3838.5 3664.6 3731.7 3833.4 3600.5 3400.1
           3 2014-02-20 00:10:00 3900.0 3900.0 3900.0 3900.0 3899.4 3900.0 3900.0 3897.6 3876.3
           4 2014-02-20 00:20:00 3892.7 3900.0 3899.6 3900.0 3875.0 3892.5 3900.0 3857.3 3769.3
           5 2014-02-20 00:30:00 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3899.8 3887.7
           6 2014-02-20 00:40:00 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0 3900.0",header=T,row.names=1)
r1[,1]<-as.POSIXct(strptime(r1[,1], "%Y-%m-%d"))
r1[,2]<-hour(as.POSIXct(strptime(r1[,2], "%H:%M:%S")))
setDT(r1)
r1[, lapply(.SD, mean), by=.(date,time)]

         date time      t1      t2      t3     t4     t5      t6
1: 2014-02-19   23 3895.30 3900.00 3900.00 3900.0 3887.7 3893.70
2: 2014-02-20    0 3867.02 3885.86 3884.52 3887.7 3847.8 3864.84
        t7      t8      t9
1: 3900.00 3864.90 3785.40
2: 3886.68 3831.04 3766.68

Upvotes: 1

Related Questions