lubridate - group value by hour and calculate mean

I have got the following data.frame:

df = read.csv(text = 'date,      no,      no2,      nox,
              2015-10-16 00:00:00, 1.10979, 14.50249, 16.20413,
              2015-10-16 01:00:00, 1.73032, 13.60122, 16.25434,
              2015-10-17 00:00:00, 1.30592, 11.20056, 13.20294,
              2015-10-17 01:00:00, 2.05711, 11.34973, 14.50392,
              2015-10-18 00:00:00, 4.14603, 16.79844, 23.15559,
              2015-10-18 01:00:00, 7.73731, 24.74488, 36.60860')
df = df[,-c(5)]

I need to calculate the mean for each hour of the three days for all the variables.

I tried this but it doesn't work:

data_0 = df[hours(df$date) %in% 0,]
data_1 = df[hours(df$date) %in% 1,]

.....

Any suggestion?

The output should be a data frame that gives, for each variable, the mean for each hour of the day across the three days.

> class(df$date)
[1] "POSIXlt" "POSIXt"

Upvotes: 1

Answers (3)

Hakki

Reputation: 1472

Here is a tidyverse example that should work. It keeps repetition to a minimum.

library(lubridate)
library(tidyverse)

df = read.csv(text = 'date,      no,      no2,      nox,
              2015-10-16 00:00:00, 1.10979, 14.50249, 16.20413,
              2015-10-16 01:00:00, 1.73032, 13.60122, 16.25434,
              2015-10-17 00:00:00, 1.30592, 11.20056, 13.20294,
              2015-10-17 01:00:00, 2.05711, 11.34973, 14.50392,
              2015-10-18 00:00:00, 4.14603, 16.79844, 23.15559,
              2015-10-18 01:00:00, 7.73731, 24.74488, 36.60860')
df = df[,-c(5)]

df %>% 
  mutate(date = ymd_hms(date),
         hour = hour(date)) %>% 
  group_by(hour) %>% 
  summarise(mean_no = mean(no),
            mean_no2 = mean(no2),
            mean_nox = mean(nox))
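
If the number of variables grows, naming each mean by hand gets tedious. A minimal sketch of the same grouping using across(), assuming dplyr 1.0.0 or later is installed; the name hourly_means is only illustrative, and the sample data is rebuilt without the trailing commas so the block runs on its own:

```r
library(dplyr)
library(lubridate)

# Sample data from the question, rebuilt here so the block is self-contained
df <- read.csv(text = "date,no,no2,nox
2015-10-16 00:00:00,1.10979,14.50249,16.20413
2015-10-16 01:00:00,1.73032,13.60122,16.25434
2015-10-17 00:00:00,1.30592,11.20056,13.20294
2015-10-17 01:00:00,2.05711,11.34973,14.50392
2015-10-18 00:00:00,4.14603,16.79844,23.15559
2015-10-18 01:00:00,7.73731,24.74488,36.60860")

hourly_means <- df %>%
  mutate(hour = hour(ymd_hms(date))) %>%    # extract the hour of day (0-23)
  group_by(hour) %>%
  summarise(across(c(no, no2, nox), mean))  # one mean per variable and hour

hourly_means
```

The result has one row per hour and one column per variable, so adding a variable only means adding its name inside c().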

Upvotes: 1

user7109363

Reputation:

#1 create column with hour
df$hour <- as.POSIXlt(df$date)$hour

#2 calculate no (col 2) mean for each group of hours
data_no = aggregate(df$no, by=list(hour=df$hour), FUN=mean)

#3 rename cols
colnames(data_no) = c('hour', 'mean')

Repeat steps 2 and 3 for each variable of interest.
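
The repetition can probably be avoided with the formula interface of aggregate(), which accepts several response columns at once through cbind(). A sketch in base R only, assuming the date column is stored as text in the format shown in the question; hourly_means is an illustrative name:

```r
# Sample data from the question, rebuilt here so the block is self-contained
df <- read.csv(text = "date,no,no2,nox
2015-10-16 00:00:00,1.10979,14.50249,16.20413
2015-10-16 01:00:00,1.73032,13.60122,16.25434
2015-10-17 00:00:00,1.30592,11.20056,13.20294
2015-10-17 01:00:00,2.05711,11.34973,14.50392
2015-10-18 00:00:00,4.14603,16.79844,23.15559
2015-10-18 01:00:00,7.73731,24.74488,36.60860")

# Step 1: hour of day, parsed with an explicit format and time zone
df$hour <- as.POSIXlt(df$date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")$hour

# Steps 2 and 3 for all three variables in one call:
# the left-hand side lists the columns to average, the right-hand side the grouping
hourly_means <- aggregate(cbind(no, no2, nox) ~ hour, data = df, FUN = mean)

hourly_means
```

The output columns keep the original variable names, so no renaming step is needed.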

Upvotes: 0

Henk

Reputation: 3656

As your dataset is not provided in a reproducible format, I am using a dataset from library(openair).

library(data.table)

data(mydata, package = "openair")

melt(setDT(mydata), id.var = "date")[, .(
  avg = mean(value, na.rm = TRUE)
), by = .(hour(date), variable)]
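
For comparison, the same melt-then-group approach can be sketched on the sample data from the question instead of the openair dataset. This assumes the date column is read as text and converted explicitly; dt and long_means are illustrative names:

```r
library(data.table)

# Sample data from the question, rebuilt here so the block is self-contained
dt <- as.data.table(read.csv(text = "date,no,no2,nox
2015-10-16 00:00:00,1.10979,14.50249,16.20413
2015-10-16 01:00:00,1.73032,13.60122,16.25434
2015-10-17 00:00:00,1.30592,11.20056,13.20294
2015-10-17 01:00:00,2.05711,11.34973,14.50392
2015-10-18 00:00:00,4.14603,16.79844,23.15559
2015-10-18 01:00:00,7.73731,24.74488,36.60860"))

# Convert the text timestamps to POSIXct with an explicit format and time zone
dt[, date := as.POSIXct(date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")]

# Reshape to long format, then average per hour of day and variable
long_means <- melt(dt, id.vars = "date")[
  , .(avg = mean(value, na.rm = TRUE)),
  by = .(hour = hour(date), variable)
]

long_means
```

The result is in long format, with one row per combination of hour and variable; dcast() could reshape it to one column per variable if a wide table is preferred.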

Upvotes: 0
