Reputation: 1
I have real-time data on NO2 that was collected with an NO2 sensor. The data has 48-hour NO2 logged per second. NO2 was measured for 48 hours in different homes. There are over 50,000 rows and three columns. Here a snapshot.
Homes Time ppb
Home 1 12:00 AM 6
Home 1 12:01 AM 6
Home 1 12:02 AM 7
Home 1 12:03 AM 6
Home 1 12:04 AM 7
Home 1 12:05 AM 9
Home 1 12:06 AM 8
Home 2 12:00 AM 2
Home 2 12:01 AM 3
Home 2 12:02 AM 4
Home 2 12:03 AM 1
Home 2 12:04 AM 6
Home 2 12:05 AM 4
Home 2 12:06 AM 8
I want to create a new dataset which will have hourly (12 am, 1 am, etc) averages of NO2 instead of per second. I want to make 1 am to be the average of 12:01 am - 1:00 am and continue with this trend.
I tried a number of methods, but I'm yet to figure how to do this. The group_by and summarize functions gave me averages of time (e.g. 1 am, 2 am) in the entire dataset, but I want hourly averages according to the home numbers.
I tried using a loop which didn't work:
Avg = data.frame(Sensor_analysis)
head(Avg)
Hourly = rep(0, 48)
for (i in 1:48) {
Hourly[i] = mean(Avg$ppb[60*(i-1)+1:60*i])
}
Hourly
Upvotes: 0
Views: 903
Reputation: 10203
Your question is really far, far away from the ideal of a fully reproducible example but try this on for size:
library(dplyr)
library(lubridate)
Sensor_analysis %>%
mutate(hour = ceiling_date(Time, 'hour')) %>%
group_by(Homes, hour) %>%
summarize(hourly = mean(ppb)) -> Hourly
Depending on how your Time
variable is stored you may first have to convert it either to POSIXct
or to hms
for this to work
Upvotes: 1