Hala

Reputation: 41

How to calculate a moving average based on date and time in R

I uploaded my data. https://filebin.net/a29fn87b8wpfnos0/Plume_2.csv?t=iouc5vg7

It looks like this in CSV format: [image]

I looked for an answer that suits my data but couldn't find one, and I have spent about a month trying to solve it myself.

First I need moving averages over several windows (30 minutes, 1 hour, 1 day, and 1 week) for each of PM2.5, PM10, and NO2.

However, I can't do that manually using this type of code:

Plume_2$PM2.5_30min_ <- TTR::SMA(Plume_2$pm2.5, n = 31)
Plume_2$PM2.5_1hour_ <- TTR::SMA(Plume_2$pm2.5, n = 61)
Plume_2$PM2.5_1day_ <- TTR::SMA(Plume_2$pm2.5, n = 1441)
Plume_2$PM2.5_1week_ <- TTR::SMA(Plume_2$pm2.5, n = 10080)

With this code, the n values don't match the timestamps I have.

I also tried this code, but the averages don't look right.

library(runner)
dates = Plume_2$timestamp
value = Plume_2$PM2.5_Plume2

Plume_2$MA <- mean_run(x = value, k = 7, lag = 1, idx = as.Date(dates))

The final output should be a plot showing these different moving averages.

Can anyone help me, please?

Upvotes: 0

Views: 902

Answers (2)

Ronak Shah

Reputation: 388797

Using zoo's rollmeanr function along with across from dplyr can help you with this.

library(dplyr)
library(zoo)

df <- read.csv('https://filebin.net/a29fn87b8wpfnos0/Plume_2.csv?t=up70ngy3')


df %>%
  mutate(across(PM2.5_Plume2:NO2_Plume2, 
               list(avg_30min = ~rollmeanr(.x, 30, fill = NA), 
                    avg_hour =  ~rollmeanr(.x, 60, fill = NA), 
                    avg_day =  ~rollmeanr(.x, 1440, fill = NA), 
                    avg_week =  ~rollmeanr(.x, 10080, fill = NA)))) -> result

result
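
The fixed widths above (30, 60, 1440, 10080) only correspond to 30 minutes, 1 hour, 1 day, and 1 week if the data has exactly one row per minute. If the timestamps have gaps, one possible alternative is to define the window in seconds instead of rows. The following is only a minimal base R sketch of that idea: time_window_mean is a hypothetical helper (not part of zoo or dplyr), the data is made up, and unlike rollmeanr with fill = NA it returns partial-window averages for the first rows.

```r
# Hypothetical helper: trailing time-window mean.
# For each row i it averages all non-NA values whose timestamp lies in
# (times[i] - width_secs, times[i]]. Assumes `times` is sorted ascending.
time_window_mean <- function(times, values, width_secs) {
  t <- as.numeric(times)                      # seconds since the epoch
  stopifnot(!is.unsorted(t))                  # rows must be in time order
  i <- seq_along(t)
  # findInterval() counts timestamps <= t - width, so the window starts one row later
  start <- findInterval(t - width_secs, t) + 1L
  ok <- !is.na(values)
  cs <- c(0, cumsum(ifelse(ok, values, 0)))   # running sum of non-NA values
  cn <- c(0, cumsum(ok))                      # running count of non-NA values
  (cs[i + 1L] - cs[start]) / (cn[i + 1L] - cn[start])
}

# Made-up example: readings at minutes 0, 1, 2, 5, 6 (minutes 3 and 4 are missing)
times  <- as.POSIXct("2021-01-01 00:00", tz = "UTC") + 60 * c(0, 1, 2, 5, 6)
values <- c(1, 2, 3, 4, 5)

time_window_mean(times, values, width_secs = 180)  # 3-minute window
# 1.0 1.5 2.0 4.0 4.5
```

With a row-based window of 3, the value at minute 5 would average the readings from minutes 1, 2, and 5; the time-based version only uses the reading at minute 5, because nothing else falls within the preceding 3 minutes.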

Upvotes: 1

Peace Wang

Reputation: 2419

I hope the following is a satisfying solution.

library(data.table)
dt <- fread("https://filebin.net/a29fn87b8wpfnos0/Plume_2.csv?t=phgmlykh")
# Store the result as dt2 so it can be plotted later
dt2 <- dt[, .(timestamp,
              PM2.5_30min_mean = frollmean(PM2.5_Plume2, 31),
              PM2.5_1hour_mean = frollmean(PM2.5_Plume2, 61),
              PM2.5_1day_mean = frollmean(PM2.5_Plume2, 1441),
              PM2.5_1week_mean = frollmean(PM2.5_Plume2, 10080))]
dt2

The result looks like this: [image]

Then I want to plot the result using ggplot. Here I choose PM2.5_1hour_mean as an example.

library(ggplot2)
library(lubridate) # dmy_hm() turns the timestamp into POSIXct format
ggplot(dt2, aes(dmy_hm(timestamp), PM2.5_1hour_mean)) +
  geom_line(na.rm = TRUE) + # na.rm belongs in the geom, not in aes()
  scale_x_datetime()
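
Since the goal is a single plot with several moving averages, one possible approach is to reshape the result to long format and map the window to colour. This is only a sketch on made-up data: demo, pm25, mean_30min, mean_1hour, and avg_window are illustrative names, not columns from the uploaded file.

```r
library(data.table)
library(ggplot2)

# Made-up stand-in for the real data: one reading per minute for 5 hours
set.seed(1)
demo <- data.table(
  timestamp = as.POSIXct("2021-01-01 00:00", tz = "UTC") + 60 * (0:299),
  pm25      = cumsum(rnorm(300))
)

# Two rolling means, computed one at a time with frollmean()
demo[, mean_30min := frollmean(pm25, 30)]
demo[, mean_1hour := frollmean(pm25, 60)]

# Long format: one row per (timestamp, window) pair,
# so ggplot can draw one coloured line per window
long <- melt(demo,
             id.vars       = "timestamp",
             measure.vars  = c("mean_30min", "mean_1hour"),
             variable.name = "avg_window",
             value.name    = "value")

p <- ggplot(long, aes(timestamp, value, colour = avg_window)) +
  geom_line(na.rm = TRUE) +   # drop the leading NAs of each rolling mean quietly
  scale_x_datetime()
print(p)
```

The same pattern should carry over to dt2 by listing its four mean columns in measure.vars, after converting timestamp to POSIXct.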

Upvotes: 1
