Reputation: 107
My dataframe has data in the following format from 1983-2008:
Year, temp,
1983, .109,
1984, .091,
1985, -.10,
1986, .051,
1987, -.071,
1988, .101,
1989, .003,
1990, -.051,
1991, -.110,
1992, .134,
1993, .091,
1994, .122,
1995, .101,
1996, .087,
1997, .075,
Is there a way to plot this data in a scatter plot so that each point is the average value over a 5-year window? For example, the first point's x value would be 1983-1987 and its y value would be the average temp over those years. I have looked into the aggregate function with mean as the third argument, but it does not seem to support grouping by a date range.
Upvotes: 0
Views: 87
Reputation: 19088
Use a rolling mean function. This is a base R solution, where vec is the input vector, len is the window size, and prtl = TRUE returns partial means for the first len - 1 positions instead of NA:
rollmean <- function(vec, len, prtl = FALSE) {
  # The window cannot be longer than the data.
  if (len > length(vec)) {
    stop(paste("Choose lower range,", len, ">", length(vec)))
  }
  if (isTRUE(prtl)) {
    # Partial mode: average whatever is available until the window is full.
    sapply(seq_along(vec), function(i) {
      if (i <= len) {
        mean(vec[1:i])
      } else {
        mean(vec[(i - (len - 1)):i])
      }
    })
  } else {
    # Strict mode: NA until a full window of len values is available.
    sapply(seq_along(vec), function(i) {
      if (i - (len - 1) > 0) {
        mean(vec[(i - (len - 1)):i])
      } else {
        NA
      }
    })
  }
}
To get the data, use it like this:
dat
Year temp
1 1983 0.109
2 1984 0.091
3 1985 -0.100
...
mydat <- setNames(
  data.frame(
    paste(rollmean(dat$Year, 5) - 2, rollmean(dat$Year, 5) + 2, sep = "-"),
    rollmean(dat$temp, 5)
  ),
  colnames(dat)
)
mydat
Year temp
1 NA-NA NA
2 NA-NA NA
3 NA-NA NA
4 NA-NA NA
5 1983-1987 0.0160
6 1984-1988 0.0144
7 1985-1989 -0.0032
8 1986-1990 0.0066
9 1987-1991 -0.0256
10 1988-1992 0.0154
11 1989-1993 0.0134
12 1990-1994 0.0372
13 1991-1995 0.0676
14 1992-1996 0.1070
15 1993-1997 0.0952
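As a cross-check on the temp column above, base R's stats::filter with sides = 1 should give the same trailing 5-year means without a custom function. This is only a sketch: dat is rebuilt here from the sample in the question, and the name trailing is made up for the example.

```r
# Sample data rebuilt from the question (1983-1997 only).
dat <- data.frame(
  Year = 1983:1997,
  temp = c(.109, .091, -.100, .051, -.071, .101, .003, -.051,
           -.110, .134, .091, .122, .101, .087, .075)
)

# sides = 1 averages the current value and the 4 values before it,
# so the first 4 positions are NA, like rollmean(dat$temp, 5).
trailing <- as.numeric(stats::filter(dat$temp, rep(1 / 5, 5), sides = 1))

round(trailing, 4)
```

stats::filter returns a ts object, hence the as.numeric() before comparing or plotting.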
Plot the data, e.g. as a bar plot (use geom_point(aes(Year, temp)) for a scatter plot):
library(ggplot2)
ggplot(mydat) +
  geom_bar(aes(Year, temp, fill = Year), stat = "identity") +
  theme(axis.text = element_text(size = 6))
Omit the NAs simply by using mydat[!is.na(mydat[,2]),]
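The question can also be read as asking for non-overlapping 5-year blocks (1983-1987, 1988-1992, ...) rather than a rolling window. A minimal base R sketch of that reading follows, assuming one row per year. The names width, block, binned and label are made up for the example, and dat is rebuilt from the question's sample.

```r
# Sample data rebuilt from the question (1983-1997 only).
dat <- data.frame(
  Year = 1983:1997,
  temp = c(.109, .091, -.100, .051, -.071, .101, .003, -.051,
           -.110, .134, .091, .122, .101, .087, .075)
)

width <- 5
# Block index: 0 for the first 5 years, 1 for the next 5, and so on.
dat$block <- (dat$Year - min(dat$Year)) %/% width

# Mean temp per block; this is the grouping aggregate() needs.
binned <- aggregate(temp ~ block, data = dat, FUN = mean)

# Label each block with the years it actually contains, so a shorter
# final block is not labelled as a full 5-year range.
first <- aggregate(Year ~ block, data = dat, FUN = min)$Year
last  <- aggregate(Year ~ block, data = dat, FUN = max)$Year
binned$label <- paste(first, last, sep = "-")

# Scatter plot with the year ranges as x-axis labels.
plot(seq_along(binned$temp), binned$temp, xaxt = "n",
     xlab = "Years", ylab = "Mean temp")
axis(1, at = seq_along(binned$label), labels = binned$label)
```

The integer division turns the date range into an ordinary grouping variable, which is what aggregate expects.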
Upvotes: 1
Reputation: 388982
We can use zoo's rollmean function to calculate the rolling average and ggplot2 to plot it.
library(dplyr)
library(ggplot2)

df %>%
  mutate(temp_rolling_avg = zoo::rollmean(temp, 5, align = 'left', fill = NA),
         Year_label = paste(Year, lead(Year, 4), sep = '-')) %>%
  filter(!is.na(temp_rolling_avg)) %>%
  ggplot(aes(Year_label, temp_rolling_avg)) +
  geom_col()
Upvotes: 2
Reputation: 19544
You can use embed and rowMeans:
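x = 1985:1995                    # centre year of each 5-year window in the 1983-1997 sample
y = rowMeans(embed(df$temp, 5))  # each row of embed() is one 5-year window
plot(x, y)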
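To see why this works, it may help to look at what embed returns. The sketch below rebuilds the sample from the question as plain vectors and derives the x values and range labels from the years instead of hard-coding them. The names years, windows, yr and label are made up for the example.

```r
# Sample data rebuilt from the question (1983-1997 only).
years <- 1983:1997
temp  <- c(.109, .091, -.100, .051, -.071, .101, .003, -.051,
           -.110, .134, .091, .122, .101, .087, .075)

# embed(temp, 5) has one row per 5-value window; within a row the
# columns run from the newest value (column 1) to the oldest (column 5).
windows <- embed(temp, 5)
dim(windows)

# Averaging each row gives the 5-year rolling mean.
y <- rowMeans(windows)

# Apply the same layout to the years to get centre years and labels.
yr    <- embed(years, 5)
x     <- yr[, 3]                            # middle column = centre year
label <- paste(yr[, 5], yr[, 1], sep = "-") # oldest-newest

plot(x, y)
```

With 15 years and a window of 5 this gives 11 points, which is why x runs from 1985 to 1995 above.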
Upvotes: 1