Reputation: 10011
Say I have an Excel file in a format like this (available to download from this link):
Note that the first column holds the months and the first row holds the years.
I am trying to convert it to a time series object and then draw a seasonal plot using ggseasonplot or ggplot2.
df <- openxlsx::read.xlsx('dataset1.xlsx', sheet='Sheet1', colNames=TRUE, rowNames = TRUE)
# df <- t(df)
df <- ts(df, start = c(2008, 1), end=c(2021, 12), frequency = 12)
forecast::ggseasonplot(df, col=rainbow(12), year.labels=TRUE)
Output:
Error in data.frame(y = as.numeric(x), year = trunc(round(time(x), 8)), :
arguments imply differing number of rows: 2352, 168
How could I do that correctly using R? Thanks in advance.
References:
https://pkg.robjhyndman.com/forecast/reference/seasonplot.html
https://afit-r.github.io/ts_exploration
Upvotes: 0
Views: 623
Reputation: 16836
If it is a continuous time series, you can drop the month column and stack all the year columns into a single column with melt, keeping only the value column (melt also returns the year as a separate column, which is not needed). Then you only have to specify the start year and month.
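# Drop the month column, stack the year columns, and keep only the stacked values
output <- ts(reshape::melt(df[,-1])[,2], start = c(2008, 1), frequency = 12)
# Seasonal plot with one line per year
forecast::ggseasonplot(output, col=rainbow(12), year.labels=TRUE)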
Data
df <- structure(list(month = 1:12, `2008` = c(4466.7095, 3654.5805,
10195.65, 10093.13, 11854.13, 18171.78, 13724.1, 12759.61, 14951.02,
13318.36, 14425.07, 20553.11), `2009` = c(4597.063947, 5678.726053,
13286.21, 13520.3, 16438.02, 24578.03, 17833.66, 17052.78, 20191.81,
17533.16, 17924.44, 25504.42), `2010` = c(7034.610811, 5979.419189,
16778.65, 16950.07, 20615.55, 30689.08, 21818.87, 21131.49, 24871.84,
21686.52, 23141.76, 30717.07)), class = "data.frame", row.names = c(NA,
-12L))
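For comparison, the same reshaping can be sketched in base R without the reshape package. This is only an illustration under assumptions: the small data frame below is made up (it is not the asker's file), it has a month column followed by one column per year as in the Data above, and unlist() is swapped in for melt(). The row counts in the question's error (2352 = 168 x 14) are consistent with ts() treating the 14 year columns as 14 separate series rather than one long series, which is why the values are flattened first.

```r
# Hypothetical stand-in data: a month column plus one column per year.
df <- data.frame(month = 1:12, `2008` = 1:12, `2009` = 13:24,
                 check.names = FALSE)

# Drop the month column and concatenate the year columns in order,
# giving Jan 2008, ..., Dec 2008, Jan 2009, ..., Dec 2009.
values <- unlist(df[, -1], use.names = FALSE)

# Build one univariate monthly series starting in January 2008.
output <- ts(values, start = c(2008, 1), frequency = 12)

# The plotting call from the answer can then be applied to it, e.g.:
# forecast::ggseasonplot(output, year.labels = TRUE)
```

If the months were read in as row names (as with rowNames = TRUE in the question), there is no month column to drop, so df[, -1] would remove the first year instead; in that case unlist(df) on its own would be the analogous step.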
Upvotes: 1