Reputation: 16585
I have been able to use an lm model with poly() terms to fit and predict some time series data. However, when I switch to a holt model, I get an error in the R console.
Here is what I am trying to do:
library(ggplot2)
library(matrixStats)
library(forecast)
df_input <- read.csv("postprocessed.csv")
x <- df_input$time
y <- df_input$value
df <- data.frame(x, y)
#poly4model <- lm(y~poly(x, degree=4), data=df)
holtmodel <- holt(df$y) # might need df$value here ?
v <- seq(1, 44)
v2 <- seq(44, 55)
pdf("postprocessed_holts.pdf")
plot(df, xlim=c(0, 55))
##lines(v, predict(poly4model, data.frame(x=v)), col="blue", pch=20, lwd=3)
##lines(v2, predict(poly4model, data.frame(x=v2)), col="red", pch=20, lwd=3)
lines(v, predict(holtmodel, data.frame(x=v)), col="blue", pch=20, lwd=3)
lines(v2, predict(holtmodel, data.frame(x=v2)), col="red", pch=20, lwd=3)
dev.off()
This is the error that shows up:
Error in xy.coords(x, y) : 'x' and 'y' lengths differ
I am a bit confused as to what x and y refer to here. The objects x and y in the RStudio Environment both have length 44.
The error is raised by both of the lines that start with lines.
Here's a copy of the input data...
"","time","value"
"1",1,2.61066016308988
"2",2,3.41246054742996
"3",3,3.8608767964033
"4",4,4.28686048552237
"5",5,4.4923132964825
"6",6,4.50557049744317
"7",7,4.50944447661246
"8",8,4.51097373134893
"9",9,4.48788748823809
"10",10,4.34603985656981
"11",11,4.28677073671406
"12",12,4.20065901625172
"13",13,4.02514194962519
"14",14,3.91360194972916
"15",15,3.85865748409081
"16",16,3.81318053258601
"17",17,3.70380706527433
"18",18,3.61552922363713
"19",19,3.61405310598722
"20",20,3.64591327503384
"21",21,3.70234435835577
"22",22,3.73503970503372
"23",23,3.81003078640584
"24",24,3.88201196162666
"25",25,3.89872518158949
"26",26,3.97432743542362
"27",27,4.2523675144599
"28",28,4.34654855854847
"29",29,4.49276038902684
"30",30,4.67830892029687
"31",31,4.91896819673664
"32",32,5.04350767355202
"33",33,5.09073406942046
"34",34,5.18510849382162
"35",35,5.18353176529036
"36",36,5.2210776270173
"37",37,5.22643491929207
"38",38,5.11137006553725
"39",39,5.01052467981257
"40",40,5.0361056705898
"41",41,5.18149486951409
"42",42,5.36334869132276
"43",43,5.43053620818444
"44",44,5.60001072279525
I tried an alternative method as well. I noticed that the object holtmodel contains two components which might be useful: fitted and mean. As far as I can tell, these are the fitted time series and the mean forecast for the next 10 steps.
I tried plotting these objects with
lines(holtmodel$fitted, col="orange", lwd=2)
lines(holtmodel$mean, col="blue", lwd=2)
However, the second of these fails to plot anything, even though no error is produced in the console. The first line plots an orange time series as expected.
Upvotes: 1
Views: 141
Reputation: 754
The objects you are trying to add as lines don't have the same length:
length(predict(holtmodel, data.frame(x=v)))
# 10
length(v)
# 44
length(predict(holtmodel, data.frame(x=v2)))
# 10
length(v2)
# 12
This means you can't add them as new lines.
Also, you can't predict the way you would with a linear regression, where you pass arbitrary x values (including past ones) to the fitted model. Exponential smoothing methods use the historical data points to build future data points, so the forecast always covers the periods after the end of the data; you can't request it for past events.
Also, you are not specifying the number of periods you are trying to predict (the h parameter); I'll let you refer to the documentation of the holt function. Its output is already a prediction of future values, so calling predict() on it doesn't change the result:
holt_predict <- predict(holtmodel)
length(setdiff(holt_predict, holtmodel))
# 0 which means they are the same objects
What you could do is use mean and fitted directly and plot them with lines, expanding the plot area with xlim and ylim so that the predicted values are visible. You can plot holtmodel$fitted and holtmodel$mean directly on your chart, since they are time series objects:
plot(df, xlim=c(0, 60), ylim=c(2.5, 10))
lines(holtmodel$fitted, col="blue", pch=20, lwd=3)
lines(holtmodel$mean, col="red", pch=20, lwd=3)
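As a rough sketch of the same idea with the horizon set explicitly: the snippet below uses synthetic data as a stand-in for postprocessed.csv (an assumption, not the real values), passes h to holt, and builds the x positions by hand so that every lines() call receives x and y of equal length. The names h and x_future are illustrative only.

```r
library(forecast)

# Synthetic stand-in for the 44 observations in the question (assumption)
set.seed(1)
df <- data.frame(x = 1:44, y = 3 + 0.05 * (1:44) + rnorm(44, sd = 0.1))

h <- 11                              # number of periods to forecast
holtmodel <- holt(df$y, h = h)

# x positions for the forecast: the h periods that follow the data
x_future <- seq(nrow(df) + 1, nrow(df) + h)

# Widen both axes so the forecast is not drawn outside the plot region
plot(df,
     xlim = c(0, nrow(df) + h),
     ylim = range(df$y, holtmodel$fitted, holtmodel$mean))
lines(df$x, as.numeric(holtmodel$fitted), col = "blue", lwd = 3)
lines(x_future, as.numeric(holtmodel$mean), col = "red", lwd = 3)
```

Taking ylim from range() over the data, the fitted values and the forecast avoids guessing limits by hand; a forecast that continues an upward trend would otherwise fall above the default y range and be silently clipped.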
To save you the hassle of this kind of solution, there is an easier method. Have you tried the autoplot function included in the forecast package? autoplot is a ggplot2 generic, and forecast provides a method for forecast objects, so it will give you what you want directly (unless you don't want the confidence intervals). It is very straightforward and will probably yield results close to what you want:
autoplot(holtmodel)
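If the intervals are not wanted, a minimal sketch along these lines may help. It again uses synthetic data in place of the real file (an assumption) and relies on the PI argument of the forecast autoplot method to switch the shaded intervals off; the axis labels are just examples.

```r
library(forecast)
library(ggplot2)

# Synthetic stand-in for the 44 observations in the question (assumption)
set.seed(1)
y <- 3 + 0.05 * (1:44) + rnorm(44, sd = 0.1)

holtmodel <- holt(y, h = 11)

# PI = FALSE draws only the forecast line, without the shaded intervals
p <- autoplot(holtmodel, PI = FALSE) +
  labs(x = "time", y = "value")

print(p)
```

Because the result is an ordinary ggplot object, it can be written to a file with ggsave() rather than the pdf() and dev.off() pair.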
Upvotes: 1