user2138149
user2138149

Reputation: 16585

R: Holt Model. Unable to plot timeseries prediction (predict)

I have been able to use a lm poly-model to model and predict some timeseries data. However when I change to using a holt model, I obtain an error in the R console.

Here is what I am trying to do:

library(ggplot2)
library(matrixStats)
library(forecast)

df_input <- read.csv("postprocessed.csv")

x <- df_input$time
y <- df_input$value
df <- data.frame(x, y)

#poly4model <- lm(y~poly(x, degree=4), data=df)
holtmodel <- holt(df$y) # might need df$value here ?
v <- seq(1, 44)
v2 <- seq(44, 55)
    
pdf("postprocessed_holts.pdf")
plot(df, xlim=c(0, 55))
##lines(v, predict(poly4model, data.frame(x=v)), col="blue", pch=20, lwd=3)
##lines(v2, predict(poly4model, data.frame(x=v2)), col="red", pch=20, lwd=3)
lines(v, predict(holtmodel, data.frame(x=v)), col="blue", pch=20, lwd=3)
lines(v2, predict(holtmodel, data.frame(x=v2)), col="red", pch=20, lwd=3)
dev.off()

This is the error which shows up

Error in xy.coords(x, y) : 'x' and 'y' lengths differ

I am a bit confused as to what x and y refer to here. The objects x and y which are in the Environment (R Studio Environment) both have length 44.

The code appears to error on both lines starting with lines.

Here's a copy of the input data...

"","time","value"
"1",1,2.61066016308988
"2",2,3.41246054742996
"3",3,3.8608767964033
"4",4,4.28686048552237
"5",5,4.4923132964825
"6",6,4.50557049744317
"7",7,4.50944447661246
"8",8,4.51097373134893
"9",9,4.48788748823809
"10",10,4.34603985656981
"11",11,4.28677073671406
"12",12,4.20065901625172
"13",13,4.02514194962519
"14",14,3.91360194972916
"15",15,3.85865748409081
"16",16,3.81318053258601
"17",17,3.70380706527433
"18",18,3.61552922363713
"19",19,3.61405310598722
"20",20,3.64591327503384
"21",21,3.70234435835577
"22",22,3.73503970503372
"23",23,3.81003078640584
"24",24,3.88201196162666
"25",25,3.89872518158949
"26",26,3.97432743542362
"27",27,4.2523675144599
"28",28,4.34654855854847
"29",29,4.49276038902684
"30",30,4.67830892029687
"31",31,4.91896819673664
"32",32,5.04350767355202
"33",33,5.09073406942046
"34",34,5.18510849382162
"35",35,5.18353176529036
"36",36,5.2210776270173
"37",37,5.22643491929207
"38",38,5.11137006553725
"39",39,5.01052467981257
"40",40,5.0361056705898
"41",41,5.18149486951409
"42",42,5.36334869132276
"43",43,5.43053620818444
"44",44,5.60001072279525

Edit

I tried an alternative method as well. I noticed that the object holtmodel contains two objects which might be useful. They are fitted and mean. As far as I can tell this is the fitted timeseries and the mean timeseries for the next 10 steps/predictions.

I tried plotting these objects with

lines(holtmodel$fitted, col="orange", lwd=2)
lines(holtmodel$mean, col="blue", lwd=2)

however the second of these fails to plot anything, despite no error being produced in the console. The first line plots an orange timeseries as expected.

Upvotes: 1

Views: 141

Answers (1)

Pierre Chevallier
Pierre Chevallier

Reputation: 754

Your issue

The objects you are trying to add as lines don't have the same length:

length(predict(holtmodel, data.frame(x=v)))
# 10
length(v)
# 44
length(predict(holtmodel, data.frame(x=v2)))
# 10
length(v2)
# 12

This means you can't add them as new lines.

Also, you can't really predict the same way you would with a linear regression by using say, older data as point to prepare the model. Exponential smoothing methods use historical data points to build future data points, you can't really display them for past events.

Also, you are not specifying the parameter for the number of periods you are trying to predict (h), I'll let you refer to the documentation on the holt function. It is already a prediction of future events that is the output, so the use of predict() on it doesn't change the result:

holt_predict <- predict(holtmodel)
length(setdiff(holt_predict, holtmodel))
# 0 which means they are the same objects

Solution

What you could do is use directly mean and fitted and plot them with lines, by also expanding the area to plot the chat with xlim and ylim to view the predicted values. You can directly plot holtmodel$fitted and holtmodel$mean on your chart, since they are time series objects:

plot(df, xlim=c(0, 60), ylim=c(2.5, 10))
lines(holtmodel$fitted, col="blue", pch=20, lwd=3)
lines(holtmodel$mean, col="red", pch=20, lwd=3)

And the result:

enter image description here

Easy alternative

To save you the hassle of having to go through this kind of solution there are easier methods. Have you tried the autoplot function included in the package forecast ? It is from ggplot2 and will give you what you want directly (unless you don't want the confidence intervals). It is very straightforward and will probably yield results close to what you want:

autoplot(holtmodel)

enter image description here

Upvotes: 1

Related Questions