sai kamal

Reputation: 61

Getting an error "'xreg' and 'newxreg' have different numbers of columns"

I just started with R and time series forecasting. I am forecasting one variable (consumption) with one exogenous variable (income), using quarterly data. I fitted the model with the following R code:

    #train_exp <- exp_trial[,1][1:150]    
    #train_inc <- exp_trial[,2][1:150]    

    model_train_exp <- arima(train_exp,order = c(0,2,6),seasonal = list(order=c(0,1,1),period = 4), xreg = train_inc)    

This model fits without errors. But when I forecast from it, I get the error "'xreg' and 'newxreg' have different numbers of columns":

    forcasted_arima <- forecast.Arima(model_train_exp, h=14)    

There are so many arguments for forecast.Arima, and I am not familiar with them. Can someone please tell me what the code should be?

Upvotes: 4

Views: 13172

Answers (1)

Pierre L

Reputation: 28441

The model was fitted with train_inc as a regressor, so it needs future train_inc values in order to produce a forecast. Think of it this way: you built a model of the form train_exp_t0 = b1 + b2*train_exp_t-1 + b3*train_inc_t0. With that model in hand, if someone provides a value for train_exp_t-1 (the previous period's consumption) and one for train_inc_t0 (the current period's income), the model will return train_exp_t0 (the current period's consumption). You need to supply one train_inc value for every period you want to forecast in order to get a y out.
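To make that concrete, here is a minimal sketch on simulated data. The names cons, inc, and future_inc are made up for illustration, and the model uses the default order of arima() (no AR or MA terms), so it is simpler than the form written above: each forecast is just the intercept plus the income coefficient times the supplied future income.

```r
set.seed(42)

# Simulated stand-ins for consumption and income (not the asker's data)
inc  <- 1 + rnorm(40)
cons <- 2 + 0.5 * inc + rnorm(40, sd = 0.1)

# Regression on inc with default ARIMA(0,0,0) errors
fit <- arima(cons, xreg = inc)

# Hypothetical future income values, one per period to forecast
future_inc <- c(0.8, 1.2, 1.5)

# predict() receives those values through newxreg
p <- predict(fit, n.ahead = length(future_inc), newxreg = future_inc)

# The same forecasts rebuilt by hand from the fitted coefficients
b <- coef(fit)
by_hand <- b[["intercept"]] + b[[2]] * future_inc

all.equal(as.numeric(p$pred), by_hand)
```

The hand-built values should agree with p$pred, which shows why the forecast cannot be computed without future income: it enters the formula directly.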

Example

train_exp = rnorm(20)
train_inc = 1 + rnorm(20)

fit <- arima(train_exp, xreg=train_inc)
predict(fit, h=14)
# Error in predict.Arima(fit, h = 14) : 
#   'xreg' and 'newxreg' have different numbers of columns

We get the same error that you got. But when we supply new values for train_inc it works!

new_train_inc <- rnorm(14)

predict(fit, newxreg=new_train_inc)
# $pred
# Time Series:
#   Start = 21 
# End = 34 
# Frequency = 1 
# [1] -0.2444872 -0.1583624 -0.2042488 -0.2143231 -0.1992276 -0.2047153 -0.2431517 -0.1887002 -0.2480745 -0.2118920
# [11] -0.1281492 -0.2067001 -0.2202669 -0.2166019
# 
# $se
# Time Series:
#   Start = 21 
# End = 21 
# Frequency = 1 
# [1] 1.153433
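One detail worth noting about the call above: predict() for arima fits takes the forecast horizon through n.ahead, which defaults to 1, and that is why $se holds a single value. A sketch of the same example with the horizon passed explicitly (the seed is added only for reproducibility, and the variable names follow the example above):

```r
set.seed(1)

train_exp <- rnorm(20)
train_inc <- 1 + rnorm(20)
fit <- arima(train_exp, xreg = train_inc)

# One future regressor value per forecast step
new_train_inc <- 1 + rnorm(14)

# n.ahead matches the number of values in newxreg
fc <- predict(fit, n.ahead = 14, newxreg = new_train_inc)

length(fc$pred)  # number of point forecasts
length(fc$se)    # number of standard errors
```

Both lengths should come out as 14. For the model in the question, the equivalent call would pass 14 future income values as newxreg.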

If it still doesn't make sense, remember that you are predicting train_exp, not train_inc.

If you would like a more formal discussion see here at Cross Validated

Upvotes: 5
