Ivan

Reputation: 20101

Confidence interval for statsmodels OLS model prediction

I have a problem very similar to this question, and the approach there works for the training data. Now I'm trying to get the confidence interval for the predicted data:

import statsmodels.api as sm
from statsmodels.sandbox.regression.predstd import wls_prediction_std

# y, X and X_forecast are defined as pandas DataFrames
regressor = sm.OLS(y, X).fit()
wls_prediction_std(regressor.predict(X_forecast))

But this, of course, raises an error complaining that the result of regressor.predict is an array. How can I calculate the confidence interval for the predicted regression values?

Upvotes: 1

Views: 2205

Answers (2)

JeeyCi

Reputation: 579

import matplotlib.pyplot as plt
import statsmodels.api as sm

data = sm.datasets.longley.load()
x = sm.add_constant(data.exog.iloc[:,2])
y= data.endog

mod = sm.OLS(y, x).fit()
# alpha=0.01 requests 99% confidence intervals for the coefficients
print(mod.summary(alpha=0.01))
print(mod.conf_int(alpha=0.01, cols=None))

# in-sample predictions (no exog passed, so the training data is used)
pred_ols = mod.get_prediction()

# confidence interval for the mean response (summary_frame defaults to alpha=0.05)
iv_l = pred_ols.summary_frame()["mean_ci_lower"]
iv_u = pred_ols.summary_frame()["mean_ci_upper"]

# plot the data, the fitted line and the interval bounds
fig, ax = plt.subplots(figsize=(8, 6))
x1 = x.iloc[:, 1]
ax.plot(x1, y, "bo", label="data")
ax.plot(x1, mod.fittedvalues, "r--.", label="OLS")
ax.plot(x1, iv_u, "ro")
ax.plot(x1, iv_l, "ro")
ax.legend(loc="best")
plt.show()

pred_ols = mod.get_prediction()

# prediction interval for individual observations (wider than the mean interval)
iv_l = pred_ols.summary_frame()["obs_ci_lower"]
iv_u = pred_ols.summary_frame()["obs_ci_upper"]

# same plot as before, now with the prediction interval bounds
fig, ax = plt.subplots(figsize=(8, 6))
x1 = x.iloc[:, 1]
ax.plot(x1, y, "bo", label="data")
ax.plot(x1, mod.fittedvalues, "r--.", label="OLS")
ax.plot(x1, iv_u, "ro")
ax.plot(x1, iv_l, "ro")
ax.legend(loc="best")
plt.show()

P.S. See also: confidence-and-prediction-intervals

Upvotes: 1

SamLin

Reputation: 11

You may have passed the wrong argument: wls_prediction_std expects the fitted results object, with the new data passed as exog.

Try this:

wls_prediction_std(regressor, exog=X_forecast, weights=None, alpha=0.05)

Upvotes: 1
