Reputation: 2773
I am trying to plot confidence intervals around a line plot, similar to this: https://scikit-learn.org/0.17/_images/plot_gp_regression_001.png
I am fitting a Gaussian process, and when predicting values it returns a mean and a std (standard deviation) for each point. Using this, I should be able to plot different ranges of the confidence interval. For my case, I am trying to have ranges for 10%, 20%, ... 90%.
Currently I am doing something like this:
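import matplotlib.pyplot as plt

# reg is the fitted Gaussian process; predict returns a mean and a std per point
y_pred, std = reg.predict(x, return_std=True)
std_z = 1.96  # from z-table for 95%
confidence_interval = std * std_z  # half-width of the band around the mean
plt.plot(x, y_pred)
plt.fill_between(x, y_pred - confidence_interval, y_pred + confidence_interval)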
That works. According to the z-table (http://www.z-table.com/uploads/2/1/7/9/21795380/8573955.png?759), the z value is 1.96 for 95%. However, take for example 25% and 75%. The z values for those would be -0.67 and +0.67 respectively, which would just overlap in the confidence interval when plotting. This seems intuitively incorrect to me. I would expect shrinking bands for lower confidence ranges and expanding ones for higher confidence ranges, right?
Any help would be appreciated.
Upvotes: 0
Views: 399
Reputation: 12461
Wrong. The percentages associated with confidence intervals (95%, 75%, 25% in your examples) are coverage probabilities. They're the chance that the true value of the quantity you're estimating (the predicted value in this case) lies within the CI. Given that the CIs you're talking about are central confidence intervals (that is, they are centred on the predicted value), it stands to reason that for a higher confidence you need a wider interval. This is exactly what you are seeing. If a narrower confidence interval had a higher coverage probability than a wider one, that would imply that there was a region that somehow had a negative coverage probability associated with it. Probabilities can't be negative, so that's impossible.
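To make the link between coverage and band width concrete, here is a minimal sketch, assuming the predictive distribution at each point is normal. A central interval with coverage p leaves (1 - p) / 2 in each tail, so its multiplier is the (1 + p) / 2 quantile of the standard normal. The -0.67 and +0.67 read off the table are the 25th and 75th percentiles, which together bound the central 50% interval rather than two separate intervals. The helper name central_z is made up for illustration, and the sketch uses Python's standard-library statistics.NormalDist instead of a printed z-table.

```python
from statistics import NormalDist


def central_z(coverage):
    """Return z such that mean +/- z * std covers `coverage` of a normal distribution.

    Hypothetical helper for illustration; assumes a Gaussian predictive distribution.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")
    # A central interval leaves (1 - coverage) / 2 in each tail,
    # so its upper edge sits at the (1 + coverage) / 2 quantile.
    return NormalDist().inv_cdf((1.0 + coverage) / 2.0)


# Coverage levels from the question: 10%, 20%, ..., 90%
coverages = [round(0.1 * k, 1) for k in range(1, 10)]
z_values = {c: central_z(c) for c in coverages}

for c, z in z_values.items():
    print(f"{c:.0%}: z = {z:.3f}")
```

With the variables from the question, each band could then be drawn with something like plt.fill_between(x, y_pred - z * std, y_pred + z * std, alpha=0.1), looping over the z values from the largest to the smallest so that the narrower bands are layered on top of the wider ones. The multipliers grow with coverage, so the bands nest rather than overlap.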
Upvotes: 2