Reputation: 51
I am currently building a regression model which helps explain sales using certain factors like income, temperature etc. On checking the residual plot after regression, the residuals are heteroscedastic.
To account for heteroscedasticity , I have made use of vcovHC()
and coeftest()
in R which can be used to re-calculate the standard errors along with their p-values under the assumption of Heteroscedasticity. But these functions return NA values and hence all corresponding p-values are also NAs. What could be the reason for this issue and how do I resolve it? The code it as below:
fit_p <- lm(formula = final_list_p, data = new_data_p)
s_p <- summary(fit_p)
The summary statistics for the output of linear regression is:
Residuals:
Min 1Q Median 3Q Max
-244190 -60770 -5759 59730 311108
Coefficients:
Estimate
(Intercept)
Std. Error t value Pr(>|t|)
var1 -3.36E+05 1.77E+05 -1.893 0.059026 .
var2 -2.90E+04 4.96E+03 -5.86 8.97E-09 ***
var3 -1.75E+05 8.93E+04 -1.958 0.050834 .
var4 -4.62E+00 2.80E+00 -1.653 0.098975 .
var5 2.39E+01 7.85E+00 3.04 0.002503 **
var6 -6.32E+04 1.08E+05 -0.588 0.556682
var7 -5.38E+03 3.69E+04 -0.146 0.884204
var8 6.03E+04 6.53E+04 0.923 0.356275
var9 3.33E-01 4.75E-02 7.011 8.76E-12 ***
var10 -7.94E+04 2.33E+05 -0.34 0.73381
var11 1.06E+05 1.08E+05 0.986 0.324424
var12 -1.06E+04 4.41E+03 -2.39 0.017275 *
var14 5.44E+03 8.80E+02 6.182 1.43E-09 ***
var16 9.12E+04 7.34E+04 1.242 0.21481
var18 1.78E+04 8.41E+04 0.211 0.832674
var19 -1.75E+05 1.18E+05 -1.487 0.137787
var20 4.19E+03 6.95E+02 6.023 3.58E-09 ***
var25 2.96E+00 4.82E-01 6.146 1.76E-09 ***
Residual standard error: 87850 on 447 degrees of freedom
Multiple R-squared: 0.6144, Adjusted R-squared:0.5958
F-statistic: 39.57 on 18 and 447 DF, p-value: <2.2e-16
When I check the residual plot,they are heteroskedastic.To account for this issue, the standard errors are recalculated using robust standard errors (sandwich::vcovHC) The results after performing the coeftest::lmtest is as follows:
s_p$coefficients <- unclass(coeftest(fit_p, vcov. = vcovHC))
Estimate Std.Error t-value Pr(>|t|)
Intercept-3.36E+05 NA NA NA
var1 -2.90E+04 NA NA NA
var2 -1.75E+05 NA NA NA
var3 -4.62E+00 NA NA NA
var4 2.39E+01 NA NA NA
var5 -6.32E+04 NA NA NA
var6 -5.38E+03 NA NA NA
var7 6.03E+04 NA NA NA
var8 3.33E-01 NA NA NA
var9 -7.94E+04 NA NA NA
var10 1.06E+05 NA NA NA
var11 -1.06E+04 NA NA NA
var12 5.44E+03 NA NA NA
var14 9.12E+04 NA NA NA
var16 1.78E+04 NA NA NA
var18 -1.75E+05 NA NA NA
var19 4.19E+03 NA NA NA
var20 2.96E+00 NA NA NA
var25 3.29E+03 NA NA NA
Upvotes: 5
Views: 5888
Reputation: 177
You should specificate a type of vcovHC. For example:
coeftest(fit_p,vcov.=vcovHC(fit_p,type="HC1"))
This particular variant gives same results as in Eviews.
Upvotes: 4
Reputation: 3687
There is a try why the coeftest is not working with a robust vcov: If your model.matrix contrains very large values as well as very small values, the algorithm might not be able to do the computation numerically stable. Thus, have a look at model.matrix(formula , data=new_data_p)
if this is the case. If so, try rescaling some variables in your (p)data.frame before model estimation (e.g. multiply oder divide by 100 oder 1000 [also log()
makes sense sometimes). Take care, the interpretation of the coefficients changes due to the change of scales!
Upvotes: 1