Asha
Asha

Reputation: 51

sandwich::vcovHC() and coeftest::lmtest() returning NA values

I am currently building a regression model which helps explain sales using certain factors like income, temperature etc. On checking the residual plot after regression, the residuals are heteroscedastic.

To account for heteroscedasticity , I have made use of vcovHC() and coeftest() in R which can be used to re-calculate the standard errors along with their p-values under the assumption of Heteroscedasticity. But these functions return NA values and hence all corresponding p-values are also NAs. What could be the reason for this issue and how do I resolve it? The code it as below:

fit_p <- lm(formula = final_list_p, data = new_data_p)
s_p <- summary(fit_p)

The summary statistics for the output of linear regression is:

Residuals:                  
Min      1Q  Median      3Q     Max                     
-244190  -60770   -5759   59730  311108         
            
                
Coefficients:                   
Estimate                    
(Intercept)                 
        Std. Error  t   value   Pr(>|t|)    
var1    -3.36E+05   1.77E+05    -1.893  0.059026    .
var2    -2.90E+04   4.96E+03    -5.86   8.97E-09    ***
var3    -1.75E+05   8.93E+04    -1.958  0.050834    .
var4    -4.62E+00   2.80E+00    -1.653  0.098975    .
var5    2.39E+01    7.85E+00    3.04    0.002503    **
var6    -6.32E+04   1.08E+05    -0.588  0.556682    
var7    -5.38E+03   3.69E+04    -0.146  0.884204    
var8    6.03E+04    6.53E+04    0.923   0.356275    
var9    3.33E-01    4.75E-02    7.011   8.76E-12    ***
var10   -7.94E+04   2.33E+05    -0.34   0.73381 
var11   1.06E+05    1.08E+05    0.986   0.324424    
var12   -1.06E+04   4.41E+03    -2.39   0.017275    *
var14   5.44E+03    8.80E+02    6.182   1.43E-09    ***
var16   9.12E+04    7.34E+04    1.242   0.21481 
var18   1.78E+04    8.41E+04    0.211   0.832674    
var19   -1.75E+05   1.18E+05    -1.487  0.137787    
var20   4.19E+03    6.95E+02    6.023   3.58E-09    ***
var25   2.96E+00    4.82E-01    6.146   1.76E-09    ***

                
Residual standard error: 87850 on 447 degrees of freedom                
Multiple R-squared:  0.6144,    Adjusted R-squared:0.5958               
F-statistic: 39.57 on 18 and 447 DF,  p-value: <2.2e-16                 
                

When I check the residual plot,they are heteroskedastic.To account for this issue, the standard errors are recalculated using robust standard errors (sandwich::vcovHC) The results after performing the coeftest::lmtest is as follows:

s_p$coefficients <- unclass(coeftest(fit_p, vcov. = vcovHC))
        Estimate    Std.Error t-value Pr(>|t|)
Intercept-3.36E+05  NA          NA    NA
var1    -2.90E+04   NA          NA    NA
var2    -1.75E+05   NA          NA    NA
var3    -4.62E+00   NA          NA    NA
var4    2.39E+01    NA          NA    NA
var5    -6.32E+04   NA          NA    NA
var6    -5.38E+03   NA          NA    NA
var7    6.03E+04    NA          NA    NA
var8    3.33E-01    NA          NA    NA
var9    -7.94E+04   NA          NA    NA
var10   1.06E+05    NA          NA    NA
var11   -1.06E+04   NA          NA    NA
var12   5.44E+03    NA          NA    NA
var14   9.12E+04    NA          NA    NA
var16   1.78E+04    NA          NA    NA
var18   -1.75E+05   NA          NA    NA
var19   4.19E+03    NA          NA    NA
var20   2.96E+00    NA          NA    NA
var25   3.29E+03    NA          NA    NA

Upvotes: 5

Views: 5888

Answers (2)

Katin
Katin

Reputation: 177

You should specificate a type of vcovHC. For example:

coeftest(fit_p,vcov.=vcovHC(fit_p,type="HC1"))

This particular variant gives same results as in Eviews.

Upvotes: 4

Helix123
Helix123

Reputation: 3687

There is a try why the coeftest is not working with a robust vcov: If your model.matrix contrains very large values as well as very small values, the algorithm might not be able to do the computation numerically stable. Thus, have a look at model.matrix(formula , data=new_data_p) if this is the case. If so, try rescaling some variables in your (p)data.frame before model estimation (e.g. multiply oder divide by 100 oder 1000 [also log() makes sense sometimes). Take care, the interpretation of the coefficients changes due to the change of scales!

Upvotes: 1

Related Questions