I'm trying to run a weighted least squares regression. After creating my weights and adding them to my regression call, I receive the following error:
Error in model.frame.default(formula = CO2_pc_cmice1 ~ GDP_pc_cmice1_C + :
variable lengths differ (found for '(weights)')
The lm model has 31 rows and the weights vector I've created also has length 31. I've checked whether there are NAs in either of these and there are none. There are some negative numbers, although I'd be surprised if this was the issue. I've run the formula using both na.action = na.omit and na.action = na.exclude.
I also get the same issue when I run this on a regression with a sample of 99.
My regression is:
LinearCO2_lowerF <- lm(CO2_pc_cmice1 ~ PolCiv_incPressFreedom_C + CorpInf_cmice1_C +
                         Gov_cmicepos1_C + LitGini_umice_C +
                         GDP_pc_cmice1_C + PopDensity_cmice1_C +
                         TradeOpen_cmice1_C + Urban_cmice1_C +
                         poly(Oil_coal_umice_C, 2),
                       data = mydata_completemice2,
                       subset = IncomeL == "L")
Weights created
wtsco2low <- 1/fitted( lm(abs(residuals(LinearCO2_lowerF))~fitted(LinearCO2_lowerF)) )^2
And the regression with weights:
LinearCO2_lowerFw <- lm(CO2_pc_cmice1 ~ GDP_pc_cmice1_C + PolCiv_incPressFreedom_C +
                          CorpInf_cmice1_C + Gov_cmicepos1_C +
                          LitGini_umice_C + PopDensity_cmice1_C +
                          TradeOpen_cmice1_C + Urban_cmice1_C +
                          poly(Oil_coal_umice_C, 2),
                        data = mydata_completemice2,
                        subset = IncomeL == "L",
                        weights = wtsco2low,
                        na.action = na.omit)
(I have also tried this with na.exclude.)
Is anyone able to help?
Upvotes: 2
Views: 4577
Reputation: 2765
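The subset= argument of R modelling functions is applied to every variable in the model frame, including the weights. lm() therefore expects the weights vector to have one entry per row of the full data, and subsets it along with everything else. Since your weights vector already has the length of the subset (31), it does not match the other variables and you get the error.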
Consider this example: the data frame has 30 rows, but only 20 are in the subset to be analysed, and I have only 20 weights. If I use the subset= argument, lm() tries to subset the weights as well and there's an error. Instead, you can use subset() on the data before passing it to lm(), and that works.
```
> d<-data.frame(y=rnorm(30),x=1:30)
> w<-rep(2,20)
>
> lm(y~x,data=d, subset=x>10)

Call:
lm(formula = y ~ x, data = d, subset = x > 10)

Coefficients:
(Intercept)            x
    -0.3161       0.0189

> lm(y~x,data=d, subset=x>10, weights=w)
Error in model.frame.default(formula = y ~ x, data = d, subset = x > 10, :
  variable lengths differ (found for '(weights)')
> lm(y~x,data=subset(d, x>10), weights=w)

Call:
lm(formula = y ~ x, data = subset(d, x > 10), weights = w)

Coefficients:
(Intercept)            x
    -0.3161       0.0189
```
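If you would rather keep the subset= argument, another option is to give lm() one weight per row of the full data. The sketch below is only an illustration on made-up toy data: d, keep, w, w_full and the model names are hypothetical, not taken from the question. It assumes the same two-step weighting as in the question (weights from a regression of the absolute residuals on the fitted values). Rows outside the subset get an NA weight; as far as I know this is harmless, because the model frame applies subset= before na.action.

```r
# Toy data standing in for the real data set (all names are illustrative).
set.seed(1)
d <- data.frame(x = 1:30)
d$y <- 2 + 0.5 * d$x + rnorm(30, sd = 0.2 * d$x)  # noise grows with x
keep <- d$x > 10                                   # rows to analyse (20 of 30)

# Option 1: subset the data first, so data and weights have the same length.
d_sub <- subset(d, x > 10)
ols   <- lm(y ~ x, data = d_sub)
w     <- 1 / fitted(lm(abs(residuals(ols)) ~ fitted(ols)))^2  # 20 weights
wls_1 <- lm(y ~ x, data = d_sub, weights = w)

# Option 2: keep subset=, but supply one weight per row of the full data.
# Rows outside the subset get NA and are removed by subset= before
# na.action looks at the model frame.
w_full <- rep(NA_real_, nrow(d))
w_full[keep] <- w
wls_2 <- lm(y ~ x, data = d, subset = x > 10, weights = w_full)

# The two fits should give the same coefficients.
all.equal(coef(wls_1), coef(wls_2))
```

Subsetting the data first is probably the simpler of the two, since the first-stage model, the weights and the weighted model then all work on the same rows.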
Upvotes: 4