Reputation: 8031
Dirk Eddelbuettel provides alternatives to estimating a linear regression with the lm
command. See: http://dirk.eddelbuettel.com/blog/2011/07/05/
However, he mentions:
"Strictly-speaking, it is the only one we can compare to lm.fit() which also uses a pivoting scheme. In the case of a degenerated model matrix, all the other methods, including the four fastest approaches, are susceptible to producing incorrect estimates."
Could someone illustrate this point by providing an example when the estimates are correct for lm, but not for the alternative approaches?
Upvotes: 1
Views: 106
Reputation: 368271
Install RcppArmadillo or RcppEigen and look at help(fastLm)
:
## case where fastLm breaks down
dd <- data.frame(f1 = gl(4, 6, labels = LETTERS[1:4]),
f2 = gl(3, 2, labels = letters[1:3]))[-(7:8), ]
xtabs(~ f2 + f1, dd) # one missing cell
mm <- model.matrix(~ f1 * f2, dd)
kappa(mm) # large, indicating rank deficiency
set.seed(1)
dd$y <- mm %*% seq_len(ncol(mm)) + rnorm(nrow(mm), sd = 0.1)
summary(lm(y ~ f1 * f2, dd)) # detects rank deficiency
summary(fastLm(y ~ f1 * f2, dd)) # some huge coefficients
We owe this example to Doug Bates.
Upvotes: 3