Reputation: 11
I have run a series of multiple linear regression models and am producing diagnostic plots using the method and code from this link (http://www.r-bloggers.com/checking-glm-model-assumptions-in-r/).
I have no more than 53 data points in any model, yet some of the outliers in the diagnostic plots carry labels above 53, ranging from 58 to 107. Do the labels of outliers or influential points in these plots not correspond to the individual data points? If they do not, what do the labels mean, and how do I know which of my data points are the outliers? I have counted the points in my plots and none of the plots has more than 53.
I have attached a screenshot of my regression plot output. There are 53 points in this plot, yet two of the notable points are labeled 90 and 106.
[Image: regression plot example]
Upvotes: 1
Views: 401
Reputation: 132706
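plot.lm labels the points with the corresponding row names, not with their position among the plotted points. If your 53 rows came from a larger data set (for example after subsetting or dropping rows with missing values), they keep their original row names, which is why labels such as 90 and 106 can appear. A small example with letters as row names: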
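set.seed(42)
DF <- data.frame(x = 1:5, y = 2 + 3 * 1:5 + rnorm(5))
rownames(DF) <- letters[1:5]   # row names "a" to "e" instead of 1 to 5
DF$y[3] <- 1e3                 # turn row "c" into an obvious outlier
mod <- lm(y ~ x, data = DF)
par(mfrow = c(2,2))            # 2 x 2 grid for the four diagnostic plots
plot(mod, 1:4)                 # extreme points are labelled by row name, e.g. "c"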
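
To connect this to labels above 53, here is a minimal sketch. It assumes your 53 rows were obtained by subsetting a larger data frame; I cannot tell from the question whether that is what happened. The names full, keep, sub and mod2 are made up for illustration, and the data are simulated.

```r
# Hypothetical data: 110 rows, of which only 53 are used in the model
set.seed(1)
full <- data.frame(x = rnorm(110))
full$y <- 1 + 2 * full$x + rnorm(110)

keep <- seq(2, 106, by = 2)      # 53 row indices; stand-in for any filter
sub <- full[keep, ]              # subsetting keeps the original row names

mod2 <- lm(y ~ x, data = sub)

nobs(mod2)                       # number of observations used in the fit
range(as.integer(rownames(sub))) # row names still refer to the full data

# These names are what plot.lm uses as point labels by default
labs <- names(residuals(mod2))

# Map a label back to the data: here, the point with the largest residual
lab <- labs[which.max(abs(residuals(mod2)))]
sub[lab, ]                       # look the row up by its row name
match(lab, labs)                 # its position among the 53 plotted points
```

So a point labelled 90 would be the row whose row name is "90", which you can extract with yourdata["90", ]. If you would rather have labels running from 1 to 53, resetting the row names before fitting (rownames(sub) <- NULL) should do that.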
Upvotes: 1