UweM.

Reputation: 121

Identification of influential observations in regression with library car

Can somebody explain why library(car) finds influential observations here?

library(car) 

x = seq(1, 5, len = 100)

set.seed(99)

y = 2*x + 1 + rnorm(length(x), 0, 0.00005)

plot(x,y)      # no influential observations!!

infl = influencePlot(lm(y ~ x)) 
infl # 4 influential observations?? 

Upvotes: 1

Views: 132

Answers (1)

StupidWolf

Reputation: 46908

The help page for influencePlot explains this:

The default ‘method="noteworthy"’ is used only in this function and indicates setting labels for points with large Studentized residuals, hat-values or Cook's distances.

And the default settings:

‘id=TRUE’ is equivalent to ‘id=list(method="noteworthy", n=2, cex=1, col=carPalette()[1], location="lr")’

Using your example it looks like this:

[Figure: influencePlot output for the example above]

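With method="noteworthy" and n=2, it labels the 2 most extreme Studentized residuals (y-axis), the 2 largest hat-values (x-axis) and the 2 largest Cook's distances (bubble size). The points are picked by rank, not by a cutoff, so some are always labelled, even when no observation is really influential, as in your example. The same point can be extreme on more than one measure, which is why you can end up with 4 labelled points rather than 6.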
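To make the rank-based selection concrete, here is a minimal sketch in base R that picks the top n observations on each of the three measures and takes their union. It only approximates what method="noteworthy" is described as doing in the help page; the helper top_n and the names n_label and flagged are made up for illustration and are not part of car.

```r
# Same data as in the question
x <- seq(1, 5, len = 100)
set.seed(99)
y <- 2*x + 1 + rnorm(length(x), 0, 0.00005)
fit <- lm(y ~ x)

n_label <- 2  # hypothetical name, plays the role of n in id=list(n=2)

# Hypothetical helper: indices of the n largest values of v
top_n <- function(v, n) order(v, decreasing = TRUE)[seq_len(n)]

# Union of the top n on each measure (ranks only, no cutoff involved)
flagged <- sort(unique(c(
  top_n(abs(rstudent(fit)), n_label),   # Studentized residuals
  top_n(hatvalues(fit), n_label),       # hat-values
  top_n(cooks.distance(fit), n_label)   # Cook's distances
)))

# Values of the three measures for the selected observations
data.frame(
  obs     = flagged,
  StudRes = rstudent(fit)[flagged],
  Hat     = hatvalues(fit)[flagged],
  CookD   = cooks.distance(fit)[flagged]
)
```

Because the selection uses order(), it returns points for any fitted model, however well behaved the data are.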

If you want the 3 most extreme, you can do:

influencePlot(lm(y ~ x),id=list(method="noteworthy",n=3))
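If you want to know whether any point actually stands out, rather than which ones rank highest, one option is to count observations beyond some cutoffs. The sketch below uses common rules of thumb (|Studentized residual| > 2, hat-value > 2p/n, Cook's distance > 4/n); these thresholds are my assumption for illustration, not something the help page quoted above prescribes, and the name exceed is made up.

```r
# Same data as in the question
x <- seq(1, 5, len = 100)
set.seed(99)
y <- 2*x + 1 + rnorm(length(x), 0, 0.00005)
fit <- lm(y ~ x)

p <- length(coef(fit))  # number of estimated coefficients
n <- nobs(fit)          # number of observations used in the fit

# Number of observations beyond each rule-of-thumb cutoff (assumed thresholds)
exceed <- c(
  StudRes = sum(abs(rstudent(fit)) > 2),
  Hat     = sum(hatvalues(fit) > 2 * p / n),
  CookD   = sum(cooks.distance(fit) > 4 / n)
)
exceed
```

With purely random noise, a handful of Studentized residuals beyond 2 in absolute value is expected by chance alone, so a small nonzero count there is not evidence of a problem on its own.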

Upvotes: 1
