Atticus29

Reputation: 4432

How do I ensure that my x and y lengths don't differ when plotting a glm using the predict() function in R?

I am running the following code:

c.model<-glm(cars$speed~cars$dist, family=gaussian)
summary(c.model)
c.x<-seq(0,1,0.01)
c.x
c.y<-predict.glm(c.model,as.data.frame(c.x), type="response")
c.y
plot(cars$dist)
lines(c.x,c.y)

And getting the error, "Error in xy.coords(x, y) : 'x' and 'y' lengths differ". I'm not quite sure what is causing this error.

Upvotes: 3

Views: 5969

Answers (1)

MrFlick

Reputation: 206536

You need to be more careful in matching up the variable names used in the model with those used during prediction. The error occurs because the names in the data.frame passed to the predict function do not match the names of the terms in your model, so you're not actually predicting new values. Instead, predict falls back to the original data and returns one fitted value per row of cars (50 values), which cannot be plotted against the 101 values in c.x. The problem is that predict is essentially getting the data from

model.frame(~cars$dist, data.frame(dist=c.x))

so because you explicitly have cars$dist in your formula, there are no "free" symbols that will be taken from your newdata parameter. Compare that to the results from

model.frame(~dist, data.frame(dist=c.x))

This time, dist isn't specifically tied to the cars variable and can be "resolved" in the context of the newdata data.frame.
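To see the difference concretely, here is a minimal sketch that compares the number of rows each call produces. The names newdata, mf.fixed and mf.free are only illustrative, and the grid is the same seq(0,1,0.01) used in the question.

```r
# Grid of new distances from the question (101 points)
c.x <- seq(0, 1, 0.01)
newdata <- data.frame(dist = c.x)

# cars$dist is hard-coded in the formula, so newdata is ignored
# and the frame has one row per row of cars
mf.fixed <- model.frame(~cars$dist, newdata)

# dist is a free symbol here, so it is looked up in newdata
mf.free <- model.frame(~dist, newdata)

nrow(mf.fixed)  # 50, the number of rows in cars
nrow(mf.free)   # 101, the length of c.x
```

The 50 versus 101 mismatch is exactly what lines() complains about when it is given c.x and the predictions together.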

Additionally, you want to make sure you're keeping your dist values on the same scale as the observed data. For example:

c.model <- glm(speed~dist, data=cars, family=gaussian)
summary(c.model)
c.x <- seq(min(cars$dist),max(cars$dist),length.out=101)
c.y <- predict.glm(c.model,data.frame(dist=c.x), type="response")
plot(speed~dist, cars)
lines(c.x,c.y)


Here we predict over the range of observed values rather than 0-1 because no distance value is actually less than 1.
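If you want to guard against this kind of mismatch before plotting, one option is a quick length check. This is just a sketch: the names fit, grid and pred are illustrative and not part of the code above.

```r
# Fit with bare column names so that newdata can supply dist
fit <- glm(speed ~ dist, data = cars, family = gaussian)

# Prediction grid spanning the observed range of dist
grid <- seq(min(cars$dist), max(cars$dist), length.out = 101)
pred <- predict(fit, newdata = data.frame(dist = grid), type = "response")

# Fails early with a clear message if newdata was silently ignored
stopifnot(length(pred) == length(grid))

plot(speed ~ dist, data = cars)
lines(grid, pred)
```

With the original cars$speed~cars$dist formula the same check would stop before lines() is reached, since predict would return 50 values for a 101-point grid.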

Upvotes: 7
