Mat Vicky

Reputation: 13

I am trying to plot confidence intervals on a regression graph.

I have tried a few methods, but all of them give me an error saying that the x and y values are not the same length.

This is what I have done so far.

head(new_data[,4:5])

 Humerus   Radius
 4.992607 4.921148
 5.170484 5.049856
 5.005623 4.936989
 5.065755 4.976734
 4.219508 4.174387
 4.262680 4.157319

plot(new_data$Radius, new_data$Humerus, main="Regression analysis of Humerus to radius", 
 xlab="Radius ", ylab=" Humerus", pch=19)
abline(lm(new_data$Humerus~ new_data$Radius))

This is the graph I got

And to do the confidence intervals, I tried this

lm.out <- lm(new_data$Humerus ~ new_data$Radius)
newx = seq(min(new_data$Radius),max(new_data$Humerus),by = 0.05)
conf_interval <- predict(lm.out, newdata=data.frame(x=newx), interval="confidence",level = 0.95)
plot(new_data$Radius, new_data$Humerus, xlab="Radius", ylab="Humerus", main="Regression")
abline(lm.out, col="lightblue")
lines(newx, conf_interval[,2], col="blue", lty=2)
lines(newx, conf_interval[,3], col="blue", lty=2)

When I do this, the following warning keeps popping up:

 Warning message:
'newdata' had 61 rows but variables found have 150 rows

I'm really stuck here; any kind of help would be very much appreciated. Thank you.

Upvotes: 1

Views: 901

Answers (1)

farnsy

Reputation: 2470

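When you use predict(), newdata should be a data frame whose column names match the explanatory variables used in the model formula. In your original model, the x variable was called "new_data$Radius", so predict() cannot find it in a data frame whose only column is "x". It falls back to the original 150 observations and warns that this does not match the 61 rows of newdata. You need to make two changes: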

  1. When you do your original regression, just use the variable names

    lm.out <- lm(Humerus~Radius,data=new_data)

  2. When you pass your newdata parameter, let the name be "Radius" not "x"

    conf_interval <- predict(lm.out, newdata=data.frame(Radius=newx), interval="confidence",level = 0.95)

So your code should read

lm.out <- lm(Humerus ~ Radius,data=new_data)
newx = seq(min(new_data$Radius),max(new_data$Radius),by = 0.05)
conf_interval <- predict(lm.out, newdata=data.frame(Radius=newx), interval="confidence",level = 0.95)
plot(new_data$Radius, new_data$Humerus, xlab="Radius", ylab="Humerus", main="Regression")
abline(lm.out, col="lightblue")
lines(newx, conf_interval[,2], col="blue", lty=2)
lines(newx, conf_interval[,3], col="blue", lty=2)
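
To check that the row counts line up, here is a minimal self-contained sketch of the same approach. Since the real new_data is not available, it uses simulated data; the simulated coefficients, the noise level, the 100-point grid and the use of matlines() are assumptions made for illustration only, not part of the original code.

```r
# Simulated stand-in for new_data (assumption: the real measurements are not available here)
set.seed(1)
new_data <- data.frame(Radius = runif(150, min = 4, max = 5.2))
new_data$Humerus <- 0.1 + 1.0 * new_data$Radius + rnorm(150, sd = 0.05)

# Fit with a data argument so the predictor is simply named "Radius"
lm.out <- lm(Humerus ~ Radius, data = new_data)

# Grid of x values spanning the observed range of the predictor
newx <- seq(min(new_data$Radius), max(new_data$Radius), length.out = 100)

# The column name in newdata matches the predictor name in the formula
conf_interval <- predict(lm.out, newdata = data.frame(Radius = newx),
                         interval = "confidence", level = 0.95)

# One row per grid point, with columns fit, lwr and upr
stopifnot(nrow(conf_interval) == length(newx))

plot(Humerus ~ Radius, data = new_data, pch = 19, main = "Regression")
abline(lm.out, col = "lightblue")
# Draw the lower and upper confidence bounds in one call
matlines(newx, conf_interval[, c("lwr", "upr")], col = "blue", lty = 2)
```

Because predict() now finds "Radius" in newdata, it returns one row per value of newx, so the lines() or matlines() calls receive x and y vectors of the same length.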

Upvotes: 1
