Ridhima

Reputation: 187

Plot of regression fitted values made using the lines function in R is not turning out right

I have a large dataset with 100853 observations. I want to determine the relationship between the two variables in my model, i.e. the log of per capita expenditure (ln_MPCE) and the share of expenditure spent on food (w_food). To do this, I run a quadratic regression and a non-parametric regression. Then I plot the data and the fitted values using the following code. However, the graphs are not plotted correctly: instead of two curves, I get a bunch of lines for both regressions. Please tell me where I am going wrong. Thanks in advance for your help.

model.par <- lm(w_food ~ ln_MPCE + I(ln_MPCE^2), data=share_efm_food_09)
summary(model.par)
library(np)
model.np <- npreg(w_food ~ ln_MPCE, regtype="ll", bwmethod="cv.aic", data=share_efm_food_09)

pdf("food_Ln_MPCE_curve.pdf", width=11, height=8)
plot(share_efm_food_09$ln_MPCE, share_efm_food_09$w_food, xlab="ln_MPCE", ylab="w_food", cex=.1)
lines(share_efm_food_09$ln_MPCE, fitted(model.np), lty=1, col="blue")
lines(share_efm_food_09$ln_MPCE, fitted(model.par), lty=1, col="red")
dev.off()

Upvotes: 0

Views: 2056

Answers (1)

eipi10

Reputation: 93871

The data are not sorted by the x-value, and lines() connects points in the order they appear, so the line jumps back and forth depending on where the next x-value happens to sit in the current row order of your data frame. Order the data by the x-value to get the curve you were expecting.

Here's an example with the built-in mtcars data frame:

m1 = lm(mpg ~ wt + I(wt^2), data=mtcars)

Plot data in default order:

with(mtcars, plot(wt, mpg))
lines(mtcars$wt, fitted(m1), col="blue")

Add a prediction line with data sorted by wt:

newdat = data.frame(wt=mtcars$wt, mpgpred=fitted(m1))
newdat = newdat[order(newdat$wt),]

lines(newdat, col="red", lwd=4)
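The same idea carries over to the two-model plot in the question. The following is only a sketch under stated assumptions: the data frame dat is simulated with made-up numbers as a stand-in for share_efm_food_09, and loess() stands in for npreg() so the code runs without the np package. Computing the ordering index once and applying it to x and to both fitted vectors keeps everything aligned.

```r
set.seed(1)

# Simulated stand-in for share_efm_food_09 (made-up numbers, illustration only)
n <- 500
dat <- data.frame(ln_MPCE = runif(n, 5, 9))
dat$w_food <- 1.2 - 0.1 * dat$ln_MPCE + rnorm(n, sd = 0.05)

# Quadratic fit, as in the question
model.par <- lm(w_food ~ ln_MPCE + I(ln_MPCE^2), data = dat)

# Assumption: loess() replaces npreg() here so the sketch has no extra dependency
model.smooth <- loess(w_food ~ ln_MPCE, data = dat)

# One ordering index, reused for x and for both sets of fitted values
ord <- order(dat$ln_MPCE)

plot(dat$ln_MPCE, dat$w_food, xlab = "ln_MPCE", ylab = "w_food", cex = 0.1)
lines(dat$ln_MPCE[ord], fitted(model.smooth)[ord], col = "blue")
lines(dat$ln_MPCE[ord], fitted(model.par)[ord], col = "red")
```

Indexing the fitted values with ord only lines up if no rows were dropped during fitting. If the variables contain missing values, the fitted vector can be shorter than the data frame, and it is safer to fit and plot on the complete cases.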

Rather than using fitted, you can also use predict, which returns predicted values from your model for any values of the independent variables you supply. Here, pass it the original data frame sorted by wt:

m1 = lm(mpg ~ wt + I(wt^2), data=mtcars)

# Row order that sorts the data by wt; reused for x and for newdata
ord = order(mtcars$wt)

with(mtcars, plot(wt, mpg))
lines(mtcars$wt[ord], predict(m1, newdata=mtcars[ord,]), col="red")
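Because predict accepts any new data, another option is to predict over an evenly spaced grid of x-values instead of the observed ones. This is a minimal sketch, not part of the original answer: the name grid and the choice of 200 points are arbitrary assumptions. The grid is sorted by construction, and with a very large dataset it draws the curve from a few hundred points rather than every observation.

```r
m1 <- lm(mpg ~ wt + I(wt^2), data = mtcars)

# Hypothetical grid: 200 evenly spaced wt values across the observed range
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 200))

# The column name must match the predictor name used in the model formula
grid$mpgpred <- predict(m1, newdata = grid)

with(mtcars, plot(wt, mpg))
lines(grid$wt, grid$mpgpred, col = "red", lwd = 2)
```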

Upvotes: 2
