Cazk

Reputation: 41

Partial regression plot in R

I'm quite new to R and I would love to get some help with creating a partial regression plot for a research project.

Here is my full model:

model2 <- lm(scenarios_anger ~ 1 + scenarios + age + female + politics + relg + spirit, data=data)

My goal is to create a scatterplot that only presents the relationship between scenarios and scenarios_anger while holding all the other predictors constant.

After some asking around, I figured out that I need to 1) fit another model predicting scenarios_anger from all the other predictors apart from scenarios, and take the residuals from this model; and 2) fit a third model predicting scenarios from all the other predictors in the full model, and take the residuals from that model as well.
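As a sanity check on that two-step recipe, here is a minimal sketch on simulated data. The variables y, x and z and all their values are made up for illustration and are not from the research data. It shows that the slope from regressing one set of residuals on the other matches the coefficient on x in the full model, which is what a partial regression plot is meant to display.

```r
# Simulated data: y is the outcome, x the predictor of interest, z a covariate.
# All names and values here are hypothetical.
set.seed(42)
n <- 100
z <- rnorm(n)
x <- 0.6 * z + rnorm(n)
y <- 1 + 2 * x - z + rnorm(n)

# Full model with both predictors
full <- lm(y ~ x + z)

# Step 1: residuals of the outcome after removing the covariate
res_y <- residuals(lm(y ~ z))

# Step 2: residuals of the predictor of interest after removing the covariate
res_x <- residuals(lm(x ~ z))

# Regressing one set of residuals on the other recovers the full-model slope
partial <- lm(res_y ~ res_x)

coef(full)["x"]
coef(partial)["res_x"]
```

The two printed coefficients should agree up to floating-point error, so a scatterplot of res_y against res_x shows the relationship that the full model attributes to x.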

I have calculated the residuals as such:

model1 <- lm(scenarios_anger ~ 1 + age + female + politics + relg + spirit, data=data)

resid.model1 <- residuals(model1)

model2b <- lm(scenarios ~ 1 + age + female + politics + relg + spirit, data=data)

resid.model2b <- residuals(model2b)

The problem is that I can't seem to plug in my residual values into the ggplot function to create the scatterplot. I tried this command:

ggplot(data, aes(x = resid.model2b, y = resid.model1)) + geom_point(na.rm=T)

But I get this error message saying Error: Aesthetics must be either length 1 or the same as the data (786): x, y

I wonder if it's because my residuals are not in the right class for the ggplot function? How can I resolve this problem? Or is there another way to create my partial regression plot?

Upvotes: 4

Views: 8559

Answers (1)

Ben

Reputation: 1486

The problem here is that the residuals you want to plot are newly created vectors, not columns of the data frame data. ggplot requires every aesthetic to be either length 1 or the same length as the data it is given (786 rows here), and your residual vectors evidently have a different length, most likely because lm dropped rows with missing values when fitting. You should either add the residuals back into the data frame data or create a new data frame from the residuals for the purpose of your plot. For the latter method, you would do something like this:

# Create the residuals (same as in your question, with relg spelled consistently)
model1 <- lm(scenarios_anger ~ 1 + age + female + politics + relg + spirit, data = data)
resid.model1 <- residuals(model1)
model2b <- lm(scenarios ~ 1 + age + female + politics + relg + spirit, data = data)
resid.model2b <- residuals(model2b)

# Create a new data frame containing those residuals
# (data.frame() needs both residual vectors to have the same length)
NEWDF <- data.frame(RES1 = resid.model1, RES2b = resid.model2b)

# Now generate your plot using NEWDF as your data,
# referring to its columns RES2b and RES1 inside aes()
library(ggplot2)
ggplot(data = NEWDF, aes(x = RES2b, y = RES1)) + geom_point(na.rm = TRUE)

This should ensure that the variables referenced in aes are in the data object being called by ggplot and so it should solve your problem. If you are still getting an error please let me know.
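One thing to watch for, assuming missing values are the reason your residuals are shorter than data: if scenarios_anger and scenarios have missing values in different rows, the two models are fitted on different subsets and their residual vectors will not line up. Below is a minimal sketch of one way around that, using simulated stand-in data (all values are made up, only three of your predictors are included, and the names vars, cc and plot_df are hypothetical). It restricts both models to the same complete cases before taking residuals.

```r
# Simulated stand-in for the question's data; all values are hypothetical.
set.seed(1)
n <- 200
data <- data.frame(
  scenarios = rnorm(n),
  age       = rnorm(n, mean = 40, sd = 10),
  female    = rbinom(n, 1, 0.5)
)
data$scenarios_anger <- 0.5 * data$scenarios + 0.02 * data$age + rnorm(n)
data$scenarios_anger[1:5] <- NA  # some missing outcome values
data$age[6:8] <- NA              # some missing covariate values

# With the default na.action, lm() silently drops incomplete rows,
# so the residual vector is shorter than the data frame.
m_default <- lm(scenarios_anger ~ age + female, data = data)
c(rows = nrow(data), residuals = length(residuals(m_default)))

# Keep only rows that are complete on every variable in the full model
vars <- c("scenarios_anger", "scenarios", "age", "female")
cc <- data[complete.cases(data[, vars]), ]

# Fit both auxiliary models on the same rows and collect the residuals
model1  <- lm(scenarios_anger ~ age + female, data = cc)
model2b <- lm(scenarios ~ age + female, data = cc)
plot_df <- data.frame(RES1 = residuals(model1), RES2b = residuals(model2b))

# Plot only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(plot_df, aes(x = RES2b, y = RES1)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
  print(p)
}
```

Because both sets of residuals come from the same rows, the fitted line in this plot has the same slope as the scenarios coefficient in the full model fitted on those rows.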

Upvotes: 2
