Tuyen

Reputation: 1111

How to add new legends to complicated scatter plot using ggplot2

I built a simple linear regression model and used it to produce some predicted values. I would like to show them on the plot alongside the original data, but I do not know how to add a legend that labels the original observations as black and the new predicted values as red.

This example uses the mtcars dataset from the datasets package.

    library(ggplot2)
    library(datasets)
    library(broom)

    # Build a simple linear model of hp as a function of mpg
    m1 <- lm(hp ~ mpg, data = mtcars)

    # Predict hp for the new mpg values below
    new_mpg <- data.frame(mpg = c(23, 21, 30, 28))
    new_hp <- augment(m1, newdata = new_mpg)

    # Plot the new predicted values along with the original observations
    ggplot(data = mtcars, aes(x = mpg, y = hp)) +
      geom_point(color = "black") +
      geom_smooth(method = "lm", col = 4, se = FALSE) +
      geom_point(data = new_hp, aes(y = .fitted), color = "red")

[Image: scatter plot]

Upvotes: 4

Views: 720

Answers (2)

AK88

Reputation: 3026

Here is another way of doing it without dplyr:

    ggplot() +
      geom_point(data = mtcars, aes(x = mpg, y = hp, colour = "Obs")) +
      geom_point(data = new_hp, aes(x = mpg, y = .fitted, colour = "Pred")) +
      scale_colour_manual(name = "Type",
                          values = c("black", "red")) +
      geom_smooth(data = mtcars, aes(x = mpg, y = hp),
                  method = "lm", col = 4, se = FALSE)
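
One optional refinement, offered as a sketch rather than as part of the original answer: the unnamed colour vector works because ggplot2 assigns the values in the order of the labels, and "Obs" sorts before "Pred". Naming the elements of `values` ties each label to its colour explicitly, so renaming a label cannot silently swap the colours. The variable name `type_colours` is made up for illustration, and base R's `predict()` stands in for `augment()` so the block runs without broom.

```r
library(ggplot2)

# Refit the question's model and predict hp with base R's predict(),
# so this sketch does not depend on broom.
m1 <- lm(hp ~ mpg, data = mtcars)
new_hp <- data.frame(mpg = c(23, 21, 30, 28))
new_hp$.fitted <- predict(m1, newdata = new_hp)

# Names match the strings used inside aes(colour = ...), so each
# legend label is bound to its colour by name, not by position.
type_colours <- c(Obs = "black", Pred = "red")

p <- ggplot() +
  geom_point(data = mtcars, aes(x = mpg, y = hp, colour = "Obs")) +
  geom_point(data = new_hp, aes(x = mpg, y = .fitted, colour = "Pred")) +
  geom_smooth(data = mtcars, aes(x = mpg, y = hp),
              method = "lm", formula = y ~ x, col = 4, se = FALSE) +
  scale_colour_manual(name = "Type", values = type_colours)

print(p)
```

The legend entries are still "Obs" and "Pred"; only the way the colours are matched to them changes.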

Upvotes: 2

www

Reputation: 39174

Here is one idea. You can combine the predicted and observed data in the same data frame and then create the scatter plot to generate the legend. The following code is an extension of your existing code.

    # Prepare the dataset
    library(dplyr)

    new_hp2 <- new_hp %>%
      select(mpg, hp = .fitted) %>%
      # Add a label to show it is predicted data
      mutate(Type = "Predicted")

    dt <- mtcars %>%
      select(mpg, hp) %>%
      # Add a label to show it is observed data
      mutate(Type = "Observed") %>%
      # Combine predicted data and observed data
      bind_rows(new_hp2)

    # Plot the data
    ggplot(data = dt, aes(x = mpg, y = hp, color = factor(Type))) +
      geom_smooth(method = "lm", col = 4, se = FALSE) +
      geom_point() +
      scale_color_manual(name = "Type", values = c("black", "red"))
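
For readers who want the same combined-data-frame idea without dplyr or broom, here is a rough base R sketch. It is an assumption-laden variant, not the answer's own code: `observed` and `predicted` are illustrative names, `predict()` replaces `augment()`, and the factor levels are set explicitly so that the legend order and the order of the colours in `values` are fixed by the code rather than by alphabetical sorting.

```r
library(ggplot2)

m1 <- lm(hp ~ mpg, data = mtcars)
new_mpg <- data.frame(mpg = c(23, 21, 30, 28))

# One data frame per source, each tagged with a Type label.
observed <- data.frame(mpg = mtcars$mpg, hp = mtcars$hp, Type = "Observed")
predicted <- data.frame(mpg = new_mpg$mpg,
                        hp = unname(predict(m1, newdata = new_mpg)),
                        Type = "Predicted")

# Stack the two and fix the level order explicitly.
dt <- rbind(observed, predicted)
dt$Type <- factor(dt$Type, levels = c("Observed", "Predicted"))

# Unnamed values are assigned in level order: Observed, then Predicted.
p <- ggplot(dt, aes(x = mpg, y = hp, colour = Type)) +
  geom_smooth(method = "lm", formula = y ~ x, col = 4, se = FALSE) +
  geom_point() +
  scale_colour_manual(name = "Type", values = c("black", "red"))

print(p)
```

Reordering the `levels` vector would reorder the legend, and the colours in `values` would need to be reordered to match.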

Upvotes: 4
