StatsG0d

Reputation: 1

How to add a custom legend in ggplot2 in R

I want to plot a data set where the size of each point is proportional to the x-variable, together with a regression line and a 95% prediction interval. The sample code I have written is as follows:

  library(ggplot2)

  # Create random data and run regression
  x <- rnorm(40)
  y <- 0.5 * x + rnorm(40)
  plot.dta <- data.frame(y, x)
  mod <- lm(y ~ x, data = plot.dta)

  # Create values for prediction interval
  x.new <- data.frame(x = seq(-2.5, 2.5, length.out = 1000))
  pred <- predict(mod, newdata = x.new, interval = "prediction")
  pred <- data.frame(cbind(x.new, pred))

  # plot the data w/ regression line and prediction interval

  p <- ggplot(pred, aes(x = x, y = upr)) + 
    geom_line(aes(y = lwr), color = "#666666", linetype = "dashed") +
    geom_line(aes(y = upr), color = "#666666", linetype = "dashed") +
    geom_line(aes(y = fit)) + 
    geom_point(data = plot.dta, aes(y = y, size = x))
  p

This produces the following plot:
[Image: plot produced by ggplot]

Obviously, the legend is not too helpful here. I would like to have one entry in the legend for the points, say, labeled "data", one grey, dashed line labeled "95% PI" and one entry with a black line labeled "Regression line."

Upvotes: 0

Views: 2838

Answers (1)

Jake Kaupp

Reputation: 8072

As Hack-R alluded to in the provided link, you can set the breaks and labels for scale_size() to make that legend more meaningful.

You can also construct a legend for all your geom_line() calls by mapping linetype inside aes() and using scale_linetype_manual() to set the values, breaks and labels.

  ggplot(pred, aes(x = x, y = upr)) +
    geom_line(aes(y = lwr, linetype = "dashed"), color = "#666666") +
    geom_line(aes(y = upr, linetype = "dashed"), color = "#666666") +
    geom_line(aes(y = fit, linetype = "solid")) +
    geom_point(data = plot.dta, aes(y = y, size = x)) +
    # Explicit breaks keep the number of size labels in step with the legend keys
    scale_size(breaks = c(-2, -1, 0, 1, 2),
               labels = c("Eensy-weensy", "Teeny", "Small", "Medium", "Large")) +
    # Explicit breaks tie each label to its linetype key
    scale_linetype_manual(values = c("dashed" = 2, "solid" = 1),
                          breaks = c("dashed", "solid"),
                          labels = c("95% PI", "Regression Line"))
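If you want exactly the three entries asked for in the question ("Data", "95% PI", "Regression line"), one possible variation is sketched below. It is not part of the approach above, just an illustration under a few assumptions: the seed, the legend labels, the use of a constant shape mapping to create the "Data" entry, and hiding the size legend with guide = "none" are all my own choices. Mapping both linetype and colour to the same labels, with the same (empty) scale name, should let ggplot2 merge them into one legend so the grey dashed key and the black solid key show up correctly.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Same simulated data and model as in the question
x <- rnorm(40)
y <- 0.5 * x + rnorm(40)
plot.dta <- data.frame(y, x)
mod <- lm(y ~ x, data = plot.dta)

# Prediction interval on a grid of x values
x.new <- data.frame(x = seq(-2.5, 2.5, length.out = 1000))
pred <- data.frame(x.new, predict(mod, newdata = x.new, interval = "prediction"))

p <- ggplot(pred, aes(x = x)) +
  # Constant strings inside aes() create legend entries with those labels
  geom_line(aes(y = lwr, linetype = "95% PI", colour = "95% PI")) +
  geom_line(aes(y = upr, linetype = "95% PI", colour = "95% PI")) +
  geom_line(aes(y = fit, linetype = "Regression line", colour = "Regression line")) +
  # shape = "Data" adds a single legend entry for the points
  geom_point(data = plot.dta, aes(y = y, size = x, shape = "Data")) +
  scale_linetype_manual(name = NULL,
                        values = c("95% PI" = "dashed", "Regression line" = "solid")) +
  scale_colour_manual(name = NULL,
                      values = c("95% PI" = "#666666", "Regression line" = "black")) +
  scale_shape_manual(name = NULL, values = c("Data" = 16)) +
  # Points are still sized by x, but the size legend itself is hidden
  scale_size(guide = "none")
p
```

If you would rather keep the size legend, drop the scale_size(guide = "none") line and set breaks and labels there instead.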

Upvotes: 2
