Christoph

Reputation: 7063

ggplot2 geom_line magic with base R?

Is there a way to get the same result as geom_line with base R? It feels like this should be easy, but when I tried to understand what geom_line is doing (and how), I got lost in the code. (It should be possible to automate this for an arbitrary number of "lines", not only 2.)

Background: I would like to display the "two lines from the fit" as in the code below, but I have not been successful. Any ideas?

Reproducible example:

library(ggplot2)
set.seed(1)
sd_age <- 1000
age <- sample(c(20:65), 24)
s_a1 <- 80000 + 100 * age[1:8]
s_a2 <- 70000 + 100 * age[9:24]
df <- data.frame(salary = c(s_a1, s_a2),
                 dep = c(rep("A1", length(s_a1)),rep("A2", length(s_a2))),
                 age = c(age[1:8], age[9:24]),
                 gender = c(0, 1, 0, 1, 0, 1, 0, 1,
                            1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1),
                 stringsAsFactors = FALSE)
df$gender <- as.factor(df$gender)
df$dep <- as.factor(df$dep)
df$salary <- df$salary + rnorm(nrow(df), 0, sd_age)
fit2 <- lm(salary ~ age + dep, data = df)
df$fit2 <- predict(fit2)

ggplot(df, aes(x = age, y = salary, shape = dep, colour = gender, fill = dep)) +
  geom_point(size = 3) +
  xlab("age") +
  ylab("salary") +
  ggtitle("whatever") +
  geom_line(data = df, 
            mapping = aes(x = age, y = fit2), size = 1.2, color = "blue")


The best I have managed so far is:

plot(df$age[df$gender == 0], df$salary[df$gender == 0],
     xlim = c(18, 67), ylim = c(60000, 100000)) # men
points(df$age[df$gender == 1], df$salary[df$gender == 1], 
       col = "blue") # women
lines(df$age, df$fit2, col = "blue")

Upvotes: 0

Views: 57

Answers (1)

d.b

Reputation: 32548

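Subset the data for each level of dep and draw each group's fitted line separately, ordering the points by age so that lines() connects them from left to right: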

with(df, plot(age, salary,
              col = ifelse(gender == 0, "red", "blue"),
              pch = ifelse(gender == 0, 19, 15)))
for (grp in unique(df$dep)) {
    with(df[df$dep == grp,], lines(sort(age), fit2[order(age)], col = "blue"))
}
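
Since the question asks for an arbitrary number of lines, the same idea can be wrapped in a small helper. This is only a sketch: add_group_lines is a hypothetical name (not a base R or ggplot2 function), and the three-group toy data is made up for illustration.

```r
# Hypothetical helper (not part of base R or ggplot2): draws one line per
# level of `group`, ordering each group by `x` so lines() does not zigzag.
add_group_lines <- function(x, y, group, ...) {
  pieces <- split(data.frame(x = x, y = y), group, drop = TRUE)
  for (piece in pieces) {
    o <- order(piece$x)
    lines(piece$x[o], piece$y[o], ...)
  }
  invisible(names(pieces))  # names of the groups that were drawn
}

# Made-up data with three departments, only to show more than two lines
set.seed(1)
toy <- data.frame(age = sample(20:65, 30),
                  dep = factor(rep(c("A1", "A2", "A3"), each = 10)))
toy$salary <- 70000 + 5000 * as.integer(toy$dep) + 100 * toy$age +
  rnorm(nrow(toy), 0, 1000)
toy$fit <- predict(lm(salary ~ age + dep, data = toy))

# Points first, then one fitted line per department
plot(toy$age, toy$salary, pch = as.integer(toy$dep),
     xlab = "age", ylab = "salary")
drawn <- add_group_lines(toy$age, toy$fit, toy$dep, col = "blue", lwd = 2)
```

With the df from the question, the call would be add_group_lines(df$age, df$fit2, df$dep, col = "blue") after the plot() call.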


Upvotes: 2
