Reputation: 115
For a very basic demonstration, I'm trying to show that a linear model on the log-transformed predictor is the best fit for a given set of data. To demonstrate that, I want to compare it with a standard lm and a square-root transform, and show graphically that the log-transformed model fits better than the other two. The question is: how do I draw multiple overlapping lm lines in one plot? It would also be great if I could label them.
Here is sample data (generated from a logarithmic relationship) with a starter ggplot:
library(tidyverse)
p=runif(100,1,100)
q=6+3*log(p)+rnorm(100)
sample <- data.frame(p,q)
ggplot(data = sample) +
  geom_point(mapping = aes(x = p, y = q))
Upvotes: 1
Views: 775
Reputation: 227081
This doesn't handle the labeling (you could use annotate() to add labels manually), but:
gg0 <- ggplot(data = sample, aes(x = p, y = q)) + geom_point()
gg0 + geom_smooth(method = "lm", formula = y ~ x) +
  geom_smooth(method = "lm", formula = y ~ log(x), colour = "red") +
  geom_smooth(method = "lm", formula = y ~ sqrt(x), colour = "purple")
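For the labels, one option is to map colour to a constant string inside aes() in each layer, so that ggplot2 builds a legend for the three fits. A minimal sketch, assuming the data from the question (rebuilt here as `dat`, with a seed added only for reproducibility); the legend title, the colour values, and se = FALSE are illustrative choices, not part of the original answer:

```r
library(ggplot2)

# Simulated data as in the question; the seed is an addition for reproducibility
set.seed(1)
dat <- data.frame(p = runif(100, 1, 100))
dat$q <- 6 + 3 * log(dat$p) + rnorm(100)

# A constant string inside aes() becomes a legend entry for that layer
gg_labelled <- ggplot(dat, aes(x = p, y = q)) +
  geom_point() +
  geom_smooth(aes(colour = "linear"), method = "lm", formula = y ~ x, se = FALSE) +
  geom_smooth(aes(colour = "log"), method = "lm", formula = y ~ log(x), se = FALSE) +
  geom_smooth(aes(colour = "sqrt"), method = "lm", formula = y ~ sqrt(x), se = FALSE) +
  # Named values tie each legend label to a colour
  scale_colour_manual(name = "Transform",
                      values = c(linear = "blue", log = "red", sqrt = "purple"))

gg_labelled
```

Dropping se = FALSE brings back the confidence bands, at the cost of a busier plot when three fits overlap.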
Upvotes: 1
Reputation: 5923
You could compute the lines yourself, e.g. like this:
# Make a tibble containing name of transform and the actual function
transforms <- tibble(Transform = c("log", "sqrt", "linear"),
                     Function = list(log, sqrt, function(x) x))
# Compute the regression coefs and turn it into a tidy table
lm_df <- transforms %>%
  group_by(Transform) %>%
  group_modify(~ {
    lm(q ~ .x$Function[[1]](p), data = sample) %>%
      broom::tidy() %>%
      select(term, estimate) %>%
      pivot_longer(estimate) %>%
      mutate(Function = .x$Function)
  })
> lm_df
# A tibble: 6 x 5
# Groups:   Transform [3]
  Transform term                name     value   Function
  <chr>     <chr>               <chr>    <dbl>   <list>
1 linear    (Intercept)         estimate 12.6    <fn>
2 linear    .x$Function[[1]](p) estimate  0.0834 <fn>
3 log       (Intercept)         estimate  5.89   <fn>
4 log       .x$Function[[1]](p) estimate  2.99   <fn>
5 sqrt      (Intercept)         estimate  9.35   <fn>
6 sqrt      .x$Function[[1]](p) estimate  1.11   <fn>
# Evaluate the functions at different x values
# (the grid starts at 0, where log(0) is -Inf; starting it at 1 would avoid that)
lm_df <- lm_df %>%
  pivot_wider(names_from = term, values_from = value) %>%
  rename("Intercept" = `(Intercept)`, "Slope" = `.x$Function[[1]](p)`) %>%
  group_modify(~ {
    tibble(
      y = .x$Intercept + .x$Slope * .x$Function[[1]](seq(0, max(sample$p))),
      x = seq(0, max(sample$p))
    )
  })
> lm_df
# A tibble: 300 x 3
# Groups:   Transform [3]
   Transform     y     x
   <chr>     <dbl> <int>
 1 linear     12.6     0
 2 linear     12.7     1
 3 linear     12.8     2
 4 linear     12.9     3
 5 linear     12.9     4
 6 linear     13.0     5
 7 linear     13.1     6
 8 linear     13.2     7
 9 linear     13.3     8
10 linear     13.4     9
# ... with 290 more rows
# Plot the functions
ggplot() +
  geom_point(data = sample, mapping = aes(x = p, y = q)) +
  geom_line(data = lm_df, aes(x = x, y = y, color = Transform))
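A possibly shorter route to the same kind of plot is to fit the three models directly and let predict() evaluate them on a grid, instead of rebuilding the lines from the coefficients. This is a sketch under assumptions, not the approach above: the names `dat`, `fits`, `grid` and `pred_df` are made up for illustration, the seed is added for reproducibility, and the grid starts at min(p) so that log() stays finite.

```r
library(ggplot2)

# Simulated data as in the question; the seed is an addition for reproducibility
set.seed(1)
dat <- data.frame(p = runif(100, 1, 100))
dat$q <- 6 + 3 * log(dat$p) + rnorm(100)

# One fitted model per transform
fits <- list(
  linear = lm(q ~ p, data = dat),
  log    = lm(q ~ log(p), data = dat),
  sqrt   = lm(q ~ sqrt(p), data = dat)
)

# Grid over the observed range of p (min(p) > 0, so log() is finite everywhere)
grid <- data.frame(p = seq(min(dat$p), max(dat$p), length.out = 200))

# Stack the predictions from each model into one long data frame
pred_df <- do.call(rbind, lapply(names(fits), function(nm) {
  data.frame(Transform = nm, p = grid$p, q = predict(fits[[nm]], newdata = grid))
}))

# Points from the raw data, one coloured line per transform (with a legend)
ggplot(dat, aes(x = p, y = q)) +
  geom_point() +
  geom_line(data = pred_df, aes(colour = Transform))
```

Because the transform lives in each model formula, predict() applies it to the grid automatically, so no coefficient renaming is needed.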
Upvotes: 3