krnbatta

Reputation: 489

How to draw original function, data points and linear regression curve on the same plot with R?

I have created some dummy data with x and y values. Each x is a random value between 0 and 2*pi, and each y is sin(x) plus or minus noise, where the noise is a random value between 0 and 0.5.

I have fitted a linear regression (a cubic polynomial in xs) with this call: fit <- lm(ys ~ xs + I(xs^2) + I(xs^3)). I am able to draw just the original sine function and the points with the following code:

plot(sin, 0, 2*pi,col="green",xlim = c(-0.5, 6.5), ylim = c(-1.5, 1.5))
points(xs, ys,col="blue")

I also want to add the fitted curve to the same plot. I did a bit of research and came up with the following code:

library(ggplot2)
ggplot(x = xs) + 
  stat_function(fun=sin, geom="line", col="green") +
  geom_point(aes(x = xs, y = ys), col="blue") +
  stat_smooth(method = "lm", formula = ys ~ xs + I(xs^2) + I(xs^3), col="red")

But it only plots the points. How can I draw the original function, the data points and the fitted regression curve on the same plot in R?

Here is the whole code:

xs <- c(0, 2*pi)
ys <- c(runif(1,0,0.5), -runif(1,0,0.5))
for(i in 1:20){
  x <- runif(1, 0, 2*pi)
  y <- sin(x) 
  noise <- runif(1,0,0.5)
  if(i%%2 == 0){
    y <- y + noise
  }
  else{
    y <- y - noise
  }
  xs <- c(xs, x)
  ys <- c(ys, y)
}
data <- data.frame(xs, ys)
fit <- lm(ys ~ xs + I(xs^2) + I(xs^3))

#plot(sin, 0, 2*pi,col="green",xlim = c(-0.5, 6.5), ylim = c(-1.5, 1.5))
#points(xs, ys,col="blue")
#abline(fit)

library(ggplot2)

ggplot(x = xs) + 
  stat_function(fun=sin, geom="line", col="green") +
  geom_point(aes(x = xs, y = ys), col="blue") +
  stat_smooth(method = "lm", formula = ys ~ xs + I(xs^2) + I(xs^3), col="red")

Upvotes: 0

Views: 382

Answers (2)

teunbrand

Reputation: 38003

One of the lesser-known use cases for the stat_function() layer is that you can plug in an anonymous function that predicts from the linear model you have already fitted.

ggplot(x = xs) + 
  stat_function(fun=sin, geom="line", col="green") +
  geom_point(aes(x = xs, y = ys), col="blue") +
  stat_function(fun = function(x){predict(fit, data.frame(xs = x))}, col = "red")
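
stat_function() evaluates fun on a grid of x values, so the function has to accept a numeric vector and return one prediction per element. Below is a minimal sketch of that check in base R. The seed, the simplified noise and the helper name predict_fit are assumptions made for illustration, not part of the question's code.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Simulated data in the spirit of the question (simplified noise scheme)
xs <- runif(22, 0, 2 * pi)
ys <- sin(xs) + runif(22, -0.5, 0.5)
fit <- lm(ys ~ xs + I(xs^2) + I(xs^3))

# Hypothetical helper: wraps predict() so it takes a plain numeric vector.
# The newdata column must be called xs, matching the variable in the formula.
predict_fit <- function(x) predict(fit, newdata = data.frame(xs = x))

# Evaluate on a regular grid, as stat_function() would do internally
grid <- seq(0, 2 * pi, length.out = 101)
yhat <- predict_fit(grid)

length(yhat) == length(grid)
```

If the column in newdata were named x instead of xs, predict() would not find the variable it needs in newdata, so the name has to match the one used when the model was fitted.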

Your attempt was very close, but the stat_smooth() layer needs its own x and y aesthetics, and the formula has to be written in terms of those aesthetics (y and x) rather than the original variable names (ys and xs).

ggplot(x = xs) + 
  stat_function(fun=sin, geom="line", col="green") +
  geom_point(aes(x = xs, y = ys), col="blue") +
  stat_smooth(method = "lm", formula = y ~ x + I(x^2) + I(x^3), 
              aes(x = xs, y = ys),
              col="red")
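
The same prediction idea should also carry over to the base graphics calls from the question. abline() only draws straight lines, so the cubic fit is drawn with lines() over a dense grid instead. This is a sketch under assumptions: the seed, the simplified noise and the grid size of 200 are illustrative choices.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Simulated data in the spirit of the question (simplified noise scheme)
xs <- runif(22, 0, 2 * pi)
ys <- sin(xs) + runif(22, -0.5, 0.5)
fit <- lm(ys ~ xs + I(xs^2) + I(xs^3))

# Original function and data points, as in the question
plot(sin, 0, 2 * pi, col = "green", xlim = c(-0.5, 6.5), ylim = c(-1.5, 1.5))
points(xs, ys, col = "blue")

# Fitted cubic: predict on a dense grid and connect the predictions
grid <- seq(0, 2 * pi, length.out = 200)
fitted_curve <- predict(fit, newdata = data.frame(xs = grid))
lines(grid, fitted_curve, col = "red")
```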

Upvotes: 2

Ben Norris

Reputation: 5747

Here is a different approach. Modify your data.frame to have two more columns:

library(dplyr)
data <- data %>%
  mutate(sin_x = sin(xs), fit = predict(fit))

Now create your ggplot with three geom layers, one for ys, one for sin_x, and one for fit.

data %>%
  ggplot(aes(x = xs)) +
  geom_point(aes(y = ys)) + 
  geom_line(aes(y = sin_x), color = "red", size = 0.5) + 
  geom_line(aes(y = fit), color = "black", size = 1, linetype = 2)
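
One caveat with this approach is that sin_x and fit are only evaluated at the observed xs, so with about twenty points the lines can look jagged. A possible variation, sketched here under assumptions (the seed, the simplified noise and the grid_df name are illustrative), is to evaluate both curves on a dense grid held in a second data frame and pass that to the line layers:

```r
library(ggplot2)

set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# Simulated data in the spirit of the question (simplified noise scheme)
data <- data.frame(xs = runif(22, 0, 2 * pi))
data$ys <- sin(data$xs) + runif(22, -0.5, 0.5)
fit <- lm(ys ~ xs + I(xs^2) + I(xs^3), data = data)

# Dense grid for smooth curves; grid_df is a hypothetical name
grid_df <- data.frame(xs = seq(0, 2 * pi, length.out = 200))
grid_df$sin_x <- sin(grid_df$xs)
grid_df$fit <- predict(fit, newdata = grid_df)

# Points come from the observed data, lines from the grid
p <- ggplot(data, aes(x = xs)) +
  geom_point(aes(y = ys)) +
  geom_line(data = grid_df, aes(y = sin_x), color = "red") +
  geom_line(data = grid_df, aes(y = fit), color = "black", linetype = 2)
print(p)
```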


Upvotes: 2
