Reputation: 490
I have the following example code:
library(mfp)
library(ggplot2)
duration <- sample(c(3, 5, 7, 10, 12, 14), 500, TRUE)
data <- data.frame(duration = duration,
                   score = -0.0125*duration^2 + 0.25*duration - 0.4 + rnorm(500, 0, 0.1))
mfp1 <- mfp(score ~ fp(duration), data)
ggplot(data, aes(x = duration, y = score)) +
  geom_point() +
  geom_smooth(method = "glm",
              formula = mfp1$formula)
Here I generate some example data, fit a fractional polynomial, and then plot the scatter plot with the fitted function on top. I get an error because the formula parameter in geom_smooth needs a formula in the form y ~ x, whereas the formula from mfp1 is score ~ I((duration/10)^1) + I((duration/10)^2).
Is there a way to convert the formula from mfp1 into a generic xy format for this use?
NB: this is example data. The real application will be within a package, so I need a general solution.
Upvotes: 1
Views: 217
Reputation: 174516
Inside geom_smooth, you can use any method that produces an object which has a predict method, so instead of method = "glm" you can use method = mfp directly, at least in theory.
The problem is that mfp doesn't accept a weights argument, but geom_smooth always passes one to the method, even if the weights vector is empty or isn't used. Normally this isn't a problem, because many modelling functions either have a weights argument or "soak up" unused arguments through a ... parameter in their definition. Unfortunately, mfp has neither. This means we need to define a small wrapper function that accepts arbitrary extra arguments and ignores them.
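To make the role of ... concrete, here is a minimal base-R sketch of the same situation. Note that strict_fit and lenient_fit are hypothetical names made up for this illustration (they are not part of mfp or ggplot2), and lm stands in for the real modelling function:

```r
# Hypothetical stand-in for a modelling function that has
# neither a `weights` argument nor a `...` parameter
strict_fit <- function(formula, data) {
  lm(formula, data)
}

# Wrapper that accepts any extra arguments through `...` and ignores them
lenient_fit <- function(formula, data, ...) {
  strict_fit(formula, data)
}

df <- data.frame(x = 1:10, y = 2 * (1:10) + 1)

# Calling the strict function with an extra `weights` argument fails;
# tryCatch() captures the error message instead of stopping the script
strict_result <- tryCatch(
  strict_fit(y ~ x, data = df, weights = NULL),
  error = function(e) conditionMessage(e)
)

# The wrapper discards `weights` and returns the fitted model
lenient_result <- lenient_fit(y ~ x, data = df, weights = NULL)
```

The wrapper trades safety for compatibility: any misspelled or unsupported argument is silently dropped, so it is best kept as small as possible.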
Note that in geom_smooth, the formula argument also has to be given in terms of x and y:
library(tidyverse)
library(mfp)

ggplot(data, aes(x = duration, y = score)) +
  geom_point() +
  # Wrapper: `...` absorbs the weights argument that mfp() cannot accept
  geom_smooth(method = function(formula, data, ...) mfp(formula, data),
              formula = y ~ fp(x))
Note that you don't need to fit the model yourself beforehand to draw the plot, since geom_smooth refits it on the plotted data each time the plot is drawn.
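If you do want to reuse a formula from an existing fit, as the question literally asks, one possible approach is to rename its variables with base R's substitute. This is only a sketch and not something I have tested inside a package. The formula below is copied from the question, and rename_map and xy_formula are names invented for this example:

```r
# Formula as reported in the question for mfp1$formula
fitted_formula <- score ~ I((duration/10)^1) + I((duration/10)^2)

# Map the original variable names to the generic x / y names
rename_map <- list(score = quote(y), duration = quote(x))

# substitute() does not evaluate its first argument, so do.call() is used
# to hand it the formula object itself rather than the symbol naming it
xy_formula <- as.formula(do.call(substitute, list(fitted_formula, rename_map)))

# Variables now appearing in the converted formula
xy_vars <- all.vars(xy_formula)
```

A formula converted this way keeps the powers chosen by the earlier fit fixed, so passing it to geom_smooth with method = "lm" would only re-estimate the coefficients, whereas the wrapper approach reruns the fractional polynomial selection.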
Data used
set.seed(1)
duration <- sample(c(3, 5, 7, 10, 12, 14), 500, TRUE)
score <- -0.0125 * duration^2 + 0.25 * duration - 0.4 + rnorm(500, 0, 0.1)
data <- data.frame(score, duration)
Created on 2022-09-15 with reprex v2.0.2
Upvotes: 1