jackahall

Reputation: 490

Plotting mfp formula in geom_smooth

I have the following example code:

library(mfp)
library(ggplot2)

duration <- sample(c(3, 5, 7, 10, 12, 14), 500, TRUE) 
data <- data.frame(duration = duration, 
                   score = -0.0125*duration^2 + 0.25*duration - 0.4 + rnorm(500, 0, 0.1))

mfp1 <- mfp(score ~ fp(duration), data)

ggplot(data, aes(x = duration, y = score)) +
  geom_point() +
  geom_smooth(method = "glm",
              formula = mfp1$formula)

Here I generate some example data, fit a fractional polynomial, and then draw a scatter plot with the fitted function on top. I get an error because the formula parameter in geom_smooth needs a formula in the form y ~ x, whereas the formula from mfp1 is score ~ I((duration/10)^1) + I((duration/10)^2).

Is there a way to convert the formula from mfp1 into a generic xy format for this use?

NB: this is example data. The real application will be inside a package, so I need a general solution.

Upvotes: 1

Views: 217

Answers (1)

Allan Cameron

Reputation: 174516

Inside geom_smooth, method can be any function that returns an object with a predict method, so instead of method = "glm" you can, at least in theory, pass method = mfp directly.

The problem is that mfp doesn't accept a weights argument, but geom_smooth always passes one to the method, even if the weights vector is empty or unused. Normally this isn't a problem, because most modelling functions either have a weights argument or "soak up" unused arguments through a ... parameter in their definition. Unfortunately mfp has neither. This means we need to define a little wrapper function that accepts arbitrary extra arguments and ignores them.
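To see why the wrapper matters, here is a minimal base R sketch of the same situation. The names strict_fit, lenient_fit and toy are made up for illustration and are not part of mfp or ggplot2; lm simply stands in for a fitter that has no weights argument and no ... parameter.

```r
# Hypothetical fitter: like mfp, it has neither a weights argument nor "..."
strict_fit <- function(formula, data) lm(formula, data = data)

# Hypothetical wrapper: "..." absorbs any extra arguments and discards them
lenient_fit <- function(formula, data, ...) strict_fit(formula, data)

toy <- data.frame(x = 1:10, y = 2 * (1:10) + 1)

# Mimic the call shape geom_smooth uses, which always includes weights.
# The strict version fails; the error message is captured as a string.
strict_result <- tryCatch(
  strict_fit(y ~ x, data = toy, weights = NULL),
  error = function(e) conditionMessage(e)
)

# The wrapped version accepts the same call and returns a fitted model
lenient_result <- lenient_fit(y ~ x, data = toy, weights = NULL)

print(strict_result)
print(class(lenient_result))
```

The wrapper silently drops weights, so this approach is only appropriate when no weight aesthetic is being used in the plot.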

Note that in geom_smooth, the formula argument also has to be given in terms of x and y:

library(tidyverse)
library(mfp)

ggplot(data, aes(x = duration, y = score)) +
  geom_point() +
  geom_smooth(method = function(formula, data, ...) mfp(formula, data), 
              formula = y ~ fp(x))


Note that you don't need to fit the model yourself (mfp1 in your example) to draw the plot, since geom_smooth refits the model on the plotted data each time the plot is drawn.


Data used

set.seed(1)
duration <- sample(c(3, 5, 7, 10, 12, 14), 500, TRUE)
score    <- -0.0125 * duration^2 + 0.25 * duration - 0.4 + rnorm(500, 0, 0.1)
data     <- data.frame(score, duration)

Created on 2022-09-15 with reprex v2.0.2

Upvotes: 1
