Reputation: 655
I have data that are strictly increasing and would like to fit a smoothing spline that is monotonically increasing as well with the smooth.spline()
function if possible, due to the ease of use of this function.
For example, my data can be effectively reproduced with the example:
testx <- 1:100
testy <- abs(rnorm(length(testx)))^3
testy <- cumsum(testy)
plot(testx,testy)
sspl <- smooth.spline(testx,testy)
lines(sspl,col="blue")
which is not necessarily increasing everywhere. Any suggestions?
Upvotes: 13
Views: 11835
Reputation: 8019
You could use shape-constrained splines for this, e.g. using the scam
package:
require(scam)
fit = scam(testy~s(testx, k=100, bs="mpi", m=5),
family=gaussian(link="identity"))
plot(testx,testy)
lines(testx,predict(fit),col="red")
Or if you would like to use L1 loss as opposed to L2 loss, which is less sensitive to outliers, you could also use the cobs
package for this...
Advantage of this method compared to the solution above is that it also works if the original data perhaps are not 100% monotone due to the presence of noise...
Upvotes: 6
Reputation: 1138
This doesn't use smooth.spline()
but the splinefun(..., method="hyman")
will fit a monotonically increasing spline and is also easy to use. So for example:
testx <- 1:100
testy <- abs(rnorm(length(testx)))^3
testy <- cumsum(testy)
plot(testx,testy)
sspl <- smooth.spline(testx,testy)
lines(sspl,col="blue")
tmp <- splinefun(x=testx, y=cumsum(testy), method="hyman")
lines(testx[-1], diff(tmp(testx)), col="red")
Yields the following figure (red are the values from the monotonically increasing spline)
From the help file of splinefun
: "Method "hyman" computes a monotone cubic spline using Hyman filtering of an method = "fmm" fit for strictly monotonic inputs. (Added in R 2.15.2.)"
Upvotes: 13
Reputation: 1259
I would suggest using loess
for this type of monotonically increasing function.
Examining spline's derivative we see that it is negative and non-trivial in some cases:
> plot(testx,testy)
> sspl <- smooth.spline(testx,testy)
> min(diff(sspl$y))
[1] -0.4851321
If we use loess, I think this problem will be less severe.
d <- data.frame(testx,testy)
fit.lo <- loess(testy ~ testx,data=d)
lines(fit.lo$x,fit.lo$y)
Then checking the derivative we get:
> min(diff(fit.lo$y))
[1] 1.151079e-12
Which is essentially 0. At near 0, we sometimes get a trivially small negative value.
Here is an example of the above loess fit.
Not sure if this will hold in all cases but it seems to do a better job than spline.
Upvotes: 0