Reputation: 35
I am trying to fit this data to a weibull distribution:
My x
and y
variables are:
y <- c(1, 1, 1, 4, 7, 20, 7, 14, 19, 15, 18, 3, 4, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24)
The plot looks like this:
I am looking for something like this: fitted plot
and I want to fit a weibull curve to it. I am using the nls function in R like this:
nls(y ~ ((a/b) * ((x/b)^(a-1)) * exp(- (x/b)^a)))
This function always throws up an error saying:
Error in numericDeriv(form[[3L]], names(ind), env) :
Missing value or an infinity produced when evaluating the model
In addition: Warning message:
In nls(y ~ ((a/b) * ((x/b)^(a - 1)) * exp(-(x/b)^a))) :
No starting values specified for some parameters.
Initializing ‘a’, ‘b’ to '1.'.
Consider specifying 'start' or using a selfStart model
So first I tried different starting values without any success. I cannot understand how to make a "good" guess at the starting values.
Then I went with the SSweibull(x, Asym, Drop, lrc, pwr)
function which is a selfStart function. Now the SSWeibull function expects values for Asym,Drop,lrc and pwr and I don't have any clue as to what those values might be.
I would appreciate if someone could help me figure out how to proceed.
Background of the data: I have taken some data from bugzilla and my "y" variable is number of bugs reported in a particular month and "x" variable is the month number after release.
Upvotes: 3
Views: 3526
Reputation: 32416
You might consider modifying your formula to better fit the data. For example, you could add an intercept (since your data flatlines at 1 instead of 0, which the model wants to do) and a scalar multiplier since you aren't actually fitting a density.
It is always worth spending some time really thinking about what parameters make sense, because model fitting procedures are often quite sensitive to initial estimates. You can also do a grid-search where you come up with ranges of possible parameters and try fitting the model with the various combinations using error catching functions. nls2
has an option to do this for you.
So, for example,
## Put the data in a data.frame
dat <- data.frame(x=x, y=y)
## Try some possible parameter combinations
library(nls2)
pars <- expand.grid(a=seq(0.1, 100, len=10),
b=seq(0.1, 100, len=10),
c=1,
d=seq(10, 100, len=10))
## brute-force them
## note the model has changed slightly
res <- nls2(y ~ d*((a/b) * ((x/b)^(a-1)) * exp(- (x/b)^a)) + c, data=dat,
start=pars, algorithm='brute-force')
## use the results with another algorithm
res1 <- nls(y ~ d*((a/b) * ((x/b)^(a-1)) * exp(- (x/b)^a)) + c, data=dat,
start=as.list(coef(res)))
## See how it looks
plot(dat, col="steelblue", pch=16)
points(dat$x, predict(res), col="salmon", type="l", lwd=2)
Not perfect, but it's a start.
Upvotes: 4