Scaling for estimating scale and shape parameter with fitdist function (fitdistrplus package)

Question

As specified in the title I have a scaling problem with the fitdist function in R (fitdistrplus package).

Please have a look at the following code:

# Initialize arrays for storing result
fit_store_scale <- rep(NA, 3)
fit_store_shape <- rep(NA, 3)

# load data
data1 <- c(7.616593e-05, 5.313253e-05, 1.604328e-04, 6.482365e-05,
           4.217499e-05, 6.759114e-05, 3.531301e-05, 1.934228e-05,
           6.263665e-05, 8.796205e-06)
data2 <- c(7.616593e-06, 5.313253e-06, 1.604328e-05, 6.482365e-06,
           4.217499e-06, 6.759114e-06, 3.531301e-06, 1.934228e-06,
           6.263665e-06, 8.796205e-07)
data3 <- c(7.616593e-07, 5.313253e-07, 1.604328e-06, 6.482365e-07,
           4.217499e-07, 6.759114e-07, 3.531301e-07, 1.934228e-07,
           6.263665e-07, 8.796205e-08)
# form data frame
data <- data.frame(data1, data2, data3)

# set scaling factor
scaling <- 1        #works without warnings and errors at:    
                    #10000 (data1), 100000 (data2) or
                    #1000000 (data3)

# store scale and shape parameter of data1, data2 and data3 in Array
for(i in 1:3)
{
    fit.w1 <- fitdist(data[[i]]*scaling,"weibull", method = "mle")
    fit_store_scale[i] <- fit.w1$estimate[[2]]*1/scaling
    #1/scaling is needed for correcting scale parameter
    fit_store_shape[i] <- fit.w1$estimate[[1]]
}

I have three vectors of data, which are stored in a data frame. Now I want to use the fitdist function for estimating the scale and shape parameter separately for each column of data (data1, data2 and data3) and finally storing them in fit_store_scale and fit_store_shape respectively.

The problem here is that the fitdist function doesn't work without an appropriate scaling factor and that data1, data2 and data3 need different factors. I am looking for a solution to determine an optimal scaling factor automatically for each column of data and so getting the fitdist function to work in the end.

Ben Bolker · Accepted Answer

If you're not absolutely wedded to fitdist, you could use something a little bit more robust -- the following fits the Weibull with the parameters on the log scale, and uses Nelder-Mead rather than a gradient-based approach. It doesn't seem to have any problems fitting these data.

dd <- data.frame(data1,data2,data3)
library("bbmle")
fx <- function(x) {
    m1 <- mle2(y~dweibull(shape=exp(logshape),scale=exp(logscale)),
           data=data.frame(y=x),start=list(logshape=0,logscale=0),
           method="Nelder-Mead")
    exp(coef(m1))
}
t(sapply(dd,fx))  ## not quite the output format you asked for,
                  ##  but easy enough to convert.
##       logshape     logscale
## data1 1.565941 6.589057e-05
## data2 1.565941 6.589054e-06
## data3 1.565941 6.589055e-07

This approach should work reasonably well for any distribution for which you have a standard distribution (d*()) function.

Scaling for estimating scale and shape parameter with fitdist function (fitdistrplus package)

Answers (2)

Related Questions