firmo23
firmo23

Reputation: 8404

Non finite value error when trying to use drm() to fit logistic regression model to the data

I have the dataframe below

d50<-structure(list(VOT = c("VOT 0", "VOT 5", "VOT 10", "VOT 15", 
"VOT 20", "VOT -10", "VOT -20", "VOT -30", "VOT -40", "VOT -50", 
"VOT -60", "VOT 0", "VOT 5", "VOT 10", "VOT 15", "VOT 20", "VOT -10", 
"VOT -20", "VOT -30", "VOT -40", "VOT -50", "VOT -60", "VOT 0", 
"VOT 5", "VOT 10", "VOT 15", "VOT 20", "VOT -10", "VOT -20", 
"VOT -30", "VOT -40", "VOT -50", "VOT -60", "VOT 0", "VOT 5", 
"VOT 10", "VOT 15", "VOT 20", "VOT -10", "VOT -20", "VOT -30", 
"VOT -40", "VOT -50", "VOT -60", "VOT 0", "VOT 5", "VOT 10", 
"VOT 15", "VOT 20", "VOT -10"), response = c(0, 0, 0, 0, 0, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1)), row.names = c(NA, -50L), class = c("tbl_df", "tbl", "data.frame"
))

and Im trying to

library(drc)

# Fit a logistic regression model to the data
fit <- drm(response ~ VOT, data = d50, fct = LL.4())

but I get:

Error in optim(startVec, opfct, hessian = TRUE, method = optMethod, control = list(maxit = maxIt,  : 
  non-finite value supplied by optim
Error in drmOpt(opfct, opdfct1, startVecSc, optMethod, constrained, warnVal,  : 
  Convergence failed

Upvotes: 0

Views: 292

Answers (1)

Ben Bolker
Ben Bolker

Reputation: 226087

You have at least two problems:

  1. Your predictor variable must be numeric. Try
d50x <- transform(d50, VOT = as.numeric(gsub("VOT ", "", VOT)))
  1. This is a very limited data set:
with(d50x, table(VOT, response))
library(ggplot2)
ggplot(d50x, aes(VOT, response)) + stat_sum() + 
    geom_smooth(method = "glm", method.args = list(family = binomial))

enter image description here

It is possible to fit a logistic regression to this:

glm(response ~ VOT, data = d50x, family = binomial)

I would think that the right model would be

fit <- drm(response ~ exp(VOT), data = d50x, fct = LL.2(), type = "binomial")

but I haven't gotten that working yet either ... (posting anyway in case it gives you or someone else a head start).

  1. The most subtle problem is that drm can't handle the wide scale of your data. It expects to get dosages on an original scale and fits a log-logistic model. I don't know what VOT is, but its range is from -60 to +20, which if we exponentiate gives a very wide range (and we can't log-transform, since it includes negative numbers).

I can get drm to fit with LL.2() by scaling, i.e. making exp(VOT/10) the response variable. This almost matches the results of a binomial GLM, but not quite, and the binomial GLM is better; if I re-fit using starting conditions derived from the GLM fit, I get an improved drm fit. (Code below.)

If you're interested in estimating the ED50 from this data set, I have some bad news - even with the better fit, the estimate is 26.4 with a standard error of 42, hence your 95% confidence range is approximately (-58, 110) ... although plotting the GLM fit (see below) suggests this is an overestimate, and the lower bound on the ED50 may be near 15 ...

gfit <- glm(response ~ I(VOT/10), data = d50x, family = binomial)
## gfit: prob = plogis(b0 + b1*x)
## fit: prob = plogis(b2*(b3-x))

## b2 = -b1
## b3 = -b0/b1

## fit with default start
fit <- drm(response ~ exp(VOT/10), data = d50x,
           fct = LL.2(), type = "binomial")


gc <- coef(gfit)
fit2 <- drm(response ~ exp(VOT/10), data = d50x,
            fct = LL.2(), type = "binomial",
            start = c(b=-gc[2], e = -gc[1]/gc[2]))

## stuff for plotting
VOT_vec <- seq(min(d50x$VOT), max(d50x$VOT), length = 51)
dp <- data.frame(x = exp(d50x$VOT/10), y = predict(fit))
dp <- unique(dp)
dp <- dp[order(dp$x),]

## plot
plot(fit2)

lines(exp(VOT_vec/10),
      predict(gfit, newdata = data.frame(VOT = VOT_vec),
              type = "response"),
      lty = 2, col = 2, lwd = 2)

lines(dp$x, dp$y,col = 4)
legend("bottomleft", c("drm (GLM start)", "GLM", "drm (default start)",
                       "data"),
       lty = c(1, 2, 1, NA),
       lwd = c(1, 2, 1, NA),
       pch = c(NA, NA, NA, 1),
       col = c(1, 2, 4, 1))

enter image description here

Plot glm fit with extended range:

ggplot(d50x, aes(VOT, response)) + stat_sum() +
    expand_limits(x = 50) +
    geom_smooth(method = "glm", method.args = list(family = binomial),
                fullrange = TRUE)

enter image description here

Upvotes: 2

Related Questions