Reputation: 312
When I use an object that stores the link function for the MASS package's glm.nb
function, the function errors out. However, when I just hard-code the link function, it works fine. I think this has to be due to some sort of environmental referencing gone wrong in their source code, but wanted to be certain before emailing the authors. So, any insight is appreciated here.
MWE
# Simulate some data
set.seed(666)
n = 1000
x = cbind(1, rnorm(n))
b = c(1,-2)
lam = exp(x%*%b)
y = rnbinom(n, size=.3, mu = lam)
# get package
if(!("MASS" %in% rownames(installed.packages()))
install.packages("MASS")
library(MASS)
Attempt 1 -- Define link as parameter default
# Define some wrapper function around glm.nb
mfn = function(y, x, mylink = "log"){
MASS::glm.nb(y ~ x - 1, link=mylink)
}
# Call mfn on simulated data
mfn(y, x)
> Error in poisson(link = mylink) : object 'mylink' not found
# Call with explicit link in mfn
mfn(y = y, x = x, mylink="log")
> Error in poisson(link = mylink) : object 'mylink' not found
Attempt 2 -- Modifying mfn
to define mylink
in function environment
# Redefine function
mfn = function(y, x){
mylink = "log"
MASS::glm.nb(y ~ x - 1, link=mylink)
}
# Call on simulated data
mfn(y = y, x = x)
> Error in poisson(link = mylink) : object 'mylink' not found
Attempt 3 Combine (1) and (2)
# Redefine mfn
mfn = function(y, x, mylink = "log"){
mylink = "log"
MASS::glm.nb(y ~ x - 1, link = mylink)
}
mfn(y = y, x = x, mylink="log")
> Error in poisson(link = mylink) : object 'mylink' not found
Just to be sure -- Hard-code link function
mfn = function(y, x){
mylink = "log"
MASS::glm.nb(y ~ x - 1, link="log")
}
mfn(y = y, x = x)
> Call: MASS::glm.nb(formula = y ~ x - 1, link = "log", init.theta = 0.2953628307)
Coefficients:
x1 x2
0.9415 -1.9165
Degrees of Freedom: 1000 Total (i.e. Null); 998 Residual
Null Deviance: 6191
Residual Deviance: 853.8 AIC: 4248
So its pretty clear that there is some sort of referencing issue going on, but I can't immediately find it in the source code nor do I know of a simpler way to test the issue. Any ideas?
Upvotes: 2
Views: 887
Reputation: 312
Okay so I've reached out to the MASS package maintainers, and Bill Venables kindly responded with a working solution for me. For completeness, I've pasted it below:
### negative binomial with parametric link function
ngb <- function(formula, data = list(), link) {
stopifnot(require(MASS))
thisCall <- substitute(MASS::glm.nb(formula = FORM, data = DATA, link = LINK),
list(FORM = substitute(formula),
DATA = substitute(data),
LINK = link))
eval.parent(thisCall)
}
library(MASS)
set.seed(5678)
N <- 1000
dummy_data <- within(data.frame(x = rnorm(N)), {
eta <- 1 - 2*x
mu <- exp(eta)
y <- rnegbin(eta, mu, theta = 3)
})
tst1 <- ngb(y ~ x, dummy_data, link = "log")
attach(dummy_data)
tst2 <- ngb(y ~ x, link = "log")
detach()
tst3 <- ngb(y ~ poly(x, 2), dummy_data, link = "log")
rbind(coef(tst1), coef(tst2))
summary(tst3)
Upvotes: 1
Reputation: 2594
Try the assignment operator (<-
) instead of the equals sign (=
) when you set the function to the name mfn
.
It appears that the MASS::glm.nb
function does some non-standard evaluation (NSE) on the link parameter. If you look at the documentation, the input is an unquoted log
, with other options as sqrt
and identity
. Again note the missing quotes around them.
Without getting too into NSE, I think the simplest solution is to use ...
as a parameter and just pass in the link argument again. This should still give you the flexibility to change your model and add more parameters if you so choose.
set.seed(666)
n = 1000
x = cbind(1, rnorm(n))
b = c(1,-2)
lam = exp(x%*%b)
y = rnbinom(n, size=.3, mu = lam)
mfn <- function(y, x, ...){
MASS::glm.nb(y ~ x - 1, ...)
}
# Call mfn on simulated data
mfn(y, x, link = log)
which outputs
Call: MASS::glm.nb(formula = y ~ x - 1, link = log, init.theta = 0.2953628307)
Coefficients:
x1 x2
0.9415 -1.9165
Degrees of Freedom: 1000 Total (i.e. Null); 998 Residual
Null Deviance: 6191
Residual Deviance: 853.8 AIC: 4248
As a side note in case you want to see how the MASS::glm.nb
function is doing this object manipulation, you can see it in the first few lines of the source code by typing in the function name without parentheses.
MASS::glm.nb
# function (formula, data, weights, subset, na.action, start = NULL,
# etastart, mustart, control = glm.control(...), method = "glm.fit",
# model = TRUE, x = FALSE, y = TRUE, contrasts = NULL, ...,
# init.theta, link = log)
# {
# loglik <- function(n, th, mu, y, w) sum(w * (lgamma(th +
# y) - lgamma(th) - lgamma(y + 1) + th * log(th) + y *
# log(mu + (y == 0)) - (th + y) * log(th + mu)))
# link <- substitute(link) <--- This is the line that is messing things up
# ...
Upvotes: 1