Adrian

Reputation: 9793

How to specify a particular optimization method in glm

The documentation for glm describes its method argument as follows:

User-supplied fitting functions can be supplied either as a function or a character string naming a function, with a function which takes the same arguments as glm.fit. If specified as a character string it is looked up from within the stats namespace.
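So, for example, the default fitter can be named as a character string, since glm.fit lives in the stats namespace (a trivial check of my own, using built-in data):

## equivalent to the default call: "glm.fit" is resolved in the
## stats namespace
fit <- glm(am ~ wt, data = mtcars, family = binomial, method = "glm.fit")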

I would like to use a binomial non-negative least-squares optimizer so that the coefficients are non-negative and sum to 1. The SuperLearner package uses such an optimizer via the option method = "method.NNLS". Below is a reproducible example:

library(SuperLearner)
# binary outcome
set.seed(1)
N <- 200
X <- matrix(rnorm(N*10), N, 10)
X <- as.data.frame(X)
Y <- rbinom(N, 1, plogis(.2*X[, 1] + .1*X[, 2] - .2*X[, 3] +
                           .1*X[, 3]*X[, 4] - .2*abs(X[, 4])))

SL.library <- c("SL.glmnet", "SL.glm", "SL.knn", "SL.mean")

# least squares loss function
test.NNLS <- SuperLearner(Y = Y, X = X, SL.library = SL.library,
                          verbose = TRUE, method = "method.NNLS", family = binomial())
> test.NNLS

Call:  
SuperLearner(Y = Y, X = X, family = binomial(), SL.library = SL.library, method = "method.NNLS",  
    verbose = TRUE) 


                   Risk      Coef
SL.glmnet_All 0.2460486 0.0000000
SL.glm_All    0.2507033 0.2423697
SL.knn_All    0.2508500 0.3493301
SL.mean_All   0.2475494 0.4083002

Notice that the Coefs are non-negative and sum to 1. Is there a way to use the same optimizer with glm? Or a different optimizer that uses the same constraint? I've tried:

glm(Y ~ as.matrix(X), family = "binomial", method = "method.NNLS")

but it fails: method.NNLS does not take the same arguments as glm.fit, so glm cannot use it as a fitting method.
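For reference, glm does accept a user-supplied fitter as long as it has glm.fit's signature; a do-nothing wrapper (my own sketch, reusing Y and X from above) runs fine, which suggests the problem is method.NNLS's interface rather than the mechanism itself:

## a pass-through fitter with glm.fit's signature: glm accepts this
my.fit <- function(x, y, weights = rep(1, NROW(y)), start = NULL,
                   etastart = NULL, mustart = NULL,
                   offset = rep(0, NROW(y)), family = gaussian(),
                   control = list(), intercept = TRUE, singular.ok = TRUE) {
  glm.fit(x, y, weights = weights, start = start, etastart = etastart,
          mustart = mustart, offset = offset, family = family,
          control = control, intercept = intercept,
          singular.ok = singular.ok)
}
glm(Y ~ as.matrix(X), family = "binomial", method = my.fit)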

Upvotes: 2

Views: 485

Answers (1)

Ben Bolker

Reputation: 226097

One way (not the only way, and possibly not the most efficient or most compact) to solve this problem is to do general maximum-likelihood estimation on the parameters in a transformed space; probably the most common such transformation is the additive log-ratio transformation (e.g. as in compositions::alr()). This solution uses the bbmle package, which is a wrapper around optim(); in this case it doesn't actually offer much advantage over using optim() directly.

library(bbmle)
library(compositions)
linkfun <- function(mu) as.numeric(compositions::alr(mu))
invlink <- function(eta) as.numeric(compositions::alrInv(eta))

## negative log-likelihood function: first convert the parameter vector
## from the unconstrained (alr) space back to the constrained space
## (non-negative, summing to 1), then map the linear predictor onto the
## constrained (probability, (0,1)) scale with plogis()
lfun <- function(beta) {
    -sum(dbinom(Y, size=1, 
                prob = plogis(as.matrix(X) %*% invlink(beta)),
                log=TRUE))
}
npar <- ncol(X)
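
To see why this transformation enforces the constraint, here is a quick check I've added (an illustration, not part of the fit itself): alrInv() maps any vector of npar - 1 unconstrained reals to an npar-component vector that is strictly positive and sums to 1, and alr() inverts that map.

## any real (npar-1)-vector maps to a valid composition and back
eta <- rnorm(npar - 1)
mu  <- invlink(eta)
all(mu > 0)              ## TRUE: all components strictly positive
sum(mu)                  ## 1 (up to floating-point error)
range(linkfun(mu) - eta) ## ~ c(0, 0): alr() undoes alrInv()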

Using mle2:

## extra setup for using mle2 with an objective function that takes a
## vector parameter: the alr space has npar - 1 free dimensions, so we
## need npar - 1 parameter names
pnm <- paste0("beta", seq(npar - 1))
parnames(lfun) <- pnm

m1 <- mle2(lfun, start = setNames(rep(0, npar - 1), pnm), vecpar = TRUE)
invlink(coef(m1))
##  [1] 4.897617e-01 2.221793e-01 1.274830e-05 2.198596e-05 7.541545e-02
##  [6] 1.175966e-01 1.590407e-05 4.662563e-02 4.825376e-02 1.170077e-04

or, equivalently, calling optim() directly:

## same fit via optim(): BFGS on the unconstrained (alr-scale) parameters
opt <- optim(par = rep(0, npar - 1), fn = lfun, method = "BFGS")
invlink(opt$par)
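
Either way, the back-transformed coefficients lie on the simplex by construction; a quick sanity check (my addition, reusing the fits above):

p1 <- invlink(coef(m1))  ## mle2 fit
p2 <- invlink(opt$par)   ## optim fit
c(sum(p1), sum(p2))              ## both should be 1 (up to rounding)
min(p1) >= 0 && min(p2) >= 0     ## TRUE: all coefficients non-negative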

note 1

My model fit doesn't match the method you used to generate the data (with the default column names, this would be coded in a formula as ~ 0 + V1 + V2 + V3 + V3:V4 + abs(V4)), but neither does your SuperLearner example, so I didn't worry about it.

note 2

My guess (not carefully considered) is that the approach above will not give the same results as using SuperLearner; looking at the code, what SuperLearner does is to find a non-negative solution to the (weighted) least-squares problem, then transform the coefficients to the compositional space by dividing the coefficient vector by its sum. This is not the same as solving the minimum-deviance/maximum-likelihood problem with the parameters constrained to be non-negative and sum to 1. If you want the SuperLearner solution, it may indeed be possible to do it by passing an appropriate method= argument to glm, but I couldn't see how to do the constrained MLE solution with glm.
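
For reference, here is a minimal sketch of that recipe as I read it (using the nnls package, which SuperLearner itself relies on; note that SuperLearner applies it to cross-validated learner predictions, whereas here I apply it directly to X, so this only illustrates the normalization step):

library(nnls)
## non-negative least-squares fit, then rescale the coefficients so
## they sum to 1 (assuming at least one is positive)
nn <- nnls(as.matrix(X), Y)
w  <- nn$x
w / sum(w)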

Upvotes: 2
