Reputation: 5504
Take this case (classic crab data for logistic regression):
> library(glmnet)
> X <- read.table("http://www.da.ugent.be/datasets/crab.dat", header=T)[1:10,]
> Y <- factor(ifelse(X$Sa > 0, 1, 0))
> Xnew <- data.frame('W'=20,'Wt'=2000)
> fit.glmnet <- glmnet(x = data.matrix(X[,c('W','Wt')]), y = Y, family = "binomial")
Now I want to predict new values for Xnew.
According to the docs, I can use predict.glmnet, whose type argument is described as follows:
type
Type of prediction required. Type "link" gives the linear predictors for "binomial", "multinomial", "poisson" or "cox" models; for "gaussian" models it gives the fitted values. Type "response" gives the fitted probabilities for "binomial" or "multinomial", [...]
So this is what I do:
> predict.glmnet(object = fit.glmnet, type="response", newx = as.matrix(Xnew))[,1:5]
s0 s1 s2 s3 s4
-0.8472979 -0.9269763 -1.0057390 -1.0836919 -1.1609386
> predict.glmnet(object = fit.glmnet, type="link", newx = as.matrix(Xnew))[,1:5]
s0 s1 s2 s3 s4
-0.8472979 -0.9269763 -1.0057390 -1.0836919 -1.1609386
I get the same values for both the link and the response predictions, which is not what I expect. Using predict instead seems to give the correct values:
> predict(object = fit.glmnet, type="response", newx = as.matrix(Xnew))[,1:5]
s0 s1 s2 s3 s4
0.3000000 0.2835386 0.2678146 0.2528080 0.2384968
> predict(object = fit.glmnet, type="link", newx = as.matrix(Xnew))[,1:5]
s0 s1 s2 s3 s4
-0.8472979 -0.9269763 -1.0057390 -1.0836919 -1.1609386
Is this a bug, or am I using predict.glmnet incorrectly?
Upvotes: 2
Views: 2240
Reputation: 3947
Within the glmnet package, your fitted object has class lognet first and glmnet second:
> class(fit.glmnet)
[1] "lognet" "glmnet"
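That's why you are not getting the right result with predict.glmnet: it is the method for the parent class glmnet, and it returns the linear predictor without applying the inverse logit, even when type="response". You get the expected probabilities if you use predict.lognet: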
> predict.lognet(object = fit.glmnet, newx = as.matrix(Xnew), type="response")[,1:5]
s0 s1 s2 s3 s4
0.3000000 0.2835386 0.2678146 0.2528080 0.2384968
> predict.lognet(object = fit.glmnet, newx = as.matrix(Xnew), type="link")[,1:5]
s0 s1 s2 s3 s4
-0.8472979 -0.9269763 -1.0057390 -1.0836919 -1.1609386
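As a quick sanity check, the two outputs above should differ only by the inverse logit, assuming the standard logistic link for family = "binomial". The sketch below uses base R only; the link values are copied from the output above, and the variable names are illustrative.

```r
# Link-scale predictions copied from the output above
link <- c(s0 = -0.8472979, s1 = -0.9269763, s2 = -1.0057390,
          s3 = -1.0836919, s4 = -1.1609386)

# Inverse logit written out by hand: p = 1 / (1 + exp(-eta))
response <- 1 / (1 + exp(-link))

# plogis() is the base R implementation of the same transformation
same_as_plogis <- isTRUE(all.equal(response, plogis(link)))

print(round(response, 7))
print(same_as_plogis)
```

The printed probabilities should agree with the type="response" output of predict.lognet up to rounding.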
That said, I would recommend simply calling predict and letting R dispatch to the right method for the object's class.
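To illustrate why the generic is the safer choice, here is a minimal base R sketch of S3 dispatch with two stacked classes. The classes toy_lognet and toy_glmnet are hypothetical stand-ins named to mirror c("lognet", "glmnet"); this is not glmnet's actual code, only the general pattern of a child method post-processing the parent's result.

```r
# Hypothetical toy object whose class vector mirrors c("lognet", "glmnet")
fit <- structure(list(eta = -0.8472979),
                 class = c("toy_lognet", "toy_glmnet"))

# Parent method: always returns the linear predictor, whatever the type
predict.toy_glmnet <- function(object, type = c("link", "response"), ...) {
  object$eta
}

# Child method: obtains the parent's result, then transforms it for "response"
predict.toy_lognet <- function(object, type = c("link", "response"), ...) {
  type <- match.arg(type)
  eta <- NextMethod("predict")
  if (type == "response") plogis(eta) else eta
}

# The generic dispatches on the first class, so the child method runs
via_generic <- predict(fit, type = "response")

# Calling the parent method by name skips the child's transformation
via_parent <- predict.toy_glmnet(fit, type = "response")

print(via_generic)
print(via_parent)
```

Calling the parent method by name reproduces the symptom from the question: the link-scale value comes back even though type = "response" was requested.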
Hope it helps.
Upvotes: 2