phildeutsch
phildeutsch

Reputation: 683

R package `penalized`: error when using predict() with matrix input for "penalized" and "unpenalized"

I am trying to use the penalized package to calibrate a penalized linear regression such that the coefficients of a subset of the variables are positive. I managed to calibrate the model but I am failing to use it for making new predictions.

Here's a toy example:

require(dplyr)
require(penalized)
require(ggplot2)

head(diamonds)
# A tibble: 6 x 10
#  carat       cut color clarity depth table price     x     y     z
#  <dbl>     <ord> <ord>   <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
#1  0.23     Ideal     E     SI2  61.5    55   326  3.95  3.98  2.43
#2  0.21   Premium     E     SI1  59.8    61   326  3.89  3.84  2.31
#3  0.23      Good     E     VS1  56.9    65   327  4.05  4.07  2.31
#4  0.29   Premium     I     VS2  62.4    58   334  4.20  4.23  2.63
#5  0.31      Good     J     SI2  63.3    58   335  4.34  4.35  2.75
#6  0.24 Very Good     J    VVS2  62.8    57   336  3.94  3.96  2.48

response = diamonds$price
penalized_vars = "x"
unpenalized_vars = "depth"
fit_penalized = penalized(response=as.matrix(response),
                          penalized = model.matrix(~., select_(diamonds, .dots = penalized_vars)), 
                          unpenalized = model.matrix(~., select_(diamonds, .dots = unpenalized_vars)),
                          model="linear", 
                          positive=TRUE,
                          maxiter=25)
# nonzero coefficients: 3

show(fit_penalized)
#Penalized linear regression object
#4 regression coefficients of which 3 are non-zero

#Loglikelihood =     -482648.2 

head(fitted(fit_penalized))
#         1          2          3          4          5          6 
#-1679.6983 -1924.0012 -1515.2681  -863.6912  -393.7956 -1668.7105 

So far so good. How can I actually use this to predict values based on new information? I have tried

predict(fit_penalized,
        penalized = model.matrix(~., select_(vars, .dots = penalized_vars)),
        unpenalized = model.matrix(~., select_(vars, .dots = unpenalized_vars)) )
# Error in terms.default(object@formula$unpenalized) : 
#   no terms component nor attribute

Upvotes: 1

Views: 514

Answers (1)

Zheyuan Li
Zheyuan Li

Reputation: 73265

Regarding the latest penalized_0.9-47 updated on 2016-05-27


As will be seen from my test below, unpenalized can only be specified via "formula" not "matrix", in order to get predict work. This seems to be a bug in the package. The bug is not with penalized but with predict. On one hand, ?penalized::predict says:

In particular, if penalized and/or unpenalized
was specified in matrix form, a matrix must be given with the new
subjects' data. The columns of these matrices must be exactly the
same as in the matrices supplied in the original call that
produced the ‘penfit’ object. If either penalized or unpenalized
was given as a ‘formula’ in the original call, the user of
‘predict’ must supply a new ‘data’ argument.

It seems legitimate to pass matrix to unpenalized, but my practical test denies this. So try contact package author.

As a side note, your current specification will have a penalized intercept, as well as a free intercept. You should drop one of them for identifiability. In the following, I drop the free intercept.

library(penalized)
library(ggplot2)
X1 <- model.matrix (~x, diamonds)    ## model matrix for penalized terms
X2 <- model.matrix(~ depth - 1, diamonds)    ## model matrix for free terms
vars <- diamonds[1:5, ]    ## prediction dataset
Xp1 <- model.matrix(~x, vars)    ## prediction matrix for penalized terms
Xp2 <- model.matrix(~ depth - 1, vars)    ## prediction matrix for free terms

## use "formula" for both
fit <- penalized (price, ~ x, ~ depth - 1, data = diamonds, model = "linear", positive = TRUE)
predict(fit, ~ x, ~ depth - 1, vars)
#          mu  sigma2
#1 -1523.5643 3600667
#2 -1328.9947 3600667
#3  -183.9204 3600667
#4  -950.1526 3600667
#5  -717.6854 3600667

## "matrix" for `penalized` and "formula" for `unpenalized`
fit <- penalized (price, X1, ~ depth - 1, data = diamonds, model = "linear", positive = TRUE)
predict(fit, Xp1, ~ depth - 1, vars)
#          mu  sigma2
#1 -1523.5643 3600667
#2 -1328.9947 3600667
#3  -183.9204 3600667
#4  -950.1526 3600667
#5  -717.6854 3600667

## "formula" for `penalized` and "matrix" for `unpenalized`
fit <- penalized (price, ~ x, X2, data = diamonds, model = "linear", positive = TRUE)
predict(fit, ~ x, Xp2, vars)
# Error in terms.default(object@formula$unpenalized) : 
#  no terms component nor attribute

## "matrix" for both
fit <- penalized (price, X1, X2, data = diamonds, model = "linear", positive = TRUE)
predict(fit, Xp1, Xp2, vars)
# Error in terms.default(object@formula$unpenalized) : 
#  no terms component nor attribute

Upvotes: 2

Related Questions