Reputation: 727
I am currently working with a dataset that has a very large number of variables, so I decided to use the sparse group LASSO variable selection technique, implemented in the SGL package.
Mine is a logistic regression problem, which is one of the models this package supports. However, when I try to fit it, I get an error message. My data frame is called N, and my binary vector is called Y:
> x <- as.matrix(N)
> y <- as.matrix(Y)
> data <- list(x, y=y)
> sgl_small <- cvSGL(data, groups, type="logit")
Error: NA/NaN/Inf in foreign function call (arg 1)
In the attempt above, Y was a binary numeric vector of zeros and ones. I suspected the problem was that Y was not a factor, so I tried again:
> x <- as.matrix(N)
> y <- as.factor(Y)
> data <- list(x, y=y)
> sgl_small <- cvSGL(data, groups, type="logit")
Error in seq.default(log(max.lam),
log(min.lam), (log(min.lam) - log(max.lam))/(nlam - :
'from' cannot be NA, NaN or infinite
In addition: Warning messages:
1: In mean.default(y) : argument is not numeric or logical: returning NA
2: In mean.default(y) : argument is not numeric or logical: returning NA
3: In Ops.factor(y, m.y) : '-' not meaningful for factors
This error message seems to indicate that y should not be a factor. I don't know what is going wrong, especially because cvSGL runs without any error if I keep y as a numeric binary vector but fit a linear model instead of a logit model (although a linear model is not meaningful for my problem).
That is, the following works:
> y <- as.matrix(Y)
> data <- list(x, y=y)
> sgl_small <- cvSGL(data, groups, type="linear")
I would appreciate any help from anyone who has used this package to fit a logit model.
Upvotes: 1
Views: 684
Reputation: 821
I found this example on the help page of cvSGL:
set.seed(1)
n = 50; p = 10;
X = matrix(rnorm(n * p), ncol = p, nrow = n)
beta = (-2:2)
y = sample(c(0,1),50, replace = T)
data = list(x = X, y = y)
cvFit = cvSGL(data, type = "logit")
As you can see, the "index" parameter (which you called groups) is not used here, and I can't see how you defined it in your case. I suspect the problem is that the elements of your list need to be named:
data <- list(x = x, y=y)
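To see why the names could matter, here is a minimal base R sketch. It assumes that cvSGL looks up the predictors as data$x, which the help-page example suggests but which I have not confirmed here. The toy x and y below are made up for illustration and stand in for your N and Y. The index vector at the end is a hypothetical grouping (two columns per group), only meant to show the expected shape: one group id per column of x.

```r
# Toy inputs standing in for the question's N and Y (made up for illustration)
set.seed(1)
x <- matrix(rnorm(20), nrow = 5, ncol = 4)
y <- c(0, 1, 0, 1, 1)

unnamed <- list(x, y = y)      # first element has no name, as in the question
named   <- list(x = x, y = y)  # both elements are named

names(unnamed)       # "" "y"
is.null(unnamed$x)   # TRUE: `$x` finds no element called x
is.null(named$x)     # FALSE
dim(named$x)         # 5 4

# Hypothetical grouping vector: one group id per column of x
index <- ceiling(seq_len(ncol(x)) / 2)   # 1 1 2 2
```

If data$x is NULL inside the function, any computation on it would plausibly produce the NA/NaN errors you are seeing, so naming the elements is worth trying before anything else.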
Upvotes: 1