Logistic Regression with glmnet - structure of input data

Question

I am trying to apply ridge and lasso regularization to a logistic regression model and am struggling to understand the required structure for the x and y inputs. I am fairly new to R, so apologies if this is unclear. As I understand it, the values in the columns of x are used to predict the outcomes in y.

For x, I have seven columns, each of which holds categorical data (stored as factors). The whole of x is a data frame with 9000 observations of 7 variables; each variable is a factor with a varying number of levels. It appears in the Environment pane under Data.

y is a set of outcomes, "0" or "1". It appears in the Environment pane under Values, which reports y as a Factor w/ 2 levels "0","1", also with 9000 values.

I am struggling to work out what structure x and y need to have for the following call to work for a logistic model:

alpha0.fit <- cv.glmnet(x, y , type.measure="deviance", alpha=0, family="binomial")

Any thoughts or advice gratefully received.
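
For reference, here is a minimal sketch of one input structure that cv.glmnet accepts for family="binomial": a numeric matrix for x and a two-level factor for y. It assumes the factor columns are expanded into dummy variables with model.matrix, which is a common approach but not the only one. The data frame x_df, its columns f1 to f3, and the sample size are hypothetical stand-ins for the real data described above.

```r
library(glmnet)

set.seed(1)
n <- 500  # hypothetical size; the real data has 9000 rows

# Hypothetical stand-in for the data frame of factor predictors
x_df <- data.frame(
  f1 = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  f2 = factor(sample(c("u", "v"), n, replace = TRUE)),
  f3 = factor(sample(c("p", "q", "r", "s"), n, replace = TRUE))
)

# Hypothetical stand-in outcome: a factor with levels "0" and "1"
y <- factor(rbinom(n, 1, 0.4), levels = c(0, 1))

# glmnet needs a numeric matrix, not a data frame.
# model.matrix expands each factor into dummy columns;
# the first column is the intercept, which glmnet adds itself, so drop it.
x_mat <- model.matrix(~ ., data = x_df)[, -1]

# Same call as in the question, with the matrix in place of the data frame
alpha0.fit <- cv.glmnet(x_mat, y, type.measure = "deviance",
                        alpha = 0, family = "binomial")
```

With this layout, y can stay as the two-level factor already in the Environment; only x needs converting. Note that model.matrix uses treatment contrasts by default, so each factor loses one level as the baseline.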
