Reputation: 15
I am splitting the 'Weekly' data set, which is available in R, into training and testing sets. The training set is supposed to contain the years 1990-2008, and the testing set the years 2009-2010. This is the code I use for the split:
weekly.train = split(Weekly, Weekly$Year == 1990:2008)
weekly.test = split(Weekly, Weekly$Year == 2009:2010)
When I fit a logistic regression model on the training set, I get this error:
"Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 1037, 52"
Here's my code for the regression:
mod.fit.lr<-glm(Direction ~ Lag1+Lag2+Lag3+Lag4+Lag5+Volume, data = weekly.train,family = binomial)
Upvotes: 0
Views: 176
Reputation: 4233
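split returns a list of two data frames (named FALSE and TRUE), whereas glm needs a single data frame (the target set) as its data argument. Also, == against a vector such as 1990:2008 compares element by element with recycling rather than testing membership, so use %in% for the year range. You can either extract the TRUE element or use indices explicitly: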
i_test <- Weekly$Year %in% 2009:2010
weekly.test <- Weekly[i_test, ]
weekly.train <- Weekly[!i_test, ]
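To see the difference between == and %in%, and what split actually returns, here is a minimal sketch on a made-up data frame. The names toy, toy.train, and toy.test and the simulated Lag1 and Direction columns are assumptions for illustration only, standing in for Weekly, which is not loaded here.

```r
# Hypothetical stand-in for Weekly: 4 rows per year, 1990-2010 (84 rows)
set.seed(1)
toy <- data.frame(
  Year = rep(1990:2010, each = 4),
  Lag1 = rnorm(84)
)
toy$Direction <- factor(ifelse(toy$Lag1 + rnorm(84) > 0, "Up", "Down"))

# == recycles 2009:2010 along the column and compares position by position,
# so it misses rows whose year does not line up with the recycled value
n_eq <- sum(toy$Year == 2009:2010)    # 4
n_in <- sum(toy$Year %in% 2009:2010)  # 8: every row from 2009 and 2010

# split() on a logical returns a list with one data frame per group
parts <- split(toy, toy$Year %in% 2009:2010)
names(parts)  # "FALSE" "TRUE"

# Extract single data frames before handing them to glm()
toy.test  <- parts[["TRUE"]]
toy.train <- parts[["FALSE"]]

# glm() now receives a data frame rather than a list of data frames
fit <- glm(Direction ~ Lag1, data = toy.train, family = binomial)
```

With the real data, the same pattern would be data = weekly.train after the subsetting above, using the full formula from the question.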
Upvotes: 0