Reputation: 263
I am confused about the parameter cv in RidgeCV of sklearn.linear_model. I already have my data split into a training set and a validation set, and the documentation of RidgeCV says the parameter cv can be an iterable yielding train/test splits. So I wrote the following:
m = linear_model.RidgeCV(cv=zip(x_validation, y_validation))
m.fit(x_train, y_train)
But it does not work. Python throws the following error:
IndexError: arrays used as indices must be of integer (or boolean) type
What is wrong with my understanding of the parameter cv, and is there an easy way to use my own, already split validation set?
Upvotes: 0
Views: 792
Reputation: 263
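It seems the parameter cv expects index arrays rather than the data itself: a list of indices to use as the training set and a list of indices to use as the validation set. So a solution is to concatenate the two sets and describe the split by index: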
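import numpy as np
from sklearn import linear_model
# Stack the training and validation sets so that one array holds both.
x = np.concatenate((x_train, x_validation))
y = np.concatenate((y_train, y_validation))
# Rows 0..n_train-1 are the training set; the remaining rows are the validation set.
n_train = len(x_train)
train_indices = np.arange(n_train)
validation_indices = np.arange(n_train, len(x))
# cv must yield (train_indices, validation_indices) pairs; here, exactly one pair.
m = linear_model.RidgeCV(cv=[(train_indices, validation_indices)])
m.fit(x, y)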
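To see why the shape of the iterable matters, here is a small sketch in plain Python (no scikit-learn needed) comparing what two candidate values for cv would yield. The sizes n_train and n_validation are made up for illustration. Since the documentation describes cv as an iterable yielding train/test splits, each item should be a pair of index sequences, not a pair of single indices.

```python
# Hypothetical sizes, for illustration only.
n_train, n_validation = 6, 2

train_indices = list(range(n_train))
validation_indices = list(range(n_train, n_train + n_validation))

# zip pairs the i-th training index with the i-th validation index,
# so each "split" is just two integers, and the result is truncated
# to the length of the shorter sequence.
zipped = list(zip(train_indices, validation_indices))

# One split: a single (train_indices, validation_indices) pair in a list.
single_split = [(train_indices, validation_indices)]

print(zipped)        # [(0, 6), (1, 7)]
print(single_split)  # [([0, 1, 2, 3, 4, 5], [6, 7])]
```

Passing a list shaped like single_split as cv describes exactly one split: fit on the training rows, evaluate on the validation rows. scikit-learn also provides sklearn.model_selection.PredefinedSplit, which builds this kind of fixed split from a per-sample fold label.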
Upvotes: 2