Alberto Contador

Reputation: 263

sklearn RidgeCV cross-validation iterable

I am confused about the parameter cv of RidgeCV in sklearn.linear_model.

I already have my data split into a training set and a validation set, and the documentation of RidgeCV says the parameter cv can be an iterable yielding train/test splits. So I wrote the following:

m = linear_model.RidgeCV(cv=zip(x_validation, y_validation))
m.fit(x_train, y_train)

But it does not work.

Python throws the following error:

IndexError: arrays used as indices must be of integer (or boolean) type

What is wrong with my understanding of the parameter cv, and is there an easy way to use my own, already split validation set?

Upvotes: 0

Views: 792

Answers (1)

Alberto Contador

Reputation: 263

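It seems the parameter cv expects an iterable of (train indices, validation indices) pairs, where each element is an array of integer row positions and not the data itself. That explains the IndexError: zip(x_validation, y_validation) yields data rows, which are then used as indices. One solution is to concatenate the two sets and pass a single pair of index arrays:

import numpy as np
from sklearn import linear_model

# Stack both sets; cv refers to rows of the stacked arrays by position.
x = np.concatenate((x_train, x_validation))
y = np.concatenate((y_train, y_validation))
train_indices = np.arange(len(x_train))
validation_indices = np.arange(len(x_train), len(x))
# cv is an iterable of (train_indices, validation_indices) pairs, here one pair.
m = linear_model.RidgeCV(cv=[(train_indices, validation_indices)])
m.fit(x, y)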
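
For reference, a fixed train/validation split can also be described with PredefinedSplit from sklearn.model_selection, which is not mentioned in the question and is offered here only as an alternative sketch. The arrays below are synthetic stand-ins for x_train, y_train, x_validation and y_validation (their shapes and the alphas grid are assumptions made for illustration).

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import PredefinedSplit

# Synthetic stand-ins for the question's arrays (assumed shapes: 90 and 10 rows, 3 features).
rng = np.random.default_rng(0)
x_train = rng.normal(size=(90, 3))
x_validation = rng.normal(size=(10, 3))
true_coef = np.array([1.0, -2.0, 0.5])
y_train = x_train @ true_coef + 0.1 * rng.normal(size=90)
y_validation = x_validation @ true_coef + 0.1 * rng.normal(size=10)

# Stack both sets; the split refers to rows of the stacked arrays by position.
x = np.concatenate((x_train, x_validation))
y = np.concatenate((y_train, y_validation))

# -1 marks rows that are never used for validation; 0 marks the single validation fold.
test_fold = np.concatenate((
    np.full(len(x_train), -1),
    np.zeros(len(x_validation), dtype=int),
))
split = PredefinedSplit(test_fold)

# Illustrative grid of regularization strengths (an assumption, not from the question).
alphas = (0.1, 1.0, 10.0)
m = RidgeCV(alphas=alphas, cv=split)
m.fit(x, y)

# Inspect the one split that was used and the alpha that was selected.
train_idx, val_idx = next(split.split())
print(m.alpha_, len(train_idx), len(val_idx))
```

As far as I can tell, once the best alpha has been chosen on the validation fold, RidgeCV refits the final model on all rows passed to fit, so the validation rows also contribute to the final coefficients.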

Upvotes: 2
