Reputation: 13
I am trying to find an optimal parameter set for an XGBClassifier using GridSearchCV. Since my data is highly imbalanced, both fitting and scoring (during cross-validation) must use sample weights, so I need a custom scorer that takes a 'weights' vector as a parameter. However, I can't find a way to have GridSearchCV pass the 'weights' vector to the scorer.
There have been some attempts to add this functionality to GridSearchCV:
https://github.com/ndawe/scikit-learn/commit/3da7fb708e67dd27d7ef26b40d29447b7dc565d7
But they were never merged into master, and I am afraid the code is no longer compatible with upstream changes.
Has anyone faced a similar problem, and is there an 'easy' way to deal with it?
Upvotes: 1
Views: 822
Reputation: 36555
You could manually balance your training dataset, as in the answer to "Scikit-learn balanced subsampling".
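A minimal sketch of what manual balancing could look like, assuming plain random undersampling of every class down to the size of the rarest one. The helper name `balanced_subsample` and the toy data are made up for illustration and are not taken from the linked answer:

```python
import numpy as np

def balanced_subsample(X, y, random_state=0):
    """Randomly undersample each class to the size of the rarest class.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(random_state)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()  # size of the smallest class
    # Draw n_min row indices from each class, without replacement.
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)  # avoid returning the rows grouped by class
    return X[keep], y[keep]

# Toy data: 90 samples of class 0 and 10 samples of class 1.
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 90 + [1] * 10)

X_bal, y_bal = balanced_subsample(X, y)
```

The balanced arrays can then be passed to GridSearchCV's `fit` with an ordinary, unweighted scorer. Keep in mind that undersampling throws away majority-class rows and changes the class proportions, so scores obtained this way are not the same thing as weighted scores on the full data.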
Upvotes: 1