Reputation: 671
I am trying to run a sklearn.naive_bayes.GaussianNB
model with partial_fit
. For this I calculate the priors
like this:
import numpy as np
from sklearn.naive_bayes import GaussianNB

unique_lbls, counts = np.unique(labels, return_counts=True)
counts = counts.astype(float)
priors = counts / counts.sum()
model = GaussianNB(priors=priors)
model.partial_fit(X, y, classes=unique_lbls)
I get a `ValueError: The sum of the priors should be 1.`, but I have checked and the priors do sum to 1.0:
print priors.sum()
> 1.0
I am using the following versions:
Python 2.7.12
scikit-learn 0.18.2
numpy 1.13.1
I can only imagine that it comes down to the floating-point precision of the summed value, but I have tried to normalize the priors again with priors /= priors.sum()
and it returns the same error.
Is there a way to make sure the priors sum to 1.0 with a higher tolerance, or is there some reason, not obvious to me, why this doesn't work?
Edit: labels
is a numpy array containing the whole data set's labels represented as integers; X and y are one batch of the full data set.
y
and labels
both have at least 100 examples from each class.
Upvotes: 0
Views: 217
Reputation: 23647
My first intuition was that something is wrong with the data. However, it looks like the partial_fit
function does not even look at the data before raising that error. In particular, the implementation looks like this:
# Check that the sum is 1
if priors.sum() != 1.0:
    raise ValueError('The sum of the priors should be 1.')
They compare the sum of the priors exactly to 1.0, which is numerically not robust. With an unlucky combination of values, the normalized priors may not sum precisely to 1.0. Consider this:
priors = np.array([1, 2, 3, 4, 5, 6], dtype=float)
priors /= priors.sum()
print(priors.sum() == 1.0) # False
Such a situation makes the check fail. One fix is to absorb the rounding error into one of the priors:
priors[0] = 1.0 - priors[1:].sum()
print(priors.sum() == 1.0) # True
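Applied to the question's setup, a minimal sketch could look like this (the labels array here is hypothetical, standing in for the asker's real data):

```python
import numpy as np

# Hypothetical integer labels standing in for the asker's `labels` array.
labels = np.array([0] * 3 + [1] * 5 + [2] * 13)

unique_lbls, counts = np.unique(labels, return_counts=True)
priors = counts.astype(float) / counts.sum()

# Absorb any floating-point residue into the first prior, so the exact
# `priors.sum() != 1.0` comparison inside GaussianNB cannot trip.
priors[0] = 1.0 - priors[1:].sum()
```

The resulting priors can then be passed to GaussianNB(priors=priors) as before. If upgrading is an option, it may also be worth checking whether newer scikit-learn releases have since relaxed this exact comparison.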
Upvotes: 1