Reputation: 31
I'm currently using NLTK's Naive Bayes classifier, but I also wanted to try out the MaxEnt classifier. According to the documentation, it should accept the same feature set format as Naive Bayes, but for some reason I get this error when I try it:
File "/usr/lib/python2.7/site-packages/nltk/classify/maxent.py", line 323, in train
gaussian_prior_sigma, **cutoffs)
File "/usr/lib/python2.7/site-packages/nltk/classify/maxent.py", line 1453, in train_maxent_classifier_with_scipy
model.fit(algorithm=algorithm)
File "/usr/lib64/python2.7/site-packages/scipy/maxentropy/maxentropy.py", line 1026, in fit
return model.fit(self, self.K, algorithm)
File "/usr/lib64/python2.7/site-packages/scipy/maxentropy/maxentropy.py", line 226, in fit
callback=callback)
File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 636, in fmin_cg
gfk = myfprime(x0)
File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 176, in function_wrapper
return function(x, *args)
File "/usr/lib64/python2.7/site-packages/scipy/maxentropy/maxentropy.py", line 420, in grad
G = self.expectations() - self.K
ValueError: shape mismatch: objects cannot be broadcast to a single shape
I'm not sure what this means. I am using exactly the same input as when I run Naive Bayes, and that works. (The training data is represented as a list of pairs, the first member of which is a featureset and the second of which is a classification label.) Any ideas?
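For reference, the input format described above can be sanity-checked with a small sketch like the one below. It assumes each featureset is a dict mapping feature names to values (the usual NLTK convention). The helper name `check_trainset` and the toy data are hypothetical and only illustrate the expected shape. Passing this check does not by itself explain or rule out the error above.

```python
def check_trainset(trainset):
    """Return a list of problems found in an NLTK-style training set.

    Expected shape: a list of (featureset, label) pairs, where each
    featureset is assumed to be a dict of feature name -> value.
    """
    problems = []
    for i, item in enumerate(trainset):
        if not (isinstance(item, (tuple, list)) and len(item) == 2):
            problems.append("item %d is not a (featureset, label) pair" % i)
            continue
        featureset, _label = item
        if not isinstance(featureset, dict):
            problems.append("item %d: featureset is not a dict" % i)
    return problems


# Hypothetical toy data, only to show the expected shape.
trainset = [
    ({"contains(good)": True, "length": 3}, "pos"),
    ({"contains(bad)": True, "length": 5}, "neg"),
]
print(check_trainset(trainset))
```

An empty list means every item has the shape described in the question.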
Thanks!
Upvotes: 3
Views: 2392
Reputation: 1
You must install NLTK before you can classify. Use the code below to classify using maximum entropy in Python:
import nltk
# trainset: list of (featureset, label) pairs; testing: a single featureset
me_classifier = nltk.MaxentClassifier.train(trainset, algorithm="gis")
print(me_classifier.classify(testing))
Upvotes: 0
Reputation: 5971
This issue also depends on which version of SciPy you are using.
NLTK makes use of scipy.maxentropy, which was deprecated in SciPy 0.10 and removed in 0.11; see the docs for it: http://docs.scipy.org/doc/scipy-0.10.0/reference/maxentropy.html#
I created an issue for that on GitHub: https://github.com/nltk/nltk/issues/307
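A quick way to see whether the installed SciPy still ships the module is to try importing it. This is only a minimal sketch: the helper name `has_scipy_maxentropy` is made up for illustration, and the check reports `False` both when SciPy is missing entirely and when the module has been removed.

```python
def has_scipy_maxentropy():
    """Return True if scipy.maxentropy can be imported, False otherwise."""
    try:
        import scipy.maxentropy  # noqa: F401  (only testing importability)
    except ImportError:
        # Either SciPy is not installed, or this SciPy version
        # no longer includes the maxentropy module.
        return False
    return True


print("scipy.maxentropy available:", has_scipy_maxentropy())
```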
Upvotes: 1
Reputation: 1059
I also encountered this problem with NLTK. While I was unable to resolve it satisfactorily (i.e. get Maxent working using scipy), I was able to train a maxent classifier in NLTK when I used a different algorithm. Try training with
me_classifier = nltk.MaxentClassifier.train(trainset,algorithm="iis")
or one of the other acceptable values for algorithm, like "gis" or "megam".
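If you want to try several algorithms in turn, something like the following sketch could work. The helper `train_with_fallback` is hypothetical, not part of NLTK. It assumes the training function accepts the training set as its first argument and an `algorithm` keyword, as `nltk.MaxentClassifier.train` does in the calls above.

```python
def train_with_fallback(train_fn, trainset, algorithms=("iis", "gis")):
    """Try each algorithm in order and return (algorithm, classifier)
    for the first one that trains without raising an exception."""
    errors = {}
    for algorithm in algorithms:
        try:
            classifier = train_fn(trainset, algorithm=algorithm)
        except Exception as exc:  # remember the failure, try the next one
            errors[algorithm] = exc
        else:
            return algorithm, classifier
    raise RuntimeError("no algorithm trained successfully: %r" % (errors,))


# Assumed usage with NLTK (requires nltk to be installed):
#   import nltk
#   algorithm, me_classifier = train_with_fallback(
#       nltk.MaxentClassifier.train, trainset)
```

Catching every exception is deliberately broad here, since the failure in question surfaces as a `ValueError` from deep inside SciPy; narrow it if you know which errors to expect.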
Upvotes: 3