Reputation: 2853
I am doing a project that involves some natural language processing, and I am using the Stanford MaxEnt classifier for it. But I am not sure whether the maximum entropy model and logistic regression are one and the same, or whether MaxEnt is some special kind of logistic regression?
Can anyone come up with an explanation?
Upvotes: 11
Views: 7797
Reputation: 626
You may want to read the attachment in this post, which gives a simple derivation: http://www.win-vector.com/blog/2011/09/the-equivalence-of-logistic-regression-and-maximum-entropy-models/
An explanation is quoted from "Speech and Language Processing" by Daniel Jurafsky & James H. Martin:
Each feature is an indicator function, which picks out a subset of the training observations. For each feature we add a constraint on our total distribution, specifying that our distribution for this subset should match the empirical distribution we saw in our training data. We then choose the maximum entropy distribution which otherwise accords with these constraints.
Berger et al. (1996) show that the solution to this optimization problem turns out to be exactly the probability distribution of a multinomial logistic regression model whose weights W maximize the likelihood of the training data!
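The quoted equivalence can be illustrated with a minimal sketch (the weights and input below are hypothetical toy values, not from any trained model): the maximum entropy solution assigns each class a probability proportional to exp(score), which is exactly the softmax of linear scores used by multinomial logistic regression.

```python
import math

def maxent_prob(weights, feats):
    """P(y|x) ∝ exp(sum_i w_i * f_i(x, y)): a softmax over per-class linear scores.
    This is identical to the multinomial logistic regression model."""
    scores = [sum(w * f for w, f in zip(weights[y], feats))
              for y in range(len(weights))]
    z = sum(math.exp(s) for s in scores)  # normalization constant
    return [math.exp(s) / z for s in scores]

# hypothetical weights for 3 classes over 2 features
W = [[0.5, -1.0], [0.0, 0.3], [-0.2, 0.8]]
x = [1.0, 2.0]
probs = maxent_prob(W, x)  # a proper distribution over the 3 classes
```

The probabilities sum to one by construction; training either model means choosing W to maximize the likelihood of the data, so the fitted models coincide.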
Upvotes: 5
Reputation: 2898
In maximum entropy models, a feature is represented as f(x, y): you can design features using both the label y and the observable input x. When f(x, y) reduces to x alone (features that depend only on the input), you get the logistic regression situation.
In NLP tasks like POS tagging, it is common to design features that combine observations with labels. For example, "the current word ends with 'ous' and the next word is a noun" can be a feature for predicting whether the current word is an adjective.
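The POS example above can be sketched as an indicator feature over (x, y) pairs; the function name, the dict-based word representation, and the tag strings are illustrative assumptions, not Stanford classifier API:

```python
def f_suffix_ous_adj(x, y):
    """Joint feature f(x, y): fires (returns 1.0) only when the current word
    ends with 'ous' AND the candidate label y is 'ADJ'."""
    return 1.0 if x["word"].endswith("ous") and y == "ADJ" else 0.0

# the same observation contributes differently depending on the candidate label
x = {"word": "famous"}
f_suffix_ous_adj(x, "ADJ")   # fires: 1.0
f_suffix_ous_adj(x, "NOUN")  # does not fire: 0.0
```

Because the feature mentions the label, it gets its own weight per (pattern, label) pair, which is what distinguishes the general MaxEnt formulation from plain logistic regression over x alone.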
Upvotes: 3
Reputation: 66835
This is exactly the same model. The NLP community prefers the name Maximum Entropy and uses a sparse formulation, which allows everything to be computed without explicitly projecting into the R^n space (as it is common in NLP to have a huge number of features and very sparse vectors).
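One way to read "sparse formulation" is that scores are computed by iterating only over the features that actually fire, never materializing a dense R^n weight or feature vector. A minimal sketch (dict-based representation is my assumption, not a specific library's API):

```python
def sparse_score(weights, active_feats):
    """Linear score w · f(x, y) computed over only the active (nonzero)
    features, looking up each weight by feature name. Features with no
    learned weight contribute 0, so the dense vector is never built."""
    return sum(weights.get(name, 0.0) * value
               for name, value in active_feats.items())

# hypothetical weights and a sparse feature set for one (x, y) candidate
w = {"suffix=ous&tag=ADJ": 1.2, "word=famous&tag=ADJ": 0.7}
feats = {"suffix=ous&tag=ADJ": 1.0, "unseen_feature": 1.0}
score = sparse_score(w, feats)  # only the one matching feature contributes
```

With millions of possible features but only a handful active per example, this costs time proportional to the active set rather than to n.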
Upvotes: 7