Reputation: 13
So I learned about the concept of information entropy from Khan Academy, where it was phrased as the "average number of yes-or-no questions needed per symbol". They also gave an alternative form using logarithms.
So let's say we have a symbol generator that produces A, B, and C, with P(A) = 1/2, P(B) = 1/3, and P(C) = 1/6. According to their method, I would get a chart like this: First method
Then I would multiply each symbol's probability of occurring by the number of questions it needs, giving (1/2)*1 + (1/3)*2 + (1/6)*2 = 1.5 bits,
but their other method gives -(1/2)log2(1/2) - (1/3)log2(1/3) - (1/6)log2(1/6) = 1.459... bits.
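For reference, both numbers can be reproduced with a few lines of Python (just redoing the arithmetic above; the `questions` counts are read off the decision tree in the chart):

```python
import math

# Symbol probabilities from the question
p = {"A": 1/2, "B": 1/3, "C": 1/6}

# Questions needed per symbol in the decision tree: 1 for A, 2 for B and C
questions = {"A": 1, "B": 2, "C": 2}

# First method: expected number of yes-or-no questions
expected_questions = sum(p[s] * questions[s] for s in p)

# Second method: Shannon entropy, -sum(p * log2(p))
entropy = -sum(pi * math.log2(pi) for pi in p.values())

print(expected_questions)  # 1.5
print(entropy)             # 1.459...
```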
The difference is small, but still significant. I've tried this with other combinations of symbols and probabilities and got similar results. Is there something I'm missing? Am I using one of the methods wrong, or does one of them only apply under certain conditions?
Upvotes: 1
Views: 35
Reputation: 41484
Your second calculation is correct.
The problem with your decision tree approach is that the decision tree is not optimal (and indeed, no binary decision tree could be for those probabilities). Your "is it B?" decision node represents less than one bit of information, because by the time you reach it you already know the answer is probably B: the conditional probabilities are 2/3 for B and 1/3 for C, and an unbalanced question yields less than a full bit. So your decision tree represents a potential encoding of symbols that is expected to consume 1.5 bits on average, but it carries slightly less than 1.5 bits of information.
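To make that concrete (a sketch using the chain rule for entropy; the variable names are mine): score each question by the entropy of its yes/no split. The first question splits 1/2 vs 1/2 and is worth exactly one bit; the second splits the conditional probabilities 2/3 vs 1/3 and is worth less.

```python
import math

def h(*ps):
    """Entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in ps if p > 0)

first = h(1/2, 1/2)   # "is it A?" splits 1/2 vs 1/2 -> exactly 1 bit
second = h(2/3, 1/3)  # "is it B?" splits 2/3 vs 1/3 -> about 0.918 bits

# The second question is only asked half the time, so the total
# information extracted by the tree is:
info = first + (1/2) * second
print(info)  # 1.459... -- the entropy, not the 1.5 questions asked
```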
In order to have a binary tree which represents an optimal encoding, each question needs to split the remaining probability mass exactly 50/50. This is not possible if some symbol has a probability whose denominator is not a power of 2.
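For example (an illustration with the probabilities changed to powers of 1/2, not the question's original values): if the generator instead had P(A) = 1/2, P(B) = 1/4, P(C) = 1/4, the same tree shape would be balanced at every node, and the two methods would agree exactly.

```python
import math

# Dyadic probabilities: every question splits the remaining mass 50/50
p = {"A": 1/2, "B": 1/4, "C": 1/4}
questions = {"A": 1, "B": 2, "C": 2}  # same tree shape as before

avg_questions = sum(p[s] * questions[s] for s in p)
entropy = -sum(pi * math.log2(pi) for pi in p.values())

print(avg_questions, entropy)  # both exactly 1.5
```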
Upvotes: 1