Reputation: 21
I am trying to wrap my head around the concept of information in the context of entropy. Let me first introduce some things to make clear what I mean by the terms I am using.
Entropy: [1]: https://en.wikipedia.org/wiki/Entropy_(information_theory)
"In information theory, the entropy of a random variable is the average level of "information", "surprise", or "uncertainty" inherent to the variable's possible outcomes."
H(p) = -\sum_{i=1}^n p_i \log_2(p_i)
So the question that came up for me was: what is information, and how do we quantify it? I have read many times that -log_2(p_i) (the solution to 2^x = 1/p_i) tells us how many bits of information the event i with probability p_i carries. For example, if I have a fair coin, the number of bits of information for tails (or heads) is -log_2(0.5) = 1, and the total entropy is H(p) = 0.5 * 1 + 0.5 * 1 = 1. This should give me the average amount of information (number of bits) I obtain when flipping the fair coin.
So far so good. But what if the coin isn't fair? Say p(heads) = 0.1 and p(tails) = 0.9. According to the definition I get H(p) = 0.468996, which tells me that on average I get only around 0.47 bits of information per flip of this coin. But why is there a difference? Intuitively, in both cases I only learn whether it's heads or tails, in other words zero or one, which is 1 bit. If I just want to obtain the result of the coin toss, I am not really interested in the probability of each event anyway. It is especially confusing to me that the information value of heads (-log_2(0.1)) is apparently much higher than that of tails (-log_2(0.9)).
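To make the two numbers above concrete, here is a minimal Python sketch (not from the question, just an illustration) that evaluates the entropy formula for both coins:

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H(p) = -sum(p_i * log2(p_i)), skipping zero-probability events."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin -> 1.0
print(entropy([0.1, 0.9]))  # biased coin -> ~0.469
```

The per-event terms are also easy to inspect: `-math.log2(0.1)` is about 3.32 bits for the rare heads, while `-math.log2(0.9)` is about 0.15 bits for the common tails, and the entropy is their probability-weighted average.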
The only way I can make sense of the terminology is via the following example: imagine you want to find a mushroom in a forest that is split into two parts. One part is a third of the area and the other two thirds, and the mushroom's location is random (uniformly distributed). There is exactly one mushroom in the whole forest per season. If some magic machine tells you that it's in the first part, it makes sense to me that this message contains more information, since it effectively divides the area you have to search by a factor of 3. The essence is that if you were satisfied with only knowing in which part of the forest the mushroom is, you wouldn't care how large the area is (i.e. how high the probability is); it's just: is it the first or the second part.
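Working the mushroom example through the same formula (my own arithmetic, not part of the question): the message "first part" occurs with probability 1/3 and the message "second part" with probability 2/3, so their self-information and the average differ:

```python
import math

# Self-information of each message, in bits
info_first = -math.log2(1 / 3)    # ~1.585 bits: search area cut to 1/3
info_second = -math.log2(2 / 3)   # ~0.585 bits: search area cut to 2/3

# Average information (entropy) of the "which part?" message
H = (1 / 3) * info_first + (2 / 3) * info_second  # ~0.918 bits
```

So the rarer message really does carry more bits, and the average is below 1 bit precisely because the likely answer ("second part") narrows the search less.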
Upvotes: 1
Views: 439
Reputation: 11947
This is not a comprehensive answer, as that would take the format of a one-semester course on signal theory. Instead, I will try to give you a means to see the difference with your own eyes:
Write yourself a program that produces a character string of 0 and 1 characters, using a random number generator, for both Case A (the fair coin, p = 0.5 each) and Case B (the biased coin, p = 0.1 / 0.9).
Save the string to a file and compress both files with your favorite compression tool (e.g. ZIP or some runlength encoding etc.).
Compare the lengths of the compressed files, and relate them to the points you raised in your question. Why does the file from Case B achieve a higher compression rate?
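The experiment above can be sketched in a few lines of Python. I use `zlib` in place of an external ZIP tool, which is an assumption on my part, but deflate compression makes the same point:

```python
import random
import zlib

random.seed(0)
N = 100_000

# Case A: fair coin, p(0) = p(1) = 0.5
case_a = ''.join(random.choice('01') for _ in range(N)).encode()
# Case B: biased coin, p(0) = 0.1, p(1) = 0.9
case_b = ''.join('0' if random.random() < 0.1 else '1' for _ in range(N)).encode()

# The fair-coin string is nearly incompressible (~1 bit per symbol is the limit);
# the biased string compresses well below that, reflecting its lower entropy.
print(len(zlib.compress(case_a, 9)))
print(len(zlib.compress(case_b, 9)))
```

The compressed size of Case A stays close to the entropy bound of 1 bit per symbol, while Case B's long runs of 1s let the compressor approach its lower entropy of ~0.469 bits per symbol, which is exactly the difference the entropy formula predicts.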
Upvotes: 1