Entropy Simply Explained
The Equation
We have all come across the ubiquitous equation
\begin{equation} E = - \sum_x p(x) \log{p(x)} \end{equation} or its continuous cousin \begin{equation} E = - \int p(x) \log{p(x)} \, dx \end{equation}
whether we are doing statistical mechanics, physics, differential geometry, or even data science. But it is not at all clear at first glance why this is the right way to measure entropy, which quantifies the amount of disorder (or uncertainty) in a system. I mean, where does the \(p(x) \log{p(x)}\) term even come from?
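To make the discrete sum concrete, here is a minimal sketch in Python (the helper name `entropy` and the example distributions are just for illustration, not part of any standard library) that evaluates \(-\sum_x p(x)\log{p(x)}\) directly:

```python
import math

def entropy(probs, base=math.e):
    """Shannon entropy -sum p(x) log p(x) of a discrete distribution.

    Terms with p(x) = 0 contribute nothing, since p log p -> 0 as p -> 0.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin is maximally uncertain: entropy = log 2 (1 bit in base 2).
print(entropy([0.5, 0.5], base=2))   # 1.0
# A certain outcome carries no uncertainty at all.
print(entropy([1.0, 0.0], base=2))   # -0.0, i.e. zero
```

The base of the logarithm only rescales the result (bits for base 2, nats for the natural log), so it is left as a parameter here.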
Surprise
What we mean by surprise here is something very specific and technical. Given a sample space, each element \(x\) has an associated probability of occurrence \(p(x)\). Now suppose you pick an element at random. How surprising is it that you drew \(x\)? Intuitively, low probability should mean high surprise and vice versa, so one might suggest \(S(x)=1/p(x)\) as a measure of surprise. But if \(x\) has probability 1 of being picked, it should not be surprising at all that \(x\) was picked, i.e. the surprise should be 0, whereas \(1/p(x)\) would give 1. Taking a logarithm fixes this while keeping the inverse relationship: \(\log{1}=0\), and \(\log{1/p(x)}\) still grows without bound as \(p(x)\) shrinks. Thus, a better candidate is \begin{equation} S(x)=\log{1/p(x)}=-\log{p(x)}. \end{equation}
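As a quick sanity check, here is a small sketch (again in Python, using base-2 logarithms so that surprise is measured in bits; the choice of base and the function name are assumptions for illustration) tabulating \(S(x)\) for a few probabilities:

```python
import math

def surprise(p):
    """Surprise S(x) = log(1/p(x)) of an outcome with probability p, in bits."""
    return math.log2(1 / p)

# The rarer the outcome, the larger the surprise; a sure thing is no surprise.
for p in (1.0, 0.5, 0.25, 0.01):
    print(f"p = {p:<5}  surprise = {surprise(p):.2f} bits")
```

Running this prints 0 bits for a certain event, 1 bit for a fair coin flip, and about 6.64 bits for a 1-in-100 outcome, matching the intuition that rarer events are more surprising.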
Recall that
to be continued…