Mass, momentum, and energy are some of the most fundamental properties of matter in our universe. But there is another property that arguably belongs on par with them: information.
But what even is information? Somehow, it feels like a very intuitive concept, more of an “I know it when I see it” kind of thing. Yet in 1948 (not 1984), Claude Shannon published the seminal paper “A Mathematical Theory of Communication.” In it, he showed that not only is it possible to rigorously define what we mean by information, but that it is also a measurable quantity!
So, what is the definition of information?
Well, information is clearly something different from simply having data. Even if I give you a map, you still might not find your way to my place without me providing an extra piece of information (assuming you haven’t been there before or don’t have an exceptionally good memory).
A unit of information, then, is something that allows you to narrow down the possible options. More precisely:
“One bit of information is the amount of information required to choose between two equally probable alternatives.”
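This definition scales naturally: choosing among N equally probable alternatives takes log₂ N bits, one bit for each halving of the options. Here is a minimal Python sketch of that idea (the helper name bits_to_choose is my own, just for illustration):

```python
import math

# Bits needed to single out one option among n equally probable
# alternatives: log base 2 of n. (bits_to_choose is a made-up name.)
def bits_to_choose(n: int) -> float:
    return math.log2(n)

print(bits_to_choose(2))    # 1.0 -- one bit, exactly as the definition says
print(bits_to_choose(8))    # 3.0 -- three yes/no questions pick 1 of 8
print(bits_to_choose(256))  # 8.0 -- one byte's worth of alternatives
```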
Confusingly, information is measured in units called bits, but one bit of information is not the same thing as a binary digit. A binary digit can represent information. For example, if I tell you which left and right turns you need to take to reach my place, you could neatly describe the route as a binary sequence, say 10011010 (for R, L, L, R, R, L, R, L). Assuming each turn is equally likely to be left or right, this sequence contains 8 bits of information.
But here’s the catch: if you already know the way to my place, this binary number doesn’t represent information for you—it doesn’t reduce the number of possible routes you might consider. Information, in this sense, is tied to cutting down uncertainty.
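This dependence on what you already know can be made quantitative: observing an outcome of probability p yields log₂(1/p) bits, so an outcome you were certain about yields none. A small sketch along those lines (my own toy example; surprisal_bits is a hypothetical helper):

```python
import math

def surprisal_bits(p: float) -> float:
    """Bits of information gained by observing an event of probability p."""
    return math.log2(1 / p)

# Eight turns, each an independent 50/50 left-or-right choice, for
# someone who does not yet know the route:
print(8 * surprisal_bits(0.5))  # 8.0 bits

# For someone who already knows the route, the sequence was certain:
print(surprisal_bits(1.0))      # 0.0 bits -- no uncertainty removed
```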
---
To further our understanding of the concept of information, let’s consider the following example. Suppose we are given an image—how much information does it contain?
For simplicity, let’s consider the image in grayscale.

This image is 512×512 pixels with 256 grayscale values. In other words, every pixel represents one byte (8 bits, since 2⁸ = 256) of information. At this point, you might say to yourself, “Well, this is silly; at 512 × 512 pixels × 8 bits per pixel, clearly this image contains exactly 2,097,152 bits of information!”
That seems like a reasonable guess. The only problem is that it’s completely wrong.
But why is it wrong? Remember our definition: 1 bit of information allows us to choose between two equally likely alternatives. Clearly, however, the values of neighboring pixels are not independent of each other (if they were, every image would look like random static). So there must be some redundancy in the description of our image. If we remove all of this redundancy, what we’re left with is the pure information content of the image: the minimum number of bits needed to recreate it without any loss.
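One way to see this redundancy in action is to hand the raw pixels to a general-purpose lossless compressor and count how many bits it actually needs. The sketch below uses a synthetic 512×512 gradient as a stand-in for the photograph above (the exact numbers will differ for a real image):

```python
import random
import zlib

SIZE = 512
random.seed(0)

# A synthetic stand-in for the photo: a smooth diagonal gradient with
# a little noise, so neighboring pixels are strongly correlated.
pixels = bytearray()
for y in range(SIZE):
    for x in range(SIZE):
        value = ((x + y) // 4 + random.randint(-2, 2)) % 256
        pixels.append(value)

naive_bits = SIZE * SIZE * 8                        # 2,097,152 bits
compressed_bits = 8 * len(zlib.compress(bytes(pixels), 9))

print(f"naive encoding: {naive_bits:,} bits")
print(f"after zlib:     {compressed_bits:,} bits")  # far fewer bits suffice
```

A compressor like this only gives an upper bound on the true information content, but it already shows how much of the naive byte count is pure redundancy.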
So how do we find this minimal amount of information needed to represent an image? We’ll delve a bit deeper into that in an upcoming post.