Suppose that you want to tell someone something, that is, convey
to them some information.

A Single Two-way Choice

Further suppose that you are not using English or some other natural language,
but can speak only in a code language that has just two symbols.

In Morse Code, these symbols are called "dot" and "dash", or sometimes
"dit" and "dah" (Reference: Wikipedia), but the point is simply that there are just two of them.

To convey the information consisting of a single choice made between two
alternatives, we can just use one of those symbols.

A popular choice of symbols is 0 and 1, which happen to also be numbers.

That is the basic Unit of Information, and it is called "one bit".

In a physical device, such as a computer,
the two symbols are represented somehow, such as:

An electrical current being 'on' or 'off'.

A voltage being 'high' or 'low'.

A magnet being polarized in one direction or another.

Or, outside the world of electronics:

A musical note being 'high' or 'low'.

A sound lasting a short or long time, as is the case with Morse Code.

And so on.

Some examples:

A question might have the answers "Yes" (1) or "No" (0).

The choice of a pet might be limited to "Cat" (0) or "Dog" (1).

Someone might wear glasses (1) or not (0).

Shirts might come in just two colors: Red (0) or Blue (1).

Now, that's interesting as far as it goes, but things are rarely
that simple. Let's combine those things:

We can string symbols together in a given order. Let's say that order is
{ Pet, Glasses, Shirt }.

To inform someone of the choices involved requires 3 bits of information,
and since each one has two possibilities, there are 2 x 2 x 2, or 8, possible
"words".
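As a rough sketch, the eight words can be enumerated in Python. The 0/1 assignments follow the examples above; the function name all_words and the dictionary names are only illustrative.

```python
from itertools import product

# Field order follows the text: { Pet, Glasses, Shirt }.
# The 0/1 meanings are the ones used in the examples above.
PET = {0: "Cat", 1: "Dog"}
GLASSES = {0: "no glasses", 1: "glasses"}
SHIRT = {0: "Red", 1: "Blue"}

def all_words():
    """Return every 3-bit word together with the choices it stands for."""
    words = {}
    for pet, glasses, shirt in product((0, 1), repeat=3):
        word = f"{pet}{glasses}{shirt}"
        words[word] = (PET[pet], GLASSES[glasses], SHIRT[shirt])
    return words

for word, meaning in all_words().items():
    print(word, meaning)
```

For instance, the word 101 stands for a dog owner without glasses wearing a blue shirt.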

Multi-way Choices

What about cases where there are more than two choices?

Then we need more than a single bit.

Let's say our shirt manufacturer gets bored with just two colors of shirts,
and wants to add a few more to just Red and Blue:

Orange, Yellow, Green, Violet, Black, White - 6 more, for a total of 8.

As we saw above, where there were eight { Pet, Glasses, Shirt } combinations,
3 bits of information are enough to distinguish 8 choices,
so each color can be given its own 3-bit code, from 000 to 111.
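One possible assignment of 3-bit codes to the eight colors can be sketched in Python. Red and Blue keep the first two positions, as before; the order of the six new colors is an arbitrary assumption, and any other one-to-one assignment would work equally well.

```python
# Red and Blue come first, as in the two-color case.
# The order of the remaining six colors is an arbitrary assumption.
COLORS = ["Red", "Blue", "Orange", "Yellow", "Green", "Violet", "Black", "White"]

def color_codes(colors=COLORS, bits=3):
    """Give each color the binary form of its position, padded to `bits` digits."""
    return {color: format(index, f"0{bits}b") for index, color in enumerate(colors)}

for color, code in color_codes().items():
    print(code, color)
```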

Got it? Test your understanding

Exercise:

How many bits of information are conveyed by a single Latin letter, A - Z?

We have seen that 3 bits will distinguish 8 choices,
but the alphabet has 26 letters, so we're going to need more.

If we use one more - 4 bits - that doubles the choices to 16,
but that's still not quite enough.

In fact we need 5 bits, which would be enough for 32 choices,
so we'll have 6 encodings left over.
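The counting argument can be checked with a short Python sketch. The helper names bits_needed and letter_codes are hypothetical, and numbering the letters from A = 00000 upward is just one assumed assignment.

```python
import string

def bits_needed(choices):
    """Smallest number of bits whose 2**bits words cover `choices` alternatives."""
    return (choices - 1).bit_length()

def letter_codes():
    """Assumed assignment: A is 0, B is 1, ..., Z is 25, written as 5-bit words."""
    bits = bits_needed(26)
    return {
        letter: format(index, f"0{bits}b")
        for index, letter in enumerate(string.ascii_uppercase)
    }

codes = letter_codes()
print(bits_needed(8), bits_needed(16), bits_needed(26))  # 3 4 5
print(codes["A"], codes["Z"])                            # 00000 11001
print(2 ** bits_needed(26) - 26)                         # 6 encodings left over
```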

Interestingly, if we want to also encode the lower case letters,
a - z, we only need one more bit.

There are two ways to see that:

A - Z and a - z constitute 52 letters, and 6 bits can encode
64 choices, which is more than enough.

We could use the same codes we used above, and simply prepend
a "1" to each of them (and consider the earlier ones as prefixed
by a "0").
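The second view can be sketched in Python as a case bit followed by a 5-bit letter code. The function name encode_letter is hypothetical, and the 5-bit codes assume the letters are simply numbered from A = 00000 to Z = 11001.

```python
import string

def encode_letter(letter):
    """Encode A-Z or a-z as 6 bits: one case bit, then a 5-bit letter code."""
    if len(letter) != 1 or letter not in string.ascii_letters:
        raise ValueError("expected a single letter A-Z or a-z")
    # Leading bit: "0" for upper case, "1" for lower case, as in the text.
    case_bit = "1" if letter.islower() else "0"
    # Assumed numbering: A/a is 0, B/b is 1, ..., Z/z is 25.
    index = string.ascii_uppercase.index(letter.upper())
    return case_bit + format(index, "05b")

print(encode_letter("A"), encode_letter("a"))  # 000000 100000
print(encode_letter("Z"), encode_letter("z"))  # 011001 111001
```

Upper-case and lower-case versions of a letter then differ only in the first bit.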