santosh singh

Reputation: 28642

Why does a base64 encoded string have an = sign at the end

I know what base64 encoding is and how to compute it in C#. However, I have seen several times that when I convert a string to base64, there is an = at the end.

A few questions came up:

  1. Does a base64 string always end with =?
  2. Why does an = get appended at the end?

Upvotes: 507

Views: 317987

Answers (10)

glades

Reputation: 4737

The equals or double equals serves as padding. It's a stupid concept defined in RFC 2045, and it is actually superfluous. Any decent parser can encode and decode a base64 string without knowing about padding, just by counting the characters and filling in the rest if the size isn't divisible by 3 (when encoding) or 4 (when decoding). The padding actually leads to difficulties every now and then, because some parsers expect it while others blatantly ignore it. My MPU base64 decoder, for example, needs padding, but it receives a non-padded base64 string over the network. This led to erroneous parsing, and I had to account for it myself.
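The workaround described above (restoring missing padding before handing the string to a strict decoder) can be sketched in Python roughly as follows. The function name `decode_unpadded` is hypothetical and only for illustration; the sketch assumes the standard base64 alphabet.

```python
import base64


def decode_unpadded(text: str) -> bytes:
    """Decode base64 text whose trailing '=' characters may be missing."""
    # A remainder of 1 modulo 4 cannot come from any whole number of bytes,
    # so such a string is malformed rather than merely unpadded.
    if len(text) % 4 == 1:
        raise ValueError("invalid base64 length")
    # Restore the padding so the length is a multiple of 4 again.
    padded = text + "=" * (-len(text) % 4)
    return base64.b64decode(padded)


print(decode_unpadded("QUJDREVGRw"))   # padding "==" was stripped
print(decode_unpadded("QUJDREVGR0g"))  # padding "=" was stripped
```

Strings that already carry their padding pass through unchanged, since nothing is appended when the length is already a multiple of 4.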

Upvotes: 3

Badr Bellaj

Reputation: 12821

Q: Does a base64 string always end with =?

A: No. (The word usb is base64-encoded as dXNi.)

Q: Why does an = get appended at the end?

A: As a short answer:
The last character (the = sign) is added only as a complement (padding) in the final step of encoding a message whose length is not a multiple of 3 bytes.

You will not have an = sign if your string has a multiple of 3 characters, because Base64 encoding takes each group of three bytes (one ASCII character = 1 byte) and represents it as four printable ASCII characters.

Example:

(a) If you want to encode

ABCDEFG <=> [ABC] [DEF] [G]

Base64 handles the first and second blocks normally, producing 4 characters each, because they are complete. The third block holds only one byte, so == is added to the output to complete the 4 needed characters. Thus, the result will be QUJD REVG Rw== (without spaces).

[ABC] => QUJD

[DEF] => REVG

[G] => Rw==

(b) If you want to encode

ABCDEFGH <=> [ABC] [DEF] [GH]

Similarly, it will add one = at the end of the output to get 4 characters.

The result will be QUJD REVG R0g= (without spaces).

[ABC] => QUJD

[DEF] => REVG

[GH] => R0g=
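The examples above can be checked with a few lines of Python, using the standard library's `base64` module. The helper name `encode_text` is made up for this sketch, and the inputs are assumed to be plain ASCII.

```python
import base64


def encode_text(text: str) -> str:
    """Base64-encode an ASCII string and return the result as text."""
    return base64.b64encode(text.encode("ascii")).decode("ascii")


for text in ("usb", "ABCDEFG", "ABCDEFGH"):
    print(f"{text} -> {encode_text(text)}")
# usb -> dXNi
# ABCDEFG -> QUJDREVGRw==
# ABCDEFGH -> QUJDREVGR0g=
```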

Upvotes: 740

Vladimir Ignatev

Reputation: 2176

= is a padding character. If the length of the input stream is not a multiple of 3, padding characters are added. This is required by the decoder: if no padding is present, the last byte would have an incorrect number of zero bits.

Better and deeper explanation here: https://base64tool.com/detect-whether-provided-string-is-base64-or-not/

Upvotes: 6

iandotkelly

Reputation: 9124

It's defined in RFC 2045 as a special padding character, used if fewer than 24 bits are available at the end of the data being encoded.

Upvotes: 21

Legolas

Reputation: 1502

From Wikipedia:

The final '==' sequence indicates that the last group contained only one byte, and '=' indicates that it contained two bytes.

Thus, this is some sort of padding.
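As a rough sketch, the rule in the quote can be written as a one-line formula over the input length in bytes. The name `padding_count` is hypothetical; the loop compares it against what Python's `base64.b64encode` actually emits.

```python
import base64


def padding_count(byte_length: int) -> int:
    """Number of '=' characters base64 appends for an input of this length."""
    # 1 leftover byte -> "==", 2 leftover bytes -> "=", 0 leftover -> none.
    return (3 - byte_length % 3) % 3


for n in range(1, 7):
    encoded = base64.b64encode(b"A" * n).decode("ascii")
    print(n, encoded, padding_count(n))
```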

Upvotes: 83

Dev

Reputation: 161

http://www.hcidata.info/base64.htm

Encoding "Mary had" to Base 64

In this example we are using a simple text string ("Mary had") but the principle holds no matter what the data is (e.g. graphics file). To convert each 24 bits of input data to 32 bits of output, Base 64 encoding splits the 24 bits into 4 chunks of 6 bits. The first problem we notice is that "Mary had" is not a multiple of 3 bytes - it is 8 bytes long. Because of this, the last group of bits is only 4 bits long. To remedy this we add two extra bits of '0' and remember this fact by putting a '=' at the end. If the text string to be converted to Base 64 was 7 bytes long, the last group would have had 2 bits. In this case we would have added four extra bits of '0' and remember this fact by putting '==' at the end.
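The bit-level procedure described above can be sketched by hand in Python. This is only an illustration of the steps, not how real libraries implement it; `ALPHABET` and `encode_by_hand` are names invented for this sketch, and the standard base64 alphabet is assumed.

```python
import base64

# Standard base64 alphabet: index 0..63 maps to one output character.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_by_hand(data: bytes) -> str:
    # Write the input as one long string of bits, 8 per byte.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Zero-fill the last group so the bit count is a multiple of 6.
    fill = -len(bits) % 6
    bits += "0" * fill
    # Map each 6-bit group to one character of the alphabet.
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Two fill bits are recorded as "=", four fill bits as "==".
    return "".join(chars) + "=" * (fill // 2)


print(encode_by_hand(b"Mary had"))  # TWFyeSBoYWQ=
```

For "Mary had" (64 bits) the last group has 4 bits, so two zero bits are added and one = is appended, matching the description.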

Upvotes: 8

Ian Kemp

Reputation: 29839

  1. No.
  2. To pad the Base64-encoded string to a multiple of 4 characters in length, so that it can be decoded correctly.

Upvotes: 18

Thomas Leonard

Reputation: 7196

It's padding. From http://en.wikipedia.org/wiki/Base64:

In theory, the padding character is not needed for decoding, since the number of missing bytes can be calculated from the number of Base64 digits. In some implementations, the padding character is mandatory, while for others it is not used. One case in which padding characters are required is concatenating multiple Base64 encoded files.
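To illustrate the concatenation case from the quote, here is a small Python sketch. It assumes each piece was encoded separately with padding and without line breaks, so every 4-character group can be decoded on its own; `decode_concatenated` is a hypothetical helper, not a library function.

```python
import base64


def decode_concatenated(padded: str) -> bytes:
    """Decode several padded base64 strings that were joined end to end."""
    out = bytearray()
    # With padding intact, every 4-character group is self-contained,
    # so the boundaries between the original pieces do not matter.
    for i in range(0, len(padded), 4):
        out += base64.b64decode(padded[i:i + 4])
    return bytes(out)


first = base64.b64encode(b"G").decode("ascii")    # "Rw=="
second = base64.b64encode(b"GH").decode("ascii")  # "R0g="
print(decode_concatenated(first + second))        # b'GGH'
```

Without the padding, the joined text would be "RwR0g", and a decoder could no longer tell where the first piece ends.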

Upvotes: 10

Sam Holloway

Reputation: 2009

The equals sign (=) is used as padding in certain forms of base64 encoding. The Wikipedia article on base64 has all the details.

Upvotes: 13

Andrew Hare

Reputation: 351456

It serves as padding.

A more complete answer is that a base64-encoded string doesn't always end with a =; it will only end with one or two = if they are required to pad the string out to the proper length.

Upvotes: 335
