MD5 Hash and Base64 encoding

Joaquín L. Robles picture Joaquín L. Robles · Nov 25, 2010 · Viewed 46.3k times · Source

If I have a 32 character string (an MD5 hash) and I encode it using Base64, what's the maximun length of the encoded string?

Answer

Thomas Albright picture Thomas Albright · Nov 8, 2012

An MD5 value is always 22 (useful) characters long in Base64 notation. Many Base64 algorithms will also append 2 characters of padding when encoding an MD5 hash, bringing the total to 24 characters. The padding adds no useful information and can be discarded. Only the first 22 characters matter.

Here's why:

An MD5 hash is a 128-bit value. Every character in a Base64 string contains 6 bits of information, because there are 64 possible values for the character, and it takes 6 powers of 2 to reach 64. With 6 bits of information in every character, 21 characters has 126 bits of information, and 22 characters contains 132 bits of information. Since 128 bits cannot fit within 21 characters but does fit within 22 characters (with a little room to spare), a 128-bit value will always be represented as 22 characters in Base64.

A note on the padding:

I mentioned above that many Base64 encoding algorithms add a couple of characters of padding when encoding an MD5 value. This is because Base64 represents 3 bytes of information as 4 characters. Since MD5 has 16 bytes of information, many Base64 encoding algorithms append "==" to designate that the input of 16 bytes was 2 bytes short of the next multiple of 3, which would have been 18 bytes. These 2 equal signs add no information whatsoever to the string, and can be discarded when storing.