Why is hash output fixed in length?

Alvida · Apr 13, 2015

Hash functions always produce a fixed-length output regardless of the input (e.g. MD5 produces 128 bits, SHA-256 produces 256 bits), but why?

I know that this is how the designers built them, but why did they design the output to always have the same length? So that it can be stored in a consistent fashion? To make comparison easier? To reduce complexity?

Answer

Alex · Apr 13, 2015

Because that is the definition of a hash function. From Wikipedia:

A hash function is any function that can be used to map digital data of arbitrary size to digital data of fixed size.
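As a quick check of that definition, here is a minimal sketch using Python's standard `hashlib` module. The helper name `digest_bits` and the sample inputs are made up for illustration; the point is only that the digest length does not change with the input length.

```python
import hashlib

def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest length in bits (hypothetical helper for this example)."""
    return len(hashlib.new(algorithm, data).digest()) * 8

# Inputs of very different sizes: empty, one byte, one million bytes.
inputs = [b"", b"a", b"x" * 1_000_000]

md5_sizes = [digest_bits("md5", d) for d in inputs]
sha256_sizes = [digest_bits("sha256", d) for d in inputs]

print(md5_sizes)     # [128, 128, 128]
print(sha256_sizes)  # [256, 256, 256]
```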

If your question relates to why it is useful for a hash to be a fixed size there are multiple reasons (non-exhaustive list):

  • Hashes typically map a larger (often arbitrary-size) input to a smaller value, generally in a lossy way, i.e. unlike lossless data compression, you cannot reconstruct the input from the hash value by "reversing" the process.
  • Having a fixed size output is convenient, especially for hashes designed to be used as a lookup key.
  • You can predictably (pre)allocate storage for hash values and index them in a contiguous memory segment such as an array.
  • For hashes of "native word sizes", e.g. 16-, 32- and 64-bit integer values, you can do very fast equality and ordering comparisons.
  • Any algorithm working with hash values can use a single set of fixed size operations for generating and handling them.
  • You can predictably combine hashes produced with different hash functions in e.g. a Bloom filter.
  • You don't need to waste any space to encode how big the hash value is.
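The storage-related points above can be sketched as follows. This is only an illustration, assuming SHA-256 as the hash; `pack_digests` and `digest_at` are hypothetical helper names. Because every digest has the same size, the buffer can be allocated up front and any entry can be located by offset arithmetic, with no length prefix stored per entry.

```python
import hashlib

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256

def pack_digests(items):
    """Store one SHA-256 digest per item, back to back, in a single buffer."""
    table = bytearray(DIGEST_SIZE * len(items))  # total size is known in advance
    for i, item in enumerate(items):
        start = i * DIGEST_SIZE
        table[start:start + DIGEST_SIZE] = hashlib.sha256(item).digest()
    return table

def digest_at(table, index):
    """Fetch the index-th digest using only its position in the buffer."""
    start = index * DIGEST_SIZE
    return bytes(table[start:start + DIGEST_SIZE])

items = [b"a", b"a much longer input " * 100, b""]
table = pack_digests(items)

print(len(table))  # 96, i.e. 3 entries of 32 bytes each
print(digest_at(table, 1) == hashlib.sha256(items[1]).digest())  # True
```

With variable-length values, the same table would need either a separate index of offsets or a length field in front of every entry.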

There also exist hash functions whose output length is chosen by the caller rather than fixed by the algorithm, such as those built on the so-called sponge construction (e.g. SHAKE128 and SHAKE256 from the SHA-3 family).
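As a rough illustration of caller-chosen output length, Python's `hashlib` exposes the sponge-based SHAKE128 function, whose `digest()` method takes the desired number of bytes. The wrapper `shake_digest` is a hypothetical name used only for this example.

```python
import hashlib

def shake_digest(data: bytes, length: int) -> bytes:
    """Return `length` bytes of SHAKE128 output for `data`."""
    return hashlib.shake_128(data).digest(length)

short = shake_digest(b"hello", 16)
longer = shake_digest(b"hello", 64)

print(len(short), len(longer))  # 16 64
# The output is a stream: asking for fewer bytes yields a prefix of the longer output.
print(longer[:16] == short)     # True
```

Note that once a length is chosen, every input still maps to an output of exactly that length, so the conveniences of fixed-size values listed above continue to apply.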