Hash Code and Checksum - what's the difference?

Richard Ev · Jan 20, 2009 · Viewed 51.9k times

My understanding is that a hash code and checksum are similar things - a numeric value, computed for a block of data, that is relatively unique.

That is, the probability of two blocks of data yielding the same numeric hash/checksum value is low enough that it can be ignored for the purposes of the application.

So do we have two words for the same thing, or are there important differences between hash codes and checksums?

Answer

Zach Scrivena · Jan 20, 2009

I would say that a checksum is necessarily a hashcode. However, not all hashcodes make good checksums.

A checksum has a special purpose: it verifies or checks the integrity of data (some go further and allow for error correction). "Good" checksums are easy to compute and can detect many types of data corruption (e.g. one, two, or three erroneous bits).
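To make the integrity-checking point concrete, here is a minimal Python sketch. It assumes CRC-32 (via the standard library's `zlib.crc32`) as an example of a "good" checksum and a made-up `additive_checksum` (sum of bytes modulo 256) as a weak one. The `flip_bit` helper and the sample messages are also invented purely for illustration.

```python
import zlib


def additive_checksum(data: bytes) -> int:
    """Toy checksum (an assumption for this example): byte sum modulo 256."""
    return sum(data) % 256


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


message = b"transfer 100 to account 42"
swapped = b"transfer 001 to account 42"  # same bytes, different order

# The additive checksum cannot see reordering, because addition is commutative.
additive_misses_swap = additive_checksum(message) == additive_checksum(swapped)

# CRC-32 depends on byte positions, so the reordered message gets a new value.
crc_catches_swap = zlib.crc32(message) != zlib.crc32(swapped)

# Try every possible single-bit corruption and record any that CRC-32 misses.
original_crc = zlib.crc32(message)
undetected_flips = [
    i
    for i in range(len(message) * 8)
    if zlib.crc32(flip_bit(message, i)) == original_crc
]

print("additive checksum misses the swap:", additive_misses_swap)
print("CRC-32 catches the swap:", crc_catches_swap)
print("single-bit flips CRC-32 missed:", undetected_flips)
```

Both functions map data to a number, so both are hash codes in the loose sense, but only the CRC notices the reordered digits. That is the sense in which not every hash code makes a good checksum.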

A hashcode is simply the output of a mathematical function that maps data to some value. When it is used as a means of indexing in data structures (e.g. a hash table), a low collision probability is desirable.
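As a rough illustration of the indexing use case, the sketch below builds a tiny hash table in Python. Everything in it is an assumption made for the example: `string_hash` is a toy polynomial hash (the familiar `31 * h + c` scheme, truncated to 32 bits), and `ToyHashTable` is a hypothetical chained table, not any particular library's implementation.

```python
def string_hash(text: str) -> int:
    """Toy polynomial hash: h = 31 * h + code point, kept to 32 bits."""
    h = 0
    for ch in text:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


class ToyHashTable:
    """Hypothetical hash table that resolves collisions by chaining."""

    def __init__(self, bucket_count: int = 8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key: str) -> list:
        # The hash code only chooses the bucket; it does not identify the key.
        return self.buckets[string_hash(key) % len(self.buckets)]

    def put(self, key: str, value) -> None:
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key: str):
        # Keys are compared directly, so colliding keys stay distinct.
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)


table = ToyHashTable()
table.put("Aa", 1)
table.put("BB", 2)

# "Aa" and "BB" collide under this hash: both produce 2112.
print(string_hash("Aa"), string_hash("BB"))
print(table.get("Aa"), table.get("BB"))
```

A collision here costs only an extra key comparison inside one bucket, which is why a hash table can tolerate a hash function that would be far too weak to serve as a checksum. Fewer collisions still mean shorter chains and faster lookups.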