How does the Hamming code work?

paxdiablo picture paxdiablo · Dec 23, 2008 · Viewed 17.7k times · Source

When transmitting data, the Hamming code apparently allows you to recreate data that has been corrupted over the wire (an error correcting code).

How does this work and what are its limitations, if any?

Are there any better solutions for error correction (as opposed to retransmission)? Are there circumstances where retransmission is better?

Answer

Toon Krijthe picture Toon Krijthe · Dec 23, 2008

Let's try to explain it a bit:

We have a 3 bit number. The possibilities can be presented as a cube where each bit represent an axis. The eight possibilities are on the corners.

000 --------001
 | \         | \
 | 100---------101
 |  |        |  |
 |  |        |  |
010-|-------011 |
   \|          \|
   110---------111

Each defect (for example 101 is read as 100) results in a shift on a line on the cube.

If we use two bits for the data and the third for a parity check (say for example even parity). We lose 4 datapoints. But it has the advantage that we can detect a single bit failure (which transforms an even count of 1's into an odd count of ones). The odd numbers are marked with a *. And we see that each odd (wrongly transmitted) word is cornered by even (rightfully transmitted) words. So if we receive 100, we can be sure it is wrong.

But (with a single bit failure) the correct representation could have been 000, 101 or 110. So we can detect something has been wrong but we cannot detect what was wrong:

 000 -------*001
  | \         | \
  |*100---------101
  |  |        |  |
  |  |        |  |
*010-|-------011 |
    \|          \|
    110--------*111

This is called a one bit error detecting code.

If we use another bit for checking and thus remove one for the data. We are left with 1 databit and 2 "check bits". In this case, lets assume 000 and 111 are valid data representations and the other six are not. Now we have an interesting situation if one bit is mangled during transport. If we send 000 and receive 010, we see that 010 has one valid neighbor (000) and two invalid ones (110 and 011). So now we know that we intended to send 000 and are able to correct that:

 000 -------*001
  | \         | \
  |*100--------*101
  |  |        |  |
  |  |        |  |
*010-|------*011 |
    \|          \|
   *110---------111

This is called a one bit error correcting code.

Please note that a one bit error correcting code is also a two bit error detecting code.

And to put it more generally.

If you have n check bits, you have a n bit error detecting code. If you have 2n check bits, you have a n bit error correcting code.

Of course you should order the "valid" codes so that they do not border on each other.