Data Length vs CRC Length

Robert picture Robert · Feb 23, 2010 · Viewed 69k times · Source

I've seen 8-bit, 16-bit, and 32-bit CRCs.

At what point do I need to jump to a wider CRC?

My gut reaction is that it is based on the data length:

  1. 1-100 bytes: 8-bit CRC
  2. 101 - 1000 bytes: 16-bit CRC
  3. 1001 - ??? bytes: 32-bit CRC

EDIT: Looking at the Wikipedia page about CRC and Lott's answer, here' what we have:

<64 bytes: 8-bit CRC

<16K bytes: 16-bit CRC

<512M bytes: 32-bit CRC

Answer

S.Lott picture S.Lott · Feb 23, 2010

It's not a research topic. It's really well understood: http://en.wikipedia.org/wiki/Cyclic_redundancy_check

The math is pretty simple. An 8-bit CRC boils all messages down to one of 256 values. If your message is more than a few bytes long, the possibility of multiple messages having the same hash value goes up higher and higher.

A 16-bit CRC, similarly, gives you one of the 65,536 available hash values. What are the odds of any two messages having one of these values?

A 32-bit CRC gives you about 4 billion available hash values.

From the wikipedia article: "maximal total blocklength is equal to 2**r − 1". That's in bits. You don't need to do much research to see that 2**9 - 1 is 511 bits. Using CRC-8, multiple messages longer than 64 bytes will have the same CRC checksum value.