I'm using the following function to compute the CRC32 of a file in a VS2008, .NET 3.5 project:
    public UInt32 ComputeHash(System.IO.Stream stream)
    {
        unchecked
        {
            const int BUFFER_SIZE = 1024;
            UInt32 crc32Result = 0xFFFFFFFF;
            byte[] buffer = new byte[BUFFER_SIZE];
            int count = stream.Read(buffer, 0, BUFFER_SIZE);
            while (count > 0)
            {
                for (int i = 0; i < count; i++)
                {
                    // & binds tighter than ^, so the low byte of the CRC is masked before the XOR.
                    crc32Result = (crc32Result >> 8) ^ _crc32Table[buffer[i] ^ (crc32Result & _LOOKUP_TABLE_MAX_INDEX)];
                }
                count = stream.Read(buffer, 0, BUFFER_SIZE);
            }
            return ~crc32Result;
        }
    }
For the sake of brevity, I have left out the function that builds the lookup table (_crc32Table). The table is an array of UInt32, is built when the class is instantiated, and contains 256 values (so _LOOKUP_TABLE_MAX_INDEX is 255).
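For readers who want to reproduce this, here is a minimal sketch of how such a table is commonly built. It is not the omitted function: the class name Crc32TableSketch, the method names, and the choice of the reflected polynomial 0xEDB88320 (the one used by zlib and PKZIP) are all assumptions. The Compute helper only shows that the table plugs into the same byte-at-a-time loop as above.

```csharp
using System;

public static class Crc32TableSketch
{
    // Assumption: the standard reflected CRC-32 polynomial; the question does not say which one is used.
    private const uint Polynomial = 0xEDB88320;

    // Builds the 256-entry lookup table, one entry per possible byte value.
    public static uint[] BuildTable()
    {
        uint[] table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint entry = i;
            // Process the 8 bits of the index, shifting right and folding in the polynomial.
            for (int bit = 0; bit < 8; bit++)
            {
                entry = (entry & 1) != 0 ? (entry >> 1) ^ Polynomial : entry >> 1;
            }
            table[i] = entry;
        }
        return table;
    }

    // Same table-driven update as in the question, applied to an in-memory buffer.
    public static uint Compute(uint[] table, byte[] data)
    {
        uint crc = 0xFFFFFFFF;
        foreach (byte b in data)
        {
            crc = (crc >> 8) ^ table[(crc ^ b) & 0xFF];
        }
        return ~crc;
    }
}
```

With this polynomial, Compute over the ASCII bytes of "123456789" gives 0xCBF43926, the usual CRC-32 check value, which is a quick way to confirm a table is built correctly.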
I have run some benchmarks comparing this to the ComputeHash functions of MD5CryptoServiceProvider and SHA1CryptoServiceProvider, and both built-in functions are much faster than my CRC32 code: MD5 is over twice as fast and SHA1 is about 35% faster. I was told CRC32 is fast, but that's not what I'm seeing.
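The benchmark code itself is not shown, so the following is only a sketch of one way such a comparison could be set up. The class HashBenchmarkSketch, the TimeHash helper, the 16 MB random buffer, and the pass count are all hypothetical. It hashes from a MemoryStream so disk I/O does not dominate, and it does one warm-up pass so JIT compilation is not counted.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

public static class HashBenchmarkSketch
{
    // Times `hash` over the same in-memory data and returns the average milliseconds per pass.
    public static double TimeHash(Func<Stream, byte[]> hash, byte[] data, int passes)
    {
        // Warm-up pass so JIT compilation is not included in the measurement.
        hash(new MemoryStream(data));

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < passes; i++)
        {
            hash(new MemoryStream(data));
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds / passes;
    }

    // Hypothetical driver: prints the average time per pass for the two built-in hashes.
    public static void Run()
    {
        byte[] data = new byte[16 * 1024 * 1024];
        new Random(42).NextBytes(data);

        using (MD5 md5 = new MD5CryptoServiceProvider())
        using (SHA1 sha1 = new SHA1CryptoServiceProvider())
        {
            Console.WriteLine("MD5:  {0:F2} ms", TimeHash(s => md5.ComputeHash(s), data, 10));
            Console.WriteLine("SHA1: {0:F2} ms", TimeHash(s => sha1.ComputeHash(s), data, 10));
        }
    }
}
```

The CRC32 function can be timed the same way by wrapping it, for example TimeHash(s => BitConverter.GetBytes(crc.ComputeHash(s)), data, 10), where crc is an instance of the class that holds the ComputeHash method above.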
Am I mistaken in my assumptions? Is this to be expected or is there a flaw in this algorithm?
You are comparing your code to built-in functions and asking why they are faster. What you need to do is find the source for the built-in functions. How do they work? See what's different.
Betcha the built-in functions call out to a native library (the CryptoServiceProvider classes are wrappers around the Windows CryptoAPI) and cheat by not having to run as managed code.