Robust and fast checksum algorithm?

Benedikt Waldvogel · Sep 23, 2008

Which checksum algorithm can you recommend in the following use case?

I want to generate checksums of small JPEG files (~8 kB each) to check whether the content has changed. Unfortunately, using the filesystem's modification date is not an option.
The checksum need not be cryptographically strong, but it should robustly indicate changes of any size.

The second criterion is speed since it should be possible to process at least hundreds of images per second (on a modern CPU).

The calculation will be done on a server with several clients. The clients send the images to the server over Gigabit TCP, so disk I/O is not a bottleneck.

Answer

luke · Sep 23, 2008

If you have many small files, your bottleneck is going to be file I/O and probably not a checksum algorithm.
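One way to check this on your own hardware is to time the checksum alone on in-memory buffers. The following is only a rough sketch: it assumes CRC-32 from Python's standard `zlib` module as the checksum, uses random bytes as a stand-in for real JPEG data, and the function name `checksums_per_second` is made up for illustration.

```python
import os
import time
import zlib


def checksums_per_second(n_files: int = 2000, size: int = 8 * 1024) -> float:
    """Time zlib.crc32 over n_files in-memory buffers of `size` bytes each."""
    # Assumption: random bytes stand in for ~8 kB JPEG files.
    buffers = [os.urandom(size) for _ in range(n_files)]

    start = time.perf_counter()
    for buf in buffers:
        zlib.crc32(buf)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)

    return n_files / elapsed


if __name__ == "__main__":
    print(f"{checksums_per_second():.0f} checksums per second")
```

Because the buffers are created before the timer starts, the result reflects the checksum cost only, with no file or network I/O included.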

A list of hash functions (which can be thought of as checksums) can be found here.
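As an illustration of using a hash function as a checksum, here is a minimal sketch with two functions from Python's standard library: `zlib.crc32` and `hashlib.md5`. The choice of these two and the helper names (`crc32_checksum`, `md5_checksum`, `has_changed`) are assumptions for the example, not a recommendation from the linked list.

```python
import hashlib
import zlib


def crc32_checksum(data: bytes) -> int:
    """Return the CRC-32 of data as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def md5_checksum(data: bytes) -> str:
    """Return the MD5 digest of data as a hex string (128 bits)."""
    return hashlib.md5(data).hexdigest()


def has_changed(data: bytes, previous_checksum: int) -> bool:
    """Compare the CRC-32 of data against a previously stored checksum."""
    return crc32_checksum(data) != previous_checksum


if __name__ == "__main__":
    # Hypothetical 8 kB payload standing in for a JPEG file.
    original = b"\xff\xd8" + bytes(8 * 1024 - 2)
    stored = crc32_checksum(original)

    modified = bytearray(original)
    modified[100] ^= 0x01  # flip a single bit

    print(has_changed(original, stored))
    print(has_changed(bytes(modified), stored))
```

A 32-bit checksum has only about 4 billion possible values, so two different files can occasionally share one. If that risk matters, a wider digest such as the MD5 helper above lowers it at some cost in speed.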

Is there any reason you can't use the filesystem's date modified to determine if a file has changed? That would probably be faster.