Fast hash function with collision probability near SHA-1

Stig · Feb 22, 2015 · Viewed 9.8k times

I'm using SHA-1 to detect duplicates in a program that handles files. The hash does not need to be cryptographically strong, and it may be reversible. I found this list of fast hash functions: https://code.google.com/p/xxhash/

Which one should I choose if I want a faster function whose collision probability on random data is close to that of SHA-1?

Maybe a 128-bit hash is good enough for file deduplication (vs. the 160-bit SHA-1)?

In my program the hash is calculated on chunks of 0 to 512 KB.
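The setup described above can be sketched roughly as follows. This is only a minimal illustration, not the actual program: `chunk_digest` and `find_duplicates` are hypothetical names, and BLAKE2b truncated to 128 bits is used purely as a standard-library stand-in for "some 128-bit hash". It is a cryptographic hash, not one of the fast non-cryptographic functions from the list.

```python
import hashlib


def chunk_digest(chunk: bytes) -> bytes:
    """Return a 128-bit digest of one chunk.

    Stand-in only: BLAKE2b with digest_size=16 (16 bytes = 128 bits).
    A faster 128-bit hash would be swapped in here.
    """
    return hashlib.blake2b(chunk, digest_size=16).digest()


def find_duplicates(chunks):
    """Return (index, first_index) pairs for chunks whose digest was seen before."""
    first_seen = {}   # digest -> index of the first chunk with that digest
    duplicates = []
    for index, chunk in enumerate(chunks):
        digest = chunk_digest(chunk)
        if digest in first_seen:
            duplicates.append((index, first_seen[digest]))
        else:
            first_seen[digest] = index
    return duplicates


if __name__ == "__main__":
    sample = [b"alpha", b"beta", b"alpha", b""]
    print(find_duplicates(sample))
```

Only `chunk_digest` depends on the choice of hash, so changing the function should not affect the rest of the deduplication logic.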

Answer

A. Binzxxxxxx · Apr 8, 2015

Maybe this will help you: https://softwareengineering.stackexchange.com/questions/49550/which-hashing-algorithm-is-best-for-uniqueness-and-speed

Collisions are rare with FNV-1, FNV-1a, DJB2, DJB2a, SDBM and MurmurHash.

I don't know xxHash, but it also looks promising.

MurmurHash is very fast, and version 3 supports a 128-bit output, so I would choose that one. (It is implemented in Java and Scala.)
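To get a feeling for whether 128 bits are enough compared with 160, the birthday bound gives a rough estimate of the chance of at least one accidental collision. The sketch below assumes the hash output behaves like uniformly random bits, which is an idealization, and the chunk count is a made-up example value, not a figure from the question.

```python
import math


def collision_probability(num_items: int, hash_bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1)).

    Assumes an ideal hash with uniformly distributed outputs.
    """
    exponent = num_items * (num_items - 1) / 2 ** (hash_bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    chunks = 10 ** 12  # hypothetical: one trillion distinct chunks
    for bits in (32, 64, 128, 160):
        p = collision_probability(chunks, bits)
        print(f"{bits:3d}-bit hash: p ~= {p:.3e}")
```

Under these assumptions, a 128-bit hash over a trillion chunks gives a probability on the order of 1e-15, so the extra 32 bits of SHA-1 matter little for accidental collisions. Deliberately crafted collisions are a different matter and are not covered by this estimate.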