What is the best 32-bit hash function for relatively short strings?
Strings are tag names that consist of English letters, numbers, spaces and some additional characters (#, $, ., ...). For example: "Unit testing", "C# 2.0".
I am looking for 'best' as in 'minimal collisions'; performance is not important for my goals.
I'm not sure if it's the best choice, but here is a hash function for strings from The Practice of Programming (Hash Tables, p. 57):
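    #define MULTIPLIER 31  /* 31 or 37 */

    /* hash: compute hash value of string */
    unsigned int hash(char *str)
    {
        unsigned int h;
        unsigned char *p;

        h = 0;
        /* walk the bytes as unsigned so characters above 127 are not sign-extended */
        for (p = (unsigned char *)str; *p != '\0'; p++)
            h = MULTIPLIER * h + *p;
        return h; // or, h % ARRAY_SIZE;
    }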
Empirically, the values 31 and 37 have proven to be good choices for the multiplier in a hash function for ASCII strings.
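
Since the goal is minimal collisions rather than speed, it may be worth measuring the candidate multipliers against your actual tag set instead of trusting the rule of thumb. Below is a minimal sketch of how that could look. The names `hash_mult` and `count_collisions` are my own (not from the book), the multiplier is passed as a parameter so that 31 and 37 can be compared, and `uint32_t` is used only to make the 32-bit wraparound explicit.

```c
#include <stddef.h>
#include <stdint.h>

/* Same multiplicative scheme as above, with a caller-supplied
   multiplier (hypothetical helper, not the book's code). */
uint32_t hash_mult(const char *str, uint32_t multiplier)
{
    uint32_t h = 0;
    const unsigned char *p;

    for (p = (const unsigned char *)str; *p != '\0'; p++)
        h = multiplier * h + *p;   /* wraps modulo 2^32 */
    return h;
}

/* Count pairs of entries in tags[] whose full 32-bit hashes are equal.
   Assumes the tags themselves are distinct. O(n^2) comparisons, which
   is acceptable for a one-off check on a modest tag list. */
size_t count_collisions(const char *const *tags, size_t n, uint32_t multiplier)
{
    size_t i, j, collisions = 0;

    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (hash_mult(tags[i], multiplier) == hash_mult(tags[j], multiplier))
                collisions++;
    return collisions;
}
```

For example, with multiplier 31 the two-character strings "Aa" and "BB" hash to the same value (65*31 + 97 = 66*31 + 66 = 2112), while with 37 they do not. So no single multiplier is collision-free, and running `count_collisions` over the real tag list for each candidate is a reasonable way to pick one.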