Hash function for strings

lilawood · Oct 5, 2011 · Viewed 239.5k times

I'm working on a hash table in C and I'm testing hash functions for strings.

The first function I tried adds up the ASCII codes of the characters and takes the result modulo 100 (% 100), but I got poor results with my first set of test data: 40 collisions for 130 words.

The final input data will contain 8,000 words (it's a dictionary stored in a file). The hash table is declared as int table[10000] and contains the position of each word in a text file.

My first question is: which algorithm is best for hashing strings? And how do I determine the size of the hash table?

Thanks in advance!

:-)

Answer

cnicutar · Oct 5, 2011

I've had nice results with djb2 by Dan Bernstein.

unsigned long
hash(const unsigned char *str)
{
    unsigned long hash = 5381; /* djb2 starting value */
    int c;

    /* Consume one byte at a time until the terminating NUL. */
    while ((c = *str++) != 0)
        hash = ((hash << 5) + hash) + c; /* hash * 33 + c */

    return hash;
}
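
To use the result as an index into the table from the question, it still has to be reduced to the table size. The sketch below is one way that might look, not a definitive implementation. The names djb2, additive, and slot_for are hypothetical and chosen only for illustration, and additive is a guess at the summing function the question describes, included for comparison.

```c
#include <stddef.h>

/* djb2 as shown above: hash = hash * 33 + c, starting from 5381. */
unsigned long djb2(const unsigned char *str)
{
    unsigned long hash = 5381;
    int c;

    while ((c = *str++) != 0)
        hash = ((hash << 5) + hash) + (unsigned long)c;

    return hash;
}

/* Assumed shape of the question's first attempt: a plain sum of byte values.
   The order of the characters does not affect the result. */
unsigned long additive(const unsigned char *str)
{
    unsigned long sum = 0;

    while (*str != 0)
        sum += *str++;

    return sum;
}

/* Hypothetical helper: reduce the djb2 value to a slot in a table of
   table_size entries (10000 for the table declared in the question). */
size_t slot_for(const char *word, size_t table_size)
{
    return (size_t)(djb2((const unsigned char *)word) % table_size);
}
```

For example, slot_for("a", 10000) gives 7670, since djb2 of "a" is 5381 * 33 + 97 = 177670. The additive version returns the same value for any two anagrams ("ab" and "ba" both sum to 195), while djb2 tells them apart because each step multiplies the running value by 33. Note also that with % 100 there are only 100 possible slots, so 130 words must produce at least 30 collisions whatever hash function is used.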