Get list of anagrams from a dictionary

vijay · Jun 19, 2012 · Viewed 18.8k times

Basically, anagrams are permutations of a string. E.g. stack, sackt, and stakc are all anagrams of stack (though the words above aren't meaningful). Anyway, you get what I mean.

Now, I want a list of anagrams given a million words, or simply, from a dictionary.

My basic question is: how do I find the total number of unique anagrams in a dictionary?

Sorting and comparing won't work, as its time complexity is pretty bad.

I thought of using a hash table, with the string as the key.

But the problem is: what should the hash function be? It would be helpful if some pseudocode were provided. Other approaches better than the ones mentioned would also be helpful.

Thanks.

Answer

wildplasser · Jun 20, 2012

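The obvious solution is to map each character to a prime number and multiply the prime numbers. Because every integer has a unique prime factorisation, two words get the same product exactly when they contain the same letters the same number of times. So if 'a' -> 2 and 'b' -> 3, then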

  • 'ab' -> 6
  • 'ba' -> 6
  • 'bab' -> 18
  • 'abba' -> 36
  • 'baba' -> 36
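The mapping above can be sketched in Python as follows. This is only an illustration, not the implementation the answer links to: the names `first_primes`, `PRIME_OF` and `anagram_key` are made up here, and the sketch assumes words contain only the letters a-z (anything else raises a `KeyError`).

```python
from string import ascii_lowercase


def first_primes(n):
    """Return the first n primes using simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# 'a' -> 2, 'b' -> 3, 'c' -> 5, ..., 'z' -> 101
PRIME_OF = dict(zip(ascii_lowercase, first_primes(26)))


def anagram_key(word):
    """Multiply the primes of all letters; anagrams share the same product."""
    key = 1
    for ch in word.lower():
        key *= PRIME_OF[ch]
    return key


for w in ["ab", "ba", "bab", "abba", "baba"]:
    print(w, anagram_key(w))
```

Python integers are arbitrary precision, so the product cannot overflow here; in a language with fixed-width integers the key would need a wide enough type or some other way of handling overflow.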

To minimise the chance of overflow, the smallest primes could be assigned to the most frequent letters (e, t, i, a, n). Note: the 26th prime is 101, so no letter needs a factor larger than that.
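A rough sketch of the frequency-ordered assignment, used to group a word list into anagram classes, could look like this. The letter order in `FREQ_ORDER` is an assumption (one commonly quoted English ordering, not taken from the answer), and `freq_key` and `group_anagrams` are hypothetical names. Input is again assumed to be a-z only.

```python
from collections import defaultdict

# Assumed English letter order, most to least frequent (an illustration only).
FREQ_ORDER = "etaoinshrdlcumwfgypbvkjxqz"

# The first 26 primes; the most frequent letter gets the smallest prime.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
PRIME_BY_FREQ = dict(zip(FREQ_ORDER, PRIMES))


def freq_key(word):
    """Product of per-letter primes; equal for words that are anagrams."""
    key = 1
    for ch in word.lower():
        key *= PRIME_BY_FREQ[ch]
    return key


def group_anagrams(words):
    """Map each key to the list of words that share it."""
    groups = defaultdict(list)
    for w in words:
        groups[freq_key(w)].append(w)
    return groups


words = ["stack", "tacks", "sackt", "cat", "act", "dog"]
groups = group_anagrams(words)
print(len(groups))  # number of distinct anagram classes in the list
```

The number of distinct keys is the number of anagram classes, which is one reading of "unique anagrams" in the question. The ordering only reduces the typical size of the product: with a plain a-z assignment, a word of ten z's already gives 101**10, which is larger than the maximum signed 64-bit value.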

UPDATE: an implementation can be found here