How Do I Choose Between a Hash Table and a Trie (Prefix Tree)?

Justin Bozonier · Oct 29, 2008 · Viewed 54.9k times

If I have to choose between a hash table and a prefix tree, what are the discriminating factors that would lead me to choose one over the other? From my own naive point of view, a trie seems to carry some extra overhead since it isn't stored as an array, but in terms of run time (assuming the longest key is the longest English word) a lookup can be essentially O(1), because the key length has a fixed upper bound. Maybe the longest English word is 50 characters?

Hash tables give instant lookup once you have the index. Hashing the key to get that index, however, seems like it could easily take close to 50 steps.
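To make the "near 50 steps" intuition concrete, here is a minimal sketch of a string hash. It uses 32-bit FNV-1a purely as an illustration; the function name `fnv1a_32` and the bucket count of 1024 are assumptions for this example, and real hash table implementations may use a different hash function entirely.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(key: str) -> int:
    """Hash a string with 32-bit FNV-1a; the loop runs once per byte of the key."""
    h = FNV_OFFSET_BASIS
    for byte in key.encode("utf-8"):
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the result within 32 bits
    return h


def bucket_index(key: str, num_buckets: int = 1024) -> int:
    """Map a key to a bucket; num_buckets is an arbitrary illustrative size."""
    return fnv1a_32(key) % num_buckets
```

The work to compute the index grows with the key length, so both structures read the key once on a successful lookup; a hash table also compares the stored key against the probe to confirm the match.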

Can someone provide me a more experienced perspective on this? Thanks!

Answer

Darius Bacon · Oct 29, 2008

Advantages of tries:

The basics:

  • Predictable O(k) lookup time where k is the size of the key
  • A lookup can take fewer than k steps if the key is absent, because the search stops at the first missing branch
  • Supports ordered traversal
  • No need for a hash function
  • Deletion is straightforward
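The basics above can be sketched with a small dictionary-based trie. This is an illustrative sketch, not a tuned implementation: the class names `TrieNode` and `Trie`, and the choice to have `lookup` return a step count alongside the value, are assumptions made for this example.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # maps one character to the child node
        self.value = None
        self.has_value = False  # distinguishes "no key ends here" from a stored None


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, key, value):
        node = self.root
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
        node.value = value
        node.has_value = True

    def lookup(self, key):
        """Return (value_or_None, steps), where steps counts characters examined."""
        node = self.root
        steps = 0
        for ch in key:
            steps += 1
            node = node.children.get(ch)
            if node is None:
                return None, steps  # miss: stopped before reading the whole key
        return (node.value if node.has_value else None), steps

    def items(self):
        """Yield (key, value) pairs in lexicographic key order."""
        def walk(node, prefix):
            if node.has_value:
                yield prefix, node.value
            for ch in sorted(node.children):
                yield from walk(node.children[ch], prefix + ch)
        yield from walk(self.root, "")

    def delete(self, key):
        """Remove key and prune branches that no longer lead to any value."""
        path = [self.root]
        for ch in key:
            nxt = path[-1].children.get(ch)
            if nxt is None:
                return False
            path.append(nxt)
        if not path[-1].has_value:
            return False
        path[-1].has_value = False
        path[-1].value = None
        # Walk back up, dropping nodes that are now empty.
        for i in range(len(key) - 1, -1, -1):
            child = path[i + 1]
            if child.children or child.has_value:
                break
            del path[i].children[key[i]]
        return True


trie = Trie()
for word in ["tea", "ten", "to", "inn"]:
    trie.insert(word, len(word))
```

With these four words stored, looking up a long absent key such as "xylophone" stops after examining a single character, and `items()` yields the keys in sorted order without any separate sorting pass over the whole key set.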

New operations:

  • You can quickly look up prefixes of keys, enumerate all entries with a given prefix, etc.
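The prefix operations can be sketched as follows, using nested dictionaries as trie nodes. The `END` sentinel and the function names `keys_with_prefix` and `prefixes_of` are hypothetical choices for this example rather than any standard API.

```python
END = object()  # sentinel dict key marking "a stored key ends at this node"


def insert(root, key):
    node = root
    for ch in key:
        node = node.setdefault(ch, {})
    node[END] = True


def keys_with_prefix(root, prefix):
    """Return all stored keys that start with prefix, in sorted order."""
    node = root
    for ch in prefix:
        node = node.get(ch)
        if node is None:
            return []
    found = []

    def walk(n, so_far):
        if END in n:
            found.append(so_far)
        for ch in sorted(k for k in n if k is not END):
            walk(n[ch], so_far + ch)

    walk(node, prefix)
    return found


def prefixes_of(root, text):
    """Return every stored key that is a prefix of text, shortest first."""
    node, matches = root, []
    if END in node:
        matches.append("")
    for i, ch in enumerate(text):
        node = node.get(ch)
        if node is None:
            break
        if END in node:
            matches.append(text[:i + 1])
    return matches


root = {}
for word in ["car", "card", "care", "cat", "dog"]:
    insert(root, word)
```

Both functions only touch the part of the structure reachable from the prefix. A plain hash table would have to scan every key to answer the same questions, unless extra indexes were maintained.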

Advantages of linked structure:

  • If there are many common prefixes, the space they require is shared.
  • Immutable tries can share structure. Instead of updating a trie in place, you can build a new one that's different only along one branch, elsewhere pointing into the old trie. This can be useful for concurrency, multiple simultaneous versions of a table, etc.
  • An immutable trie is compressible. That is, it can share structure on the suffixes as well, by hash-consing.
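Structure sharing between versions can be illustrated with a small persistent trie that copies only the path being changed. This is a sketch under stated assumptions: the `Node` layout, the `EMPTY` constant, and the convention that `None` means "no value stored" are all choices made for this example, and it covers path copying only, not suffix sharing via hash-consing.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    value: Optional[object]  # None means "no value stored here" in this sketch
    children: tuple          # tuple of (char, Node) pairs, sorted by char


EMPTY = Node(None, ())


def child(node, ch):
    """Return the child reached by ch, or None if there is no such branch."""
    for c, n in node.children:
        if c == ch:
            return n
    return None


def insert(node, key, value):
    """Return a new trie containing key; the original trie is left untouched."""
    if not key:
        return Node(value, node.children)
    ch, rest = key[0], key[1:]
    old = child(node, ch)
    if old is None:
        old = EMPTY
    new = insert(old, rest, value)
    # Reuse every sibling subtree as-is; only the branch for ch is rebuilt.
    others = tuple((c, n) for c, n in node.children if c != ch)
    merged = sorted(others + ((ch, new),), key=lambda pair: pair[0])
    return Node(node.value, tuple(merged))


def lookup(node, key):
    for ch in key:
        node = child(node, ch)
        if node is None:
            return None
    return node.value


v1 = insert(insert(EMPTY, "tea", 1), "dog", 2)
v2 = insert(v1, "ten", 3)  # v1 remains valid and unchanged
```

After the second insert, `v1` and `v2` are both usable tables. The whole "dog" branch is the very same object in both versions, and only the nodes along the path t, e, n were newly allocated.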

Advantages of hashtables:

  • Everyone knows hashtables, right? Your system will already have a nice well-optimized implementation, faster than tries for most purposes.
  • Your keys need not have any special structure.
  • More space-efficient than the obvious linked trie structure (see comments below)
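As a small illustration of the point about key structure, Python's built-in `dict` accepts any hashable key, including keys that do not decompose naturally into a sequence of symbols the way a trie requires. The specific keys below are arbitrary examples.

```python
table = {}
table[(3, 4)] = "point"               # tuple key
table[frozenset({"a", "b"})] = "set"  # unordered collection as a key
table[42] = "int"
table[3.5] = "float"

# Lookup needs only hashing and equality, not an ordering or a prefix relation.
same_set = frozenset({"b", "a"})
found = table[same_set]
```

A trie over these keys would first need an agreed way to serialize each one into a symbol sequence, which is extra design work the hash table avoids.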