How are hash tables implemented internally in popular languages?

CDR picture CDR · May 24, 2009 · Viewed 8.4k times · Source

Can someone please shed some light on how popular languages like Python, Ruby implements hash tables internally for symbol lookup? Do they use the classic "array with linked-list" method, or use a balanced tree?

I need a simple (fewer LOC) and fast method for indexing the symbols in a DSL written in C. Was wondering what others have found most efficient and practical.

Answer

zvr picture zvr · May 24, 2009

The classic "array of hash buckets" you mention is used in every implementation I've seen.

One of the most educative versions is the hash implementation in the Tcl language, in file tcl/generic/tclHash.c. More than half of the lines in the file are comments explaining everything in detail: allocation, search, different hash table types, strategies, etc. Sidenote: the code implementating the Tcl language is really readable.