What is the benefit of Keyword Lists?

greggreg picture greggreg · Jan 27, 2015 · Viewed 9.6k times · Source

In elixir we have Maps:

> map = %{:a => "one", :b => "two"} # = %{a: "one", b: "two"}
> map.a                             # = "one"
> map[:a]                           # = "one"

We also have Keyword Lists:

> kl = [a: "one", b: "two"]       # = [a: "one", b: "two"]
> kl2 = [{:a, "one"},{:b, "two"}] # = [a: "one", b: "two"]
> kl == kl2                       # = true
> kl[:a]                          # = "one"
> kl.a                            # = ** (ArgumentError)

Why both?

Syntax? Is it because Keyword Lists have a more flexible syntax allowing them to be defined without curlys and even without brackets as the last param of a function call? Then why not give Maps this syntactic sugar?

Duplicate Keys? Is it because Keyword Lists can have duplicate keys? Why would you want both Map style access and duplicate keys?

Performance? Is it because Keyword Lists have better performance? Then why have Maps? And shouldn't maps be more performant at looking up members by key than a list of tuples?

JS Array and Ruby Hash like appearance? Is that it?

I understand that structurally they are different data representations. To me it seems that Keyword Lists in elixir serve to complicate the language through exceptional syntax (3 different syntactic variants), use case overlap with maps, and an unclear benefit.

What is the benefit of using Keyword Lists?

Answer

Patrick Oscity picture Patrick Oscity · Jan 27, 2015
                   ┌──────────────┬────────────┬───────────────────────┐
                   │ Keyword List │ Map/Struct │ HashDict (deprecated) │
┌──────────────────┼──────────────┼────────────┼───────────────────────┤
│ Duplicate keys   │ yes          │ no         │ no                    │
│ Ordered          │ yes          │ no         │ no                    │
│ Pattern matching │ yes          │ yes        │ no                    │
│ Performance¹     │ —            │ —          │ —                     │
│ ├ Insert         │ very fast²   │ fast³      │ fast⁴                 │
│ └ Access         │ slow⁵        │ fast³      │ fast⁴                 │
└──────────────────┴──────────────┴────────────┴───────────────────────┘

Keyword lists are lightweight and have a simple structure underneath them, which makes them very flexible. You can think of them as syntax sugar on top of an Erlang convention, making it easy to interface with Erlang without writing too ugly code. For example, keyword lists are used to represent function arguments, which is a property inherited from Erlang. In some cases, keyword lists are your only choice, especially if you need duplicate keys or ordering. They simply have different properties than the other alternatives, which make them more suitable for some situations and less for others.

Maps (and Structs) are used to store actual payload data, since they have a hash-based implementation. Keyword lists internally are just lists that need to be traversed for each operation, so they don't have the properties of classic key-value data structures like constant time access.

Elixir also introduced HashDict as a workaround for the poor performance of maps at the time it was written. However, this is fixed now as of Elixir 1.0.5/Erlang 18.0 and HashDict will be deprecated in future versions.

If you dig deeper into the Erlang standard library, there are even more data structures that store key/value pairs:

  • proplists – similar to Elixir keyword lists
  • maps – same as Elixir maps
  • dict – key-value dictionaries built from Erlang primitives
  • gb_trees – general balanced tree

You also have these options when you need to store key/value pairs across multiple processes and/or VMs:

  • ets/dets – (disk based) Erlang term storage
  • mnesia – distributed database

¹ Generally speaking, but of course it depends™.

² Best case is just prepending to a list.

³ Applies to Elixir 1.0.5 and above, may be slower in older versions.

HashDict is now being deprecated.

⁵ Requires a linear search which on average scans half of the elements.