Why does std::unordered_map have a reserve method?

Mikubyte picture Mikubyte · Feb 21, 2017 · Viewed 9.2k times · Source

According to this you cannot reserve space for std::map:

No, the members of the map are internally stored in a tree structure. There is no way to build the tree until you know the keys and values that are to be stored.

From this it is obvious why std::map would lack a reserve() method, which it does on cppreference.com. However, std::unordered_map does have a reserve() method, but when I try to use it with operator[], insert() or emplace() they all go to allocate memory despite me having called reserve() first.

What's up with this? Why won't reserve() properly reserve the required space? And if it's as with the map that you cannot allocate memory beforehand, then why does std::unordered_map even have a reserve() method in the first place?

Answer

ssell picture ssell · Feb 21, 2017

The unordered_map container has a reserve method because it is implemented using buckets, and not a tree as in map.

A bucket is:

a slot in the container's internal hash table to which elements are assigned based on the hash value of their key. Buckets are numbered from 0 to (bucket_count-1). (source)

A single bucket holds a variable number of items. This number is based on the load_factor. When the load_factor reaches a certain threshold, the container increases the number of buckets and rehashes the map.

When you call reserve(n), the container creates enough buckets to hold at least n items.

This is in contrast to rehash(n) which directly sets the number of buckets to n and triggers a rebuild of the entire hash table.

See also: Pre-allocating buckets in a C++ unordered_map

Edit in Response to Comments

As I do not know the exact answer to the question posed in the comments, and as my preliminary research did not prove fruitful, I decided to test it experimentally.

For reference, the question boils down to:

Could you please explain if reserving buckets for n elements is the same as allocating memory for n elements?

According to this answer, accurately retrieving the size of the allocated space in an unordered_map is tricky and unreliable. So I decided to make use of Visual Studio 2015's diagnostic tools.

First, my test case is as follows:

#include <unordered_map>
#include <cstdint>

struct Foo
{
    Foo() : x(0.0f), y(0.0f), z(0.0f) { }

    float x;
    float y;
    float z;
};

int32_t main(int32_t argc, char** argv)
{
    std::unordered_map<uint32_t, Foo> mapNoReserve;
    std::unordered_map<uint32_t, Foo> mapReserve;

    // --> Snapshot A

    mapReserve.reserve(1000);

    // --> Snapshot B

    for(uint32_t i = 0; i < 1000; ++i)
    {
        mapNoReserve.insert(std::make_pair(i, Foo()));
        mapReserve.insert(std::make_pair(i, Foo()));
    }

    // -> Snapshot C

    return 0;
}

Where the comments indicate, I took a memory snapshot.

The results were as follows:

Snapshot A:

┌──────────────┬──────────────┬──────────────┐
|     Map      | Size (Bytes) | Bucket Count |
|--------------|--------------|--------------|
| mapNoReserve | 64           | 8            |
| mapReserve   | 64           | 8            |
└──────────────┴──────────────┴──────────────┚

Snapshot B:

┌──────────────┬──────────────┬──────────────┐
|     Map      | Size (Bytes) | Bucket Count |
|--------------|--------------|--------------|
| mapNoReserve | 64           | 8            |
| mapReserve   | 8231         | 1024         |
└──────────────┴──────────────┴──────────────┚

Snapshot C:

┌──────────────┬──────────────┬──────────────┐
|     Map      | Size (Bytes) | Bucket Count |
|--------------|--------------|--------------|
| mapNoReserve | 24024        | 1024         |
| mapReserve   | 24024        | 1024         |
└──────────────┴──────────────┴──────────────┚

Interpretation:

As you can see from the snapshot, it appears that both maps grow in size once we start adding elements to them, even the one that had called reserve.

So does reserve offer a benefit even though memory is still allocated? I would say yes for two reasons: (1) It pre-allocates the memory for the buckets, and (2) it can prevent the need of a rehash, which as discussed earlier completely rebuilds the map.