Is there a more efficient implementation for a bidirectional map?

Vittorio Romeo picture Vittorio Romeo · Feb 13, 2014 · Viewed 32k times · Source

I created a simple bidirectional map class that works by internally storing two std::map instances, with opposite key/value types, and providing an user-friendly interface:

template<class T1, class T2> class Bimap
{
    std::map<T1, T2> map1;
    std::map<T2, T1> map2;
    // ...
};
  • Is there a more efficient method of implementing a bidirectional map that doesn't require twice the memory?

  • How is a bimap usually implemented?


EDIT:

  • Should bimap element be mutable or immutable? (Changing one element in map1 should change the key in map2, but keys are const and that's impossible - what's the solution?)

  • Ownership of elements is also another problem: when an user inserts a key-value pair in the bimap, the bimap should make a copy of that key-value pair and store it, then the internal second map (with inverted key/value) should not copy but point to the original pair. How can this be achieved?


EDIT 2:

I've posted a possible implementation I made on Code Review.

Answer

example picture example · Feb 20, 2014

There is a certain problem with double-storing your data in all simple implementations of a bimap. If you can break it down to a bimap of pointers from outside, then you can readily ignore this and simply keep both maps of the form std::map<A*,B*> like Arkaitz Jimenez already suggested (though contrary to his answer you have to care about the storage from outside to avoid a A->A* lookup). But if you have the pointers anyway, why not simply store a std::pair<A,B> at the point where you would otherwise store A and B separately?

It would be nice to have std::map<A,B*> instead of std::map<A*,B*> as this would allow for example the lookup of an element associated to an string by a newly created string with the same content instead of the pointer to the original string that created the pair. But it is customary to store a full copy of the key with every entry and only rely on the hash to find the right bucket. This way the returned item will be the correct one even in the case of a hash-collision...

If you want to have it quick and dirty though, there is this

hackish solution:

Create two maps std::map<size_t, A> mapA and std::map<size_t, B> mapB. Upon insertion hash both elements that are to be inserted to get the keys to the respective maps.

void insert(const A &a, const B &b) {
    size_t hashA = std::hash<A>(a);
    size_t hashB = std::hash<B>(b);

    mapA.insert({hashB, a});
    mapB.insert({hashA, b});
}

Lookup is implemented analogously.

Using a multimap instead of a map and verifying every element you get with a lookup in the respectively other map (get candidate b from mapA, hash b and look in mapB if it matches the wanted key, iterate to the next candidate b otherwise) this is a valid implementation - but still hackish in my opinion...

You can get a much nicer solution by using the copies of the elements that are used to compare the entries (see above) as only storage. It is a bit harder to get your head around that though. To elaborate:

a nicer solution:

Create two sets of pairs as std::set<pair<A, B*>> and std::set<pair<B, A*>> and overload the operator< and operator== to only take the first element of the pairs into account (or provide an corresponding comparion class). It is necessary to create sets of pairs instead of maps (which internally look similarly) because we need a guarantee that A and B will be at constant positions in memory. Upon insertion of an pair<A, B> we split it into two elements that fit into the above sets.

  std::set<pair<B, A*>> mapA;
  std::set<pair<A, B*>> mapB;

  void insert(const A &a, const B &b) {
      auto aitr = mapA.insert({b, nullptr}).first; // creates first pair
      B *bp = &(aitr->first);  // get pointer of our stored copy of b
      auto bitr = mapB.insert({a, bp}).first; 
      // insert second pair {a, pointer_to_b}
      A *ap = &(bitr->first);  // update pointer in mapA to point to a
      aitr->second = ap;
  }

Lookup can now simply be done by a simple std::set lookup and a pointer dereference.

This nicer solution is similar to the solution that boost uses - even though they use some annonymized pointers as second elements of the pairs and thus have to use reinterpret_casts.

Note that the .second part of the pairs need to be mutable (so I'm not sure std::pair can be used), or you have to add another layer of abstraction (std::set<pair<B, A**>> mapA) even for this simple insertion. In both solutions you need temporary elements to return non-const references to elements.