Binary Search Tree Implementation in C++ STL?

Bunkai.Satori picture Bunkai.Satori · Feb 23, 2011 · Viewed 56.2k times · Source

Do you know, please, if C++ STL contains a Binary Search Tree (BST) implementation, or if I should construct my own BST object?

In case STL conains no implementation of BST, are there any libraries available?

My goal is to be able to find the desired record as quickly as possible: I have a list of records (it should not be more few thousands.), and I do a per-frame (its a computer game) search in that list. I use unsigned int as an identifier of the record of my interest. Whatever way is the fastest will work best for me.

Answer

sbi picture sbi · Feb 23, 2011

What you need is a way to look up some data given a key. With the key being an unsigned int, this gives you several possibilities. Of course, you could use a std::map:

typedef std::map<unsigned int, record_t> my_records;

However, there's other possibilities as well. For example, it's quite likely that a hash map would be even faster than a binary tree. Hash maps are called unordered_map in C++, and are a part of the C++11 standard, likely already supported by your compiler/std lib (check your compiler version and documentation). They were first available in C++TR1 (std::tr1::unordered_map)

If your keys are rather closely distributed, you might even use a simple array and use the key as an index. When it comes to raw speed, nothing would beat indexing into an array. OTOH, if your key distribution is too random, you'd be wasting a lot of space.

If you store your records as pointers, moving them around is cheap, and an alternative might be to keep your data sorted by key in a vector:

typedef std::vector< std::pair<unsigned int, record_t*> > my_records;

Due to its better data locality, which presumably plays nice with processor cache, a simple std::vector often performs better than other data structures which theoretically should have an advantage. Its weak spot is inserting into/removing from the middle. However, in this case, on a 32bit system, this would require moving entries of 2*32bit POD around, which your implementation will likely perform by calling CPU intrinsics for memory move.