Efficient linked list in C++?

Leedehai picture Leedehai · Aug 16, 2017 · Viewed 8.3k times · Source

This document says std::list is inefficient:

std::list is an extremely inefficient class that is rarely useful. It performs a heap allocation for every element inserted into it, thus having an extremely high constant factor, particularly for small data types.

Comment: that is to my surprise. std::list is a doubly linked list, so despite its inefficiency in element construction, it supports insert/delete in O(1) time complexity, but this feature is completely ignored in this quoted paragraph.

My question: Say I need a sequential container for small-sized homogeneous elements, and this container should support element insert/delete in O(1) complexity and does not need random access (though supporting random access is nice, it is not a must here). I also don't want the high constant factor introduced by heap allocation for each element's construction, at least when the number of element is small. Lastly, iterators should be invalidated only when the corresponding element is deleted. Apparently I need a custom container class, which might (or might not) be a variant of doubly linked list. How should I design this container?

If the aforementioned specification cannot be achieved, then perhaps I should have a custom memory allocator, say, bump pointer allocator? I know std::list takes an allocator as its second template argument.

Edit: I know I shouldn't be too concerned with this issue, from an engineering standpoint - fast enough is good enough. It is just a hypothetical question so I don't have a more detailed use case. Feel free to relax some of the requirements!

Edit2: I understand two algorithms of O(1) complexity can have entirely different performance due to the difference in their constant factors.

Answer

Useless picture Useless · Aug 16, 2017

Your requirements are exactly those of std::list, except that you've decided you don't like the overhead of node-based allocation.

The sane approach is to start at the top and only do as much as you really need:

  1. Just use std::list.

    Benchmark it: is the default allocator really too slow for your purposes?

    • No: you're done.

    • Yes: goto 2

  2. Use std::list with an existing custom allocator such as the Boost pool allocator

    Benchmark it: is the Boost pool allocator really too slow for your purposes?

    • No: you're done.

    • Yes: goto 3

  3. Use std::list with a hand-rolled custom allocator finely tuned to your unique needs, based on all the profiling you did in steps 1 and 2

    Benchmark as before etc. etc.

  4. Consider doing something more exotic as a last resort.

    If you get to this stage, you should have a really well-specified SO question, with lots of detail about exactly what you need (eg. "I need to squeeze n nodes into a cacheline" rather than "this doc said this thing is slow and that sounds bad").


PS. The above makes two assumptions, but both are worth investigation:

  1. as Baum mit Augen points out, it's not sufficient to do simple end-to-end timing, because you need to be sure where your time is going. It could be the allocator itself, or cache misses due to the memory layout, or something else. If something's slow, you still need to be sure why before you know what ought to change.
  2. your requirements are taken as a given, but finding ways to weaken requirements is often the easiest way to make something faster.

    • do you really need constant-time insertion and deletion everywhere, or only at the front, or the back, or both but not in the middle?
    • do you really need those iterator invalidation constraints, or can they be relaxed?
    • are there access patterns you can exploit? If you're frequently removing an element from the front and then replacing it with a new one, could you just update it in-place?