C++11 lambda implementation and memory model

Steve · Aug 30, 2012

I would like some information on how to correctly think about C++11 closures and std::function in terms of how they are implemented and how memory is handled.

Although I don't believe in premature optimisation, I do have a habit of carefully considering the performance impact of my choices while writing new code. I also do a fair amount of real-time programming, e.g. on microcontrollers and for audio systems, where non-deterministic memory allocation/deallocation pauses are to be avoided.

Therefore I'd like to develop a better understanding of when to use or not use C++ lambdas.

My current understanding is that a lambda with no captured closure is exactly like a C callback. However, when the environment is captured either by value or by reference, an anonymous object is created on the stack. When a value-closure must be returned from a function, one wraps it in std::function. What happens to the closure memory in this case? Is it copied from the stack to the heap? Is it freed whenever the std::function is freed, i.e., is it reference-counted like a std::shared_ptr?

I imagine that in a real-time system I could set up a chain of lambda functions, passing B as a continuation argument to A, so that a processing pipeline A->B is created. In this case, the A and B closures would be allocated once, although I'm not sure whether they would be allocated on the stack or the heap. In general, however, this seems safe to use in a real-time system. On the other hand, if B constructs some lambda function C, which it returns, then the memory for C would be allocated and deallocated repeatedly, which would not be acceptable for real-time usage.

In pseudo-code, here is a DSP loop that I think is going to be real-time safe. I want to perform processing block A and then B, where A calls its argument. Both these functions return std::function objects, so f will be a std::function object whose environment is stored on the heap:

auto f = A(B);  // A returns a function which calls B
                // Memory for the function returned by A is on the heap?
                // Note that A and B may maintain a state
                // via mutable value-closure!
for (int t = 0; t < 1000; t++) {
    y = f(t);
}

And one which I think might be bad to use in real-time code:

for (int t = 0; t < 1000; t++) {
    y = A(B)(t);  // a new function object is built on every iteration
}

And one where I think stack memory is likely used for the closure:

double freq = 220;
double A = 2;
for (int t = 0; t < 1000; t++) {
    // Construct the closure and call it immediately
    y = [=](int t){ return sin(t*freq)*A; }(t);
}

In the latter case the closure is constructed at each iteration of the loop, but unlike the previous example it is cheap: it is just like a function call, and no heap allocations are made. Moreover, I wonder whether a compiler could "lift" the closure and make inlining optimisations.

Is this correct? Thank you.

Answer

Nicol Bolas · Aug 30, 2012

My current understanding is that a lambda with no captured closure is exactly like a C callback. However, when the environment is captured either by value or by reference, an anonymous object is created on the stack.

No; it is always a C++ object of a unique, unnamed type, created on the stack. A capture-less lambda can be converted into a function pointer (though whether it is suitable for C calling conventions is implementation-dependent), but that doesn't mean it is a function pointer.
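To illustrate the conversion, here is a minimal sketch. The names apply_callback and demo_captureless_conversion are hypothetical and exist only for this example; apply_callback stands in for any API that expects a plain function pointer.

```cpp
#include <cassert>

// Hypothetical C-style API that accepts a plain function pointer.
int apply_callback(int (*callback)(int), int value) {
    return callback(value);
}

int demo_captureless_conversion() {
    // The lambda is still an object of its own unnamed class type...
    auto twice = [](int x) { return 2 * x; };

    // ...but because it captures nothing, it converts implicitly
    // to a pointer to a function with the same signature.
    int (*as_pointer)(int) = twice;
    assert(as_pointer(4) == twice(4));

    // The same conversion happens when passing it to the API.
    return apply_callback(twice, 21);
}
```

A lambda that captures anything has no such conversion, so passing it to apply_callback would fail to compile.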

When a value-closure must be returned from a function, one wraps it in std::function. What happens to the closure memory in this case?

A lambda isn't anything special in C++11. It's an object like any other object. A lambda expression results in a temporary, which can be used to initialize a variable on the stack:

auto lamb = []() {return 5;};

lamb is a stack object. It has a constructor and a destructor, and it follows all of the usual C++ rules for objects. The type of lamb will contain the values/references that are captured; they will be members of that object, just like the members of an object of any other class type.
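One way to picture this is to compare a capturing lambda with a hand-written function object. The struct ScaleByValue below is a hypothetical approximation of what a compiler might generate, not the actual generated type, which has no name you can spell.

```cpp
#include <cassert>

// Illustrative stand-in for the closure type of the lambda below.
struct ScaleByValue {
    int factor;  // a by-value capture becomes a data member
    int operator()(int x) const { return x * factor; }
};

int demo_closure_members() {
    int factor = 3;

    // Closure object on the stack, holding its own copy of factor.
    auto lambda = [factor](int x) { return x * factor; };

    // Hand-written equivalent, also on the stack.
    ScaleByValue functor{factor};

    // Changing the original variable affects neither copy.
    factor = 100;
    assert(lambda(5) == functor(5));
    return lambda(5);
}
```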

You can give it to a std::function:

auto func_lamb = std::function<int()>(lamb);

In this case, it will get a copy of the value of lamb. If lamb had captured anything by value, there would be two copies of those values; one in lamb, and one in func_lamb.
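A small sketch can make the two copies visible. It assumes a lambda with a mutable by-value capture, so each copy carries its own counter; the function name demo_function_copies_closure is hypothetical.

```cpp
#include <cassert>
#include <functional>

int demo_function_copies_closure() {
    int count = 0;
    // mutable lets the closure modify its own copy of count.
    auto lamb = [count]() mutable { return ++count; };

    // func_lamb stores a copy of lamb, including its captured count (0).
    std::function<int()> func_lamb(lamb);

    lamb();  // lamb's count becomes 1
    lamb();  // lamb's count becomes 2

    // The copy inside func_lamb was not touched by the calls above.
    int first_from_copy = func_lamb();
    assert(first_from_copy == 1);

    // lamb continues from its own state.
    assert(lamb() == 3);
    return first_from_copy;
}
```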

When the current scope ends, func_lamb will be destroyed, followed by lamb, as per the rules of cleaning up stack variables.

You could just as easily allocate one on the heap:

auto func_lamb_ptr = new std::function<int()>(lamb);

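Exactly where the memory for the contents of a std::function goes is implementation-dependent. The type erasure employed by std::function can require a memory allocation, although implementations are encouraged to avoid one for small callables such as function pointers and std::reference_wrapper objects. This is why std::function's constructor can take an allocator in C++11 (those constructor overloads were removed in C++17).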
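If you want to check what your own standard library does, one rough approach is to count calls to the global operator new while a std::function is constructed. The sketch below is a measurement aid for experiments only: g_allocations and allocations_when_wrapped are hypothetical names, the replaced operator new is not something to ship in real-time code, and the counts are implementation-dependent (an optimising compiler may even elide some allocations).

```cpp
#include <array>
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>

// Counts calls to the replaced global operator new (experiment only).
static std::size_t g_allocations = 0;

void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many allocations were observed while wrapping callable.
template <typename Callable>
std::size_t allocations_when_wrapped(Callable callable) {
    const std::size_t before = g_allocations;
    std::function<int()> wrapped(callable);
    const std::size_t after = g_allocations;
    (void)wrapped();  // use the wrapper so it is not flagged as unused
    return after - before;
}

std::size_t allocations_for_small_capture() {
    int x = 1;
    return allocations_when_wrapped([x]() { return x; });
}

std::size_t allocations_for_large_capture() {
    std::array<int, 64> big{};  // 64 ints, likely too large for an internal buffer
    return allocations_when_wrapped([big]() { return big[0]; });
}
```

On implementations with a small-buffer optimisation, the small capture will often report zero allocations and the large one at least one, but neither result is guaranteed by the standard.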

Is it freed whenever the std::function is freed, i.e., is it reference-counted like a std::shared_ptr?

std::function stores a copy of its contents. Like virtually every C++ standard library type, std::function uses value semantics. Thus, it is copyable; when it is copied, the new function object is completely separate. It is also movable, so any internal allocations can be transferred appropriately without further allocation or copying.

Thus there is no need for reference counting.
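The difference from std::shared_ptr shows up when the stored callable has state. In this sketch (demo_value_semantics is a hypothetical name), a copy gets its own counter, while a move takes over the existing one.

```cpp
#include <cassert>
#include <functional>
#include <utility>

int demo_value_semantics() {
    int count = 0;
    std::function<int()> original = [count]() mutable { return ++count; };

    // Copying duplicates the stored closure; nothing is shared.
    std::function<int()> copy = original;

    original();  // original's count becomes 1
    original();  // original's count becomes 2
    assert(copy() == 1);  // the copy still had its own count of 0

    // Moving transfers the stored closure and its current state.
    std::function<int()> moved = std::move(original);
    // original is now in a valid but unspecified state;
    // do not call it again without assigning a new target first.

    return moved();  // continues from the count of 2
}
```

With reference counting, copy and original would have shared one counter; with value semantics each object owns its state and cleans it up in its own destructor.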

Everything else you state is correct, assuming that "memory allocation" equates to "bad to use in real-time code".
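As a concrete version of the first loop from the question, here is a minimal sketch. Stage, make_A, make_B and run_pipeline_built_once are hypothetical names, and the arithmetic inside the stages is made up for illustration. The point is only where construction happens relative to the loop.

```cpp
#include <functional>

// Assumed signature for one processing stage.
using Stage = std::function<double(double)>;

// Hypothetical stage B: a simple gain.
Stage make_B() {
    return [](double x) { return 0.5 * x; };
}

// Hypothetical stage A: adds an offset, then calls its continuation.
Stage make_A(Stage next) {
    return [next](double x) { return next(x + 1.0); };
}

double run_pipeline_built_once(int samples) {
    // Any allocation made by std::function happens here, before the loop.
    Stage f = make_A(make_B());

    double y = 0.0;
    for (int t = 0; t < samples; ++t) {
        // Only invokes the stored callables; no std::function is constructed here.
        y = f(t);
    }
    return y;
}
```

Writing make_A(make_B())(t) inside the loop instead would rebuild both std::function objects on every iteration, which is the pattern the question identifies as unsuitable for real-time code.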