Lightweight memory leak debugging on Linux

glagolig · Aug 27, 2013 · Viewed 27.3k times

I looked for existing answers first and saw that Valgrind is everyone’s favorite tool for memory leak debugging on Linux. Unfortunately, Valgrind does not seem to work for my purposes. I will try to explain why.

Constraints:

  • The leak reproduces only in the customer’s environment. Due to certain legal restrictions we have to work with the existing binary. No rebuilds.
  • In a regular environment our application consumes ~10% CPU. Say we can tolerate up to a 10x increase in CPU usage. Valgrind with default memcheck settings does much worse, making our application unresponsive for long periods of time.

What I need is an equivalent of Microsoft’s UMDH: turn on stack tracing for each heap allocation, then at a certain point in time dump all allocations grouped by stack and ordered by allocation count in descending order. Our app ships on both Windows and Linux, so I know that performance on Windows under UMDH is still tolerable.
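For concreteness, the UMDH-style report described above can be modeled as a table keyed by call stack. What follows is only a sketch under stated assumptions: the struct, the function names, and the fixed limits are hypothetical, the linear search would need to become a hash table in a real tool, and nothing here is thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEPTH 8     /* frames kept per stack (assumed limit) */
#define MAX_SITES 256   /* distinct stacks tracked (assumed limit) */

/* One row of the report: a unique call stack and its live allocations. */
struct site {
    void  *frames[MAX_DEPTH];
    int    depth;
    size_t count;   /* outstanding allocations made from this stack */
    size_t bytes;   /* outstanding bytes allocated from this stack */
};

static struct site sites[MAX_SITES];
static size_t nsites;

/* Find the row for a stack, creating it if needed; NULL when the table is full. */
static struct site *site_for(void *const *frames, int depth)
{
    assert(depth >= 0 && depth <= MAX_DEPTH);
    for (size_t i = 0; i < nsites; i++)
        if (sites[i].depth == depth &&
            memcmp(sites[i].frames, frames, (size_t)depth * sizeof *frames) == 0)
            return &sites[i];
    if (nsites == MAX_SITES)
        return NULL;
    struct site *s = &sites[nsites++];
    memcpy(s->frames, frames, (size_t)depth * sizeof *frames);
    s->depth = depth;
    return s;
}

/* Count one allocation; the caller keeps the returned row next to the pointer. */
static struct site *record_alloc(void *const *frames, int depth, size_t size)
{
    struct site *s = site_for(frames, depth);
    if (s != NULL) { s->count++; s->bytes += size; }
    return s;
}

/* Undo one allocation when the pointer it produced is freed. */
static void record_free(struct site *s, size_t size)
{
    if (s != NULL && s->count > 0) { s->count--; s->bytes -= size; }
}

/* qsort comparator: larger live counts sort first. */
static int by_count_desc(const void *a, const void *b)
{
    const struct site *x = a, *y = b;
    return (x->count < y->count) - (x->count > y->count);
}

/* Copy up to cap rows into out[] and sort them by live count, largest first.
   Pass cap >= MAX_SITES to be sure no stack is left out of the report. */
static size_t snapshot(struct site *out, size_t cap)
{
    size_t n = nsites < cap ? nsites : cap;
    memcpy(out, sites, n * sizeof *out);
    qsort(out, n, sizeof *out, by_count_desc);
    return n;
}
```

Sorting a copy rather than the table itself keeps the row pointers handed out by record_alloc() valid, so a dump can be taken while the process keeps running.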

Here are the tools/methods I considered:

  • Valgrind's memcheck and massif tools: they do much more than needed (like scanning the whole process memory for every allocation pointer), they are too slow, and they still don’t do exactly what I need (dump call stacks sorted by counts), so I would have to write some scripts to parse the output.
  • dmalloc library (dmalloc.com): requires a new binary.
  • LeakTracer (http://www.andreasen.org/LeakTracer/): works only with C++ new/delete (I need malloc/free as well) and does not have group-by-stack and sort functionality.
  • Implementing the tool myself as a .so library using the LD_PRELOAD mechanism (see "Overriding 'malloc' using the LD_PRELOAD mechanism"): that would take at least a week given my coding-for-Linux skills, and it feels like reinventing the wheel.

Did I miss anything? Are there any lightweight Valgrind options or existing LD_PRELOAD tools?

Answer

DanielKO · Aug 27, 2013

GNU libc has built-in malloc debugging:

http://www.gnu.org/software/libc/manual/html_node/Allocation-Debugging.html

Use LD_PRELOAD to call mtrace() from your own .so:

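#include <mcheck.h>

/* Runs when the preloaded library is loaded, before the target's main(). */
static void prepare(void) __attribute__((constructor));
static void prepare(void)
{
    mtrace();  /* start logging allocations to the file named by $MALLOC_TRACE */
}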

Compile it with:

gcc -shared -fPIC dbg.c -o dbg.so

Run it with:

export MALLOC_TRACE=out.txt
LD_PRELOAD=./dbg.so ./my-leaky-program

Later inspect the output file:

mtrace ./my-leaky-program out.txt

And you will get something like:

Memory not freed:
-----------------
           Address     Size     Caller
0x0000000001bda460     0x96  at /tmp/test/src/test.c:7
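
The mtrace script lists each unfreed block separately. To group by call site instead, one option is to read the raw MALLOC_TRACE file yourself. The sketch below parses a single log line. It assumes the line layout glibc's mtrace() has used as far as I know it ("@ file:[caller] + ptr size" for malloc, "@ file:[caller] - ptr" for free), so check it against your own out.txt first. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One event from the raw trace file. */
struct trace_event {
    char          op;      /* '+' malloc, '-' free, '<' and '>' realloc old/new */
    unsigned long caller;  /* return address found inside the [...] brackets */
    unsigned long ptr;     /* address of the block */
    unsigned long size;    /* block size; 0 when the line carries none */
};

/* Parse one line of the trace file.
   Returns 1 for an event line, 0 for anything else (e.g. "= Start"). */
static int parse_trace_line(const char *line, struct trace_event *ev)
{
    assert(line != NULL && ev != NULL);
    if (line[0] != '@')
        return 0;
    /* The caller address sits in the last [...] pair on the line. */
    const char *bracket = strrchr(line, '[');
    if (bracket == NULL)
        return 0;
    ev->size = 0;
    /* %lx accepts the leading "0x" that the log prints. */
    int n = sscanf(bracket, "[%lx] %c %lx %lx",
                   &ev->caller, &ev->op, &ev->ptr, &ev->size);
    return n >= 3;
}
```

Feeding every '+' event into a table keyed by caller, and removing entries on the matching '-' event, gives per-call-site counts of outstanding blocks. The caller here is a single return address, not a full stack.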

Of course, feel free to write your own malloc hooks that dump the entire stack by calling backtrace(), if you think that's going to help.
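
As a rough sketch of the backtrace() part only: the helper below captures the current stack and writes it to a file descriptor. The name log_alloc_site and the frame limit are hypothetical. Two caveats: glibc removed the __malloc_hook variables in version 2.34, so on current systems this would have to be called from an LD_PRELOAD wrapper around malloc instead, and the first call to backtrace() may itself allocate (it loads libgcc dynamically), so any such wrapper needs a reentrancy guard, which is not shown here.

```c
#include <execinfo.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_FRAMES 16   /* assumed depth limit for captured stacks */

/* Write the size of an allocation and the stack that requested it to fd.
   Returns the number of frames captured, or -1 if the header write fails. */
static int log_alloc_site(size_t size, int fd)
{
    void *frames[MAX_FRAMES];
    char header[64];

    int n = backtrace(frames, MAX_FRAMES);

    /* Format into a stack buffer and use write() to stay away from stdio
       streams while inside an allocation path. */
    int len = snprintf(header, sizeof header, "alloc of %zu bytes from:\n", size);
    if (len > 0 && write(fd, header, (size_t)len) < 0)
        return -1;

    /* backtrace_symbols_fd() is used instead of backtrace_symbols()
       because the latter returns a malloc'ed array. */
    backtrace_symbols_fd(frames, n, fd);
    return n;
}
```

On a binary without exported symbols the output is mostly raw addresses, which can be resolved later against separately kept debug info.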

Line numbers and/or function names will be obtainable if you kept debug info for the binary somewhere (e.g. the binary has some debug info built in, or you did objcopy --only-keep-debug my-leaky-program my-leaky-program.debug).


Also, you could try Boehm's GC; it works as a leak detector too:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/leak.html