I am trying to start using erlang:trace/3 and the dbg module to trace the behaviour of a live production system without taking the server down.
The documentation is opaque (to put it mildly) and there don't appear to be any useful tutorials online.
What I spent all day trying to do was capture what was happening in a particular function by applying a trace to Module:Function using dbg:c and dbg:p, but with no success at all.
Does anyone have a succinct explanation of how to use trace in a live Erlang system?
The basic steps for tracing function calls, shown here on a non-live node, are:
> dbg:start(). % start dbg
> dbg:tracer(). % start a simple tracer process
> dbg:tp(Module, Function, Arity, []). % specify MFA you are interested in
> dbg:p(all, c). % trace calls (c) of that MFA for all processes.
... trace here
> dbg:stop_clear(). % stop tracer and clear effect of tp and p calls.
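As a concrete instance of the steps above, here is a minimal shell session sketch. The choice of lists:seq/2 as the traced function is only an illustration, and tracing self() (the shell's evaluator process) instead of all keeps the output small.

```erlang
> dbg:tracer().               % start the default tracer, which prints to the shell
> dbg:tp(lists, seq, 2, []).  % trace pattern: calls to the exported lists:seq/2
> dbg:p(self(), c).           % trace calls made by the shell process only
> lists:seq(1, 3).            % a "call lists:seq(1,3)" trace line should be printed
> dbg:stop_clear().           % stop the tracer and clear the pattern and flags
```

If nothing is printed, the usual culprits are a mismatched arity in tp, or using tp where tpl is needed because the function is only called locally.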
You can trace multiple functions at the same time. Add functions by calling tp for each function. If you want to trace non-exported functions, you need to call tpl. To remove functions, call ctp or ctpl in a similar manner. Some general tp calls are:
> dbg:tpl(Module, '_', []). % all calls in Module
> dbg:tpl(Module, Function, '_', []). % all calls to Module:Function with any arity.
> dbg:tpl(Module, Function, Arity, []). % all calls to Module:Function/Arity.
> dbg:tpl(M, F, A, [{'_', [], [{return_trace}]}]). % same as before, but also show return value.
The last argument is a match specification. You can play around with that by using dbg:fun2ms.
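To get a feel for match specifications, a short shell sketch of dbg:fun2ms follows. The argument pattern and the N > 3 guard are arbitrary choices for illustration, as is the use of lists:seq/2.

```erlang
> dbg:fun2ms(fun(_) -> return_trace() end).
[{'_',[],[{return_trace}]}]
> dbg:fun2ms(fun([M,N]) when N > 3 -> return_trace() end).
[{['$1','$2'],[{'>','$2',3}],[{return_trace}]}]
> MS = dbg:fun2ms(fun([M,N]) when N > 3 -> return_trace() end).
> dbg:tpl(lists, seq, 2, MS).  % trace only calls whose second argument is > 3
```

The fun's head matches the argument list of the traced call, so a two-element list pattern fits an arity-2 function. Note that fun2ms needs a literal fun: it works directly in the shell, while in a compiled module you have to include stdlib/include/ms_transform.hrl for the parse transform.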
You can select the processes to trace on with the call to p(). The items are described under erlang:trace. Some calls are:
> dbg:p(all, c). % trace calls to selected functions by all processes
> dbg:p(new, c). % trace calls by processes spawned from now on
> dbg:p(Pid, c). % trace calls by given process
> dbg:p(Pid, [c, m]). % trace calls and messages of a given process
I guess you will never need to directly call erlang:trace, as dbg does pretty much everything for you.
A golden rule for a live node is to generate only as much trace output to the shell as still lets you type in dbg:stop_clear(). :)
I often use a tracer that will auto-stop itself after a number of events. For example:
dbg:tracer(process, {fun (_, 100) -> dbg:stop_clear();                  % limit reached: stop tracing
                         (Msg, N) -> io:format("~p~n", [Msg]), N+1 % print the event, bump the counter
                     end, 0}).                                          % the counter starts at 0
If you are looking for debugging on remote nodes (or multiple nodes), search for pan, eper, inviso or onviso.