How does your favorite language handle deep recursion?

jettero picture jettero · Oct 24, 2008 · Viewed 7.2k times · Source

I recently started learning Python and I was rather surprised to find a 1000 deep recursion limit (by default). If you set it high enough, about 30000, it crashes with a segmentation fault just like C. Although, C seems to go quite a lot higher.

(The Python folks are quick to point out that you can always convert recursive functions to iterative ones and that they're always faster. That's 100% true. It's not really what my question is about though.)

I tried the same experiment in Perl and somewhere around 10 million recursions it consumed all of my 4 gigs of ram and I used ^C to stop trying. Clearly Perl doesn't use the C stack, but it does use a ridiculous amount of memory when it recurses -- not terribly shocking considering how much work it has to do to call functions.

I tried in Pike and was completely surprised to get 100,000,000 recursions in about 2 seconds. I have no idea how it did that, but I suspect it flattened the recursion to an iterative process -- it doesn't seem to consume any extra memory while it does it. [Note: Pike does flatten trivial cases, but segfaults on more complicated ones, or so I'm told.]

I used these otherwise useless functions:

int f(int i, int l) { if(i<l) return f(i+1,l); return i; }

sub f { return f($_[0]+1, $_[1]) if $_[0]<$_[1]; return $_[0] };

def f(i,l):
   if i<l:
     return f(i+1,l)
   return i

I'm very curious how other languages (e.g., PHP, Ruby, Java, Lua, Ocaml, Haskell) handle recursion and why they handle it that way. Additionally, please note whether it makes a difference if the function is "tail-recursive" (see comment).

Answer

Steve Jessop picture Steve Jessop · Oct 24, 2008

"The Python folks are quick to point out that you can always convert recursive functions to iterative ones and that they're always faster"

This is true, but if it's really as easy as all that, why doesn't Python do it for me, so that my code can look as simple as possible? (I say this not to slam Python implementers, but because the answer explains the problem).

Recursion optimisations have been present in functional languages since, like, the 14th century or something. Haskell, CAML, Lisp implementations all typically convert at least tail recursive functions to iterations: you basically do this by spotting that it's possible, i.e. that the function can be rearranged such that no local variables other than the return value are used after the recursive call. One trick to make it possible if there's some work done to the recursed return value before return, is to introduce an additional "accumulator" parameter. In simple terms this means the work can effectively be done on the way "down" instead of on the way "up": search around for 'how to make a function tail-recursive' for details.

The actual details of turning a tail recursive function into a loop is basically to jigger with your call convention such you can "perform the call" simply by setting up the parameters and jumping back to the start of the function, without bothering to save all that stuff in scope that you know you won't ever use. In assembler terms, you don't have to preserve caller-saves registers if data-flow analysis tells you they're unused beyond the call, and the same goes for anything on the stack: you don't have to move the stack pointer on a call if you don't mind "your" bit of stack being scribbled over by the next recursion/iteration.

Contrary to how you paraphrased the Python folks, converting a general recursive function to an iteration is not trivial: for example if it's multiply-recursive then in a simple approach you'd still need a stack.

Memoization is a useful technique, though, for arbitrarily recursive functions, that you might like to look up if you're interested in the possible approaches. What it means is that each time a function is evaluated, you stick the result in a cache. To use this to optimize recursion, basically, if your recursive function counts "down", and you memoize it, then you can evaluate it iteratively by adding a loop that counts "up" calculating each value of the function in turn until you reach the target. This uses very little stack space provided that the memo cache is big enough to hold all the values you'll need: for instance if f(n) depends on f(n-1), f(n-2) and f(n-3) you only need space for 3 values in the cache: as you go up you can kick the ladder away. If f(n) depends on f(n-1) and f(n/2), you need lots of space in the cache, but still less than you'd use for stack in an unoptimised recursion.