Why can't tail calls be optimized in JVM-based Lisps?

Mars · Oct 19, 2013

Main question: I view the most significant application of tail call optimization (TCO) as the translation of a recursive call into a loop (when the recursive call is in tail position). More precisely, when compiled to machine language, this usually becomes some series of jumps. Some Common Lisp and Scheme compilers that compile to native code (e.g. SBCL) can identify tail-recursive code and perform this translation. JVM-based Lisps such as Clojure and ABCL have trouble doing this. What is it about the JVM as a machine that prevents this or makes it difficult? I don't get it. The JVM obviously has no problem with loops. It's the compiler that has to figure out how to do TCO, not the machine to which it compiles.

Related question: Clojure can translate seemingly recursive code into a loop: it acts as if it's performing TCO if the programmer replaces the tail call to the function with the special form recur. But if it's possible to get a compiler to identify tail calls--as SBCL and CCL do, for example--then why can't the Clojure compiler figure out that it's supposed to treat a tail call the way it treats recur?

(Sorry--this is undoubtedly a FAQ, and I'm sure that the remarks above show my ignorance, but I was unsuccessful in finding earlier questions.)

Answer

Michał Marczyk · Oct 19, 2013

Real TCO works for arbitrary calls in tail position, not just self calls, so that code like the following does not cause a stack overflow:

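;; Mutually recursive even/odd test; each call to the other fn is in tail position.
(letfn [(e? [x] (or (zero? x) (o? (dec x))))
        (o? [x] (and (not (zero? x)) (e? (dec x))))]  ; base case: 0 is not odd
  (e? 10))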

Clearly you'd need JVM support for this, since programs running on the JVM cannot manipulate the call stack. (Unless you were willing to establish your own calling convention and impose the associated overhead on function calls; Clojure aims to use regular JVM method calls.)
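To make the "own calling convention" route concrete, here is a minimal sketch using clojure.core/trampoline. It is an illustration of the general idea rather than a claim about how any compiler does it. The names tramp-even? and tramp-odd? are hypothetical. Instead of calling each other directly, the functions return a zero-argument fn, and trampoline keeps invoking whatever comes back until the result is not a fn. The stack stays flat, at the cost of allocating a closure per step, which is the kind of overhead referred to above.

```clojure
(declare tramp-odd?)

;; Hypothetical helper: returns true at 0, otherwise a thunk that
;; continues the computation in tramp-odd? when invoked.
(defn tramp-even? [x]
  (if (zero? x)
    true
    #(tramp-odd? (dec x))))

;; Hypothetical helper: returns false at 0, otherwise a thunk that
;; continues the computation in tramp-even? when invoked.
(defn tramp-odd? [x]
  (if (zero? x)
    false
    #(tramp-even? (dec x))))

;; trampoline calls (tramp-even? 1000000), then repeatedly calls the
;; returned fn until a non-fn value comes back.
(trampoline tramp-even? 1000000) ;=> true
```

Note that the caller has to opt in by going through trampoline, and a function that genuinely wants to return a fn as its final value needs to wrap it in something else first.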

As for eliminating self calls in tail position, that's a simpler problem which can be solved as long as the entire function body gets compiled to a single JVM method. That is a limiting promise to make, however. Besides, recur is fairly well liked for its explicitness.
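For comparison, a small sketch of the self-call case. The names count-down and count-down-plain are made up for illustration. With recur, the arguments are rebound and control jumps back to the top of the fn, so no new stack frame is used per iteration. The plain version makes an ordinary method call each time and, with default JVM stack settings, can be expected to throw a StackOverflowError once n is large enough.

```clojure
;; Explicit recur: rebinds n and jumps back to the start of the fn body.
(defn count-down [n]
  (if (zero? n)
    :done
    (recur (dec n))))

;; Ordinary self call in tail position: compiled as a regular call,
;; so each step consumes a stack frame.
(defn count-down-plain [n]
  (if (zero? n)
    :done
    (count-down-plain (dec n))))

(count-down 1000000)    ;=> :done
(count-down-plain 100)  ;=> :done (small n only; large n risks overflow)
```

The compiler also rejects recur when it is not in tail position, which is part of what the explicitness buys you.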