LLVM vs. C-- ; how can LLVM fundamentally not be better for Haskell than C--?

dr.addn · May 3, 2009

I've been excited about LLVM being low-level enough to model any system, and I saw it as promising that Apple was adopting it. Then again, Apple doesn't specifically support Haskell.

And some think that Haskell would be better off with C--:

That LLVM'ers haven't solved the problem of zero-overhead garbage collection isn't too surprising. Solving this while staying agnostic of the data model is an open question in computer science.

-- LHC won't be using LLVM.

Answer

Edward Z. Yang · Apr 17, 2011

Having worked a bit with the new code generation backend which manipulates C--, I can say there are a number of reasons why C-- can be better than LLVM, and also why they’re not really at all the same thing.

  1. C-- operates at a higher level of abstraction than LLVM; for example, we can generate code in C-- where the stack pointer is entirely implicit, and only manifest it later during the compilation process. This makes applying certain types of optimizations much easier, because the higher-level representation allows for more code motion with fewer invariants.

  2. While we’re actively looking to fix this, LLVM suffers from the same problem that the via-C backend suffered from: it requires us to create proc points. What are proc points? Essentially, because Haskell does not use the classic call/ret calling convention, whenever we make the moral equivalent of a subprocedure call, we need to push a continuation onto the stack and then jump to the subprocedure. This continuation is usually a local label, but LLVM requires it to be an actual procedure, so we need to break functions into smaller pieces (each piece being called a proc point). This is bad news for optimizations, which work at the procedure level.

  3. C-- and LLVM take different approaches to dataflow optimization. LLVM uses traditional SSA style with phi-nodes; C-- uses a cool framework called Hoopl which doesn’t require you to maintain the SSA invariant. I can confirm: programming optimizations in Hoopl is a lot of fun, though certain types of optimizations (inlining of variables that are used only once comes to mind) are not exactly the most natural in this dataflow setting.
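To make the proc-point issue from point 2 a bit more concrete, here is a toy sketch in Python of the "push a continuation, then jump" convention. It is only an illustration and not GHC's actual code: the trampoline `run`, the `HALT` marker, and the functions `double`, `double_plus_one` and `double_plus_one_k` are all made-up names. In this model every jump target has to be a top-level Python function, which mirrors the constraint described above, so the single logical function "double x, then add one" ends up split into two pieces.

```python
# Toy model only: an explicit stack plus a trampoline that performs jumps.
# Every name here is hypothetical; none of this is taken from GHC or LLVM.

HALT = object()  # sentinel continuation meaning "stop the machine"


def run(entry, arg):
    """Jump from code block to code block until control reaches HALT."""
    stack = [HALT]  # the outermost continuation
    target, value = entry, arg
    while target is not HALT:
        # Each block returns the next jump target and the value passed to it.
        target, value = target(stack, value)
    return value


def double(stack, x):
    # "Returning" means popping the continuation and jumping to it;
    # there is no call/ret pair.
    k = stack.pop()
    return k, 2 * x


def double_plus_one(stack, x):
    # First piece of the logical function: push the continuation for the
    # rest of the work, then jump to the callee.
    stack.append(double_plus_one_k)
    return double, x


def double_plus_one_k(stack, y):
    # Second piece. Conceptually this is just a local label inside
    # double_plus_one, but because jump targets must be real procedures
    # in this model, it has to live as a separate one (a "proc point").
    k = stack.pop()
    return k, y + 1


if __name__ == "__main__":
    print(run(double_plus_one, 5))  # 11
```

An optimizer that only looks at one procedure at a time sees `double_plus_one` and `double_plus_one_k` as unrelated, which is roughly why splitting at proc points hurts procedure-level optimization.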