I've written a Brainfuck implementation (C++) that works like this:
This is pretty fast, but the bottleneck is now the VM. It's written in C++: it reads a token, executes the matching action (there aren't many, if you know Brainfuck), and so on.
What I want to do is strip out the VM and generate native machine code on the fly (so basically, a JIT compiler). This could easily be a 20x speedup.
This would mean step 3 gets replaced by a JIT compiler, and step 4 by executing the generated machine code.
I don't really know where to start, so I have a few questions:
Generated machine code is just jmp-ed to, or call-ed like a usual function. Sometimes you also need to disable the no-execute flag (NX bit) on the memory containing the generated code. On Linux, this is done with mprotect(addr, size, PROT_READ | PROT_WRITE | PROT_EXEC).
On Windows, the NX feature is called DEP (Data Execution Prevention).
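As a rough illustration of the above, here is a minimal sketch for Linux, not a definitive implementation. It assumes GCC or Clang on x86-64 or AArch64. The function name run_generated_code and the hand-assembled "return 42" bytes are made up for this example, and __builtin___clear_cache is a compiler builtin, not standard C++. The sketch maps the memory writable first and only then switches it to read + execute, instead of asking for all three permissions at once, because some hardened systems refuse pages that are writable and executable at the same time.

```cpp
#include <sys/mman.h>   // mmap, mprotect, munmap (POSIX)
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies a tiny hand-assembled function that returns 42
// into fresh memory, makes that memory executable, calls it, and returns the
// result. Returns -1 if the memory could not be set up.
int run_generated_code() {
#if defined(__x86_64__)
    const std::uint8_t code[] = {
        0xB8, 0x2A, 0x00, 0x00, 0x00,  // mov eax, 42
        0xC3                           // ret
    };
#elif defined(__aarch64__)
    const std::uint8_t code[] = {
        0x40, 0x05, 0x80, 0x52,        // mov w0, #42
        0xC0, 0x03, 0x5F, 0xD6         // ret
    };
#else
#error "this sketch only covers x86-64 and AArch64"
#endif

    const std::size_t size = 4096;  // more than enough for a few bytes

    // 1. Get writable (but not yet executable) memory.
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    // 2. "Generate" the code: here we just copy the prepared bytes.
    std::memcpy(mem, code, sizeof code);

    // 3. Drop write permission and allow execution (this is the NX part).
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return -1;
    }

    // 4. Make sure the CPU sees the new instructions (matters on ARM;
    //    effectively a no-op on x86).
    char* begin = static_cast<char*>(mem);
    __builtin___clear_cache(begin, begin + sizeof code);

    // 5. Call the generated code like a usual function.
    auto fn = reinterpret_cast<int (*)()>(mem);
    int result = fn();

    munmap(mem, size);
    return result;
}
```

A real BF JIT would append the bytes for each BF instruction to the buffer in step 2 instead of copying a fixed array. On Windows, VirtualAlloc and VirtualProtect play the roles of mmap and mprotect.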
There are some, e.g. GNU Lightning (universal): http://www.gnu.org/software/lightning/ and Nanojit: https://developer.mozilla.org/En/Nanojit, which is used in Firefox's JavaScript JIT engines. A more powerful and modern option is LLVM: you just need to translate the BF code into LLVM IR, and then LLVM can do the optimisations and code generation for many platforms, or run the LLVM IR in its interpreter (virtual machine) with JIT capabilities. There is a post about BF and LLVM with a complete LLVM JIT compiler for BF: http://www.remcobloemen.nl/2010/02/brainfuck-using-llvm/
Another BF + LLVM compiler is here, in the LLVM SVN repository: https://llvm.org/svn/llvm-project/llvm/trunk/examples/BrainF/BrainF.cpp