PyPy: What is all the buzz about?

sub picture sub · Jun 4, 2010 · Viewed 7.2k times · Source

Note: The title is deliberately provocative (to make you click on it and want to close-vote the question) and I don't want to look preoccupied.

I've been reading and hearing more and more about PyPy. It's like a linear graph.

  • Why is PyPy so special? As far as I know implementations of dynamic languages written in the languages itself aren't such a rare thing, or am I not getting something?

  • Some people even call PyPy "the future" [of python], or see some sort of deep potential in this implementation. What exactly is the meaning of this?

Answer

user137673 picture user137673 · Jul 24, 2010

Good thing to be aware when talking about the PyPy project is that it aims to actually provide two deliverables: first is JIT compiler generator. Yes, generator, meaning that they are implementing a framework for writing implementations of highly dynamic programming languages, such as Python. The second one is the actual test of this framework, and is the PyPy Python interpreter implementation.

Now, there are multiple answers why PyPy is so special: the project development is running from 2004, started as a research project rather than from a company, reimplements Python in Python, implements a JIT compiler in Python, and can translate RPython (Python code with some limitations for the framework to be able to translate that code to C) to compiled binary.

The current version of PyPy is 99% compatible with CPython version 2.5, and can run Django, Twisted and many other Python programs. There used to be a limitation of not being able to run existing CPython C extensions, but that is also being addressed with cpyext module in PyPy. C API compatibility is possible and to some extent already implemented. JIT is also very real, see this pystone comparison.

With CPython:

Python 2.5.5 (r255:77872, Apr 21 2010, 08:44:16) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from test import pystone
>>> pystone.main(1000000)
Pystone(1.1) time for 1000000 passes = 12.28
This machine benchmarks at 81433.2 pystones/second

With PyPy:

Python 2.5.2 (75632, Jun 28 2010, 14:03:25)
[PyPy 1.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
And now for something completely different: ``A radioactive cat has 18
half-lives.''
>>>> from test import pystone
>>>> pystone.main(1000000)
Pystone(1.1) time for 1000000 passes = 1.50009
This machine benchmarks at 666625 pystones/second

So you can get a nearly 10x speedup just by using PyPy on some calculations!

So, as PyPy project is slowly maturing and offering some advantages, it is attracting more interest from people trying to address speed issues in their code. An alternative to PyPy is unladden swallow (a Google project) which aims to speed up CPython implementation by using LLVM's JIT capabilities, but progress on unladden swallow was slowed because the developer needed to deal with bugs in LLVM.

So, to sum it up, I guess PyPy is regarded the future of Python because it's separating language specification from VM implementation. Features introduced in, eg. stackless Python, could then be implemented in PyPy with very little extra effort, because it's just altering language specification and you keep the shared code the same. Less code, less bugs, less merging, less effort.

And by writing, for example, a new bash shell implementation in RPython, you could get a JIT compiler for free and speed up many linux shell scripts without actually learning any heavy JIT knowledge.