How to write Python 2.x code that is as compatible with Python 3.x as possible?

Tadeck · Dec 14, 2011 · Viewed 7.2k times

There are many ways to use Python 3.x features in Python 2.x, so that Python 2.x scripts can be easily converted to Python 3.x in the future. One example is replacing the print statement with the print() function:

>>> from __future__ import print_function

Is there a list or resource that gives some ideas on how to make Python 2.x code as close to Python 3.x as possible?

Could you give examples of other useful imports or definitions that can make Python 2.x look and behave more like Python 3.x?

Let's assume we have the latest Python 2.x (2.7.2 at the moment, I believe) at our disposal.

Answer

dstromberg · Dec 14, 2011

I'm putting the finishing touches on an approximately 5000-line deduplicating backup program (http://stromberg.dnsalias.org/~strombrg/backshift/) that runs on CPython 2.[567], CPython 3.[0123] (3.3 is still alpha 0), PyPy 1.7 and Jython trunk. I also tried IronPython, but it was a pretty different thing - it had no standard library, so no backshift love. Oh, and it can use Cython for its innermost loop, or psyco - but PyPy is faster than either, especially on 32-bit systems.

Anyway, I found that to write code that runs equally well on 2.x and 3.x all I needed to do was:

1) print(variable) works the same on both 2.x and 3.x. print(variable1, variable2) does not: 2.x treats the parentheses as a tuple and prints the tuple. To 2.x, print(variable) says "evaluate this parenthesized expression, and print the single result using the print statement". To 3.x, print(variable) says "call the print function on this single result". So print('abc %d %d' % (1, 2)) works fine in both, because it's a single-valued result, and both grok the % operator for string formatting.
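
As a minimal sketch of item 1, the snippet below builds the whole line as one string before printing, so print receives a single value on either version. The helper name format_pair is only illustrative and does not come from the backup program.

```python
def format_pair(first, second):
    """Build the complete output line as a single string."""
    return 'abc %d %d' % (first, second)

# One parenthesized value: 2.x prints the expression, 3.x calls print() with it.
print(format_pair(1, 2))

# By contrast, print(1, 2) shows "(1, 2)" on 2.x (a tuple) and "1 2" on 3.x.
```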

2) Avoid octal constants. Instead of writing 0755, write (7*64 + 5*8 + 5). The 0755 form is a syntax error on 3.x, and the 0o755 form is not accepted before 2.6.
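
A short sketch of item 2 follows. The constant names are made up for illustration; int('755', 8) is offered as one possible alternative spelling, not something the original answer used.

```python
# 0755 written without an octal literal: 7*8**2 + 5*8 + 5
MODE_ARITHMETIC = 7 * 64 + 5 * 8 + 5

# Alternative: parse the octal digits as base 8 once, at import time.
MODE_PARSED = int('755', 8)

# Both spellings give the same integer on 2.x and 3.x.
print(MODE_ARITHMETIC == MODE_PARSED)
```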

3) To do binary I/O in either, I used my bufsock module (http://stromberg.dnsalias.org/~strombrg/bufsock.html). I'd os.open a file, and wrap it with bufsock (or use the rawio class in the module). On 2.x, this would return the bytes as an 8-bit character string (str). On 3.x, this would return a bytes object, which acts much like a list of small integers. Then I'd just pass around one or the other, testing with "isinstance(foo, str)" as needed to distinguish between the two. I did this because to a backup program, bytes are bytes - I didn't want to mess around with encodings fouling up saving data reliably, and not all encodings round trip well.
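
The idea in item 3 can be sketched without bufsock, using only os.open and os.read from the standard library. This is an assumption-laden stand-in, not the bufsock API: the function names read_all_bytes and first_byte_value are hypothetical.

```python
import os

def read_all_bytes(path, chunk_size=65536):
    """Read a whole file through os.open/os.read, with no text decoding."""
    # O_BINARY exists only on Windows; elsewhere the getattr default of 0 is used.
    flags = os.O_RDONLY | getattr(os, 'O_BINARY', 0)
    fd = os.open(path, flags)
    try:
        data = os.read(fd, chunk_size)
        chunk = data
        while chunk:
            chunk = os.read(fd, chunk_size)
            data += chunk
    finally:
        os.close(fd)
    return data

def first_byte_value(data):
    """Return the first byte as an integer on either major version."""
    if isinstance(data, str):
        return ord(data[0])  # 2.x: os.read gave a str of 8-bit characters
    return data[0]           # 3.x: indexing a bytes object yields an int
```

The isinstance(data, str) test is the same check the item describes: it is true for byte strings on 2.x and false for bytes objects on 3.x.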

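4) When handling exceptions, avoid the "as" keyword (it isn't accepted before 2.6, and the older "except X, e" form isn't accepted on 3.x). Instead, fetch the exception with sys.exc_info(), for example: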

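  import errno
  import sys

  try:
     self.update_timestamp()
  except (OSError, IOError):
     # sys.exc_info() returns (type, value, traceback); keep only the value
     dummy, utime_extra, dummy = sys.exc_info()
     if utime_extra.errno == errno.ENOENT:
        pass  # the file does not exist; handle that case here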
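
A self-contained variant of the fragment above, as a sketch: the function remove_if_present is hypothetical and simply gives the sys.exc_info() pattern something concrete to wrap.

```python
import errno
import os
import sys

def remove_if_present(path):
    """Delete path; return False instead of failing if it does not exist."""
    try:
        os.remove(path)
    except (OSError, IOError):
        # No "as" clause: pull the exception instance out of sys.exc_info().
        dummy, exc_value, dummy = sys.exc_info()
        if exc_value.errno == errno.ENOENT:
            return False
        raise  # any other error is re-raised unchanged
    return True
```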

5) A bunch of modules were renamed in the transition from 2.x to 3.x. So try importing either one into an otherwise-empty module, with something like:

try:
   from anydbm import *
except ImportError:
   from dbm import *

...this would appear in a module by itself, with a name such as adbm.py. Then any time I needed a key-value store, I'd import adbm instead of importing the 2.x or 3.x module directly. Then I'd pylint everything but that stubby module, adbm.py, and things like it that pylint disliked. The idea was to pylint everything possible, with each exception to the "everything's gotta pylint" rule confined to a tiny module all by itself, one exception per module.
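
The same try/except ImportError pattern applies to other standard-library modules that were renamed, such as Queue (queue on 3.x) and ConfigParser (configparser on 3.x). The sketch below uses "import ... as" in a single file for brevity; the aliases queue_mod and configparser_mod are made up, and in the layout described above each pair would live in its own small module.

```python
try:
    import Queue as queue_mod                    # 2.x name
except ImportError:
    import queue as queue_mod                    # 3.x name

try:
    import ConfigParser as configparser_mod      # 2.x name
except ImportError:
    import configparser as configparser_mod      # 3.x name

# Callers use the alias and never mention the version-specific name.
work = queue_mod.Queue()
work.put('job')
print(work.get())
```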

6) It helps a lot to set up automatic unit tests and system tests that run on 2.x and 3.x, and then test frequently on at least one 2.x interpreter as well as at least one 3.x interpreter. I also run pylint against my code often, albeit only a pylint that checks for 2.5.x compliance, since I started the project before pylint got 3.x support.

7) I set up a small "python2x3" module that has a few constants and callables to make life easier: http://stromberg.dnsalias.org/svn/python2x3/trunk/python2x3.py

8) b'' literals don't work in 2.5, though they sort of work in 2.[67]. Instead of trying to preprocess or something, I set up a constants_mod.py that had lots of things that would normally be b'' literals in 3.x, and converted them from a simple string to whatever the "bytes" type is for 2.x or 3.x. So they're converted once on module import, not over and over at runtime. If you're targeting 2.[67] and up, there's perhaps a better way, but when I started, the PyPy project was only compatible with 2.5, and Jython still is.
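
One way such a constants module might look is sketched below. It is not the contents of the actual constants_mod.py: the helper string_to_binary and the constant names are assumptions chosen for illustration.

```python
import sys

PY3 = sys.version_info[0] >= 3

def string_to_binary(text):
    """Convert a plain str literal to this interpreter's byte-string type."""
    if PY3:
        return text.encode('latin-1')  # 3.x: str -> bytes, one byte per character
    return text                        # 2.x: str already is the byte-string type

# Converted once, when the module is imported, then reused everywhere.
NEWLINE = string_to_binary('\n')
NULL_BYTE = string_to_binary('\0')
```

Other modules would then import NEWLINE and NULL_BYTE rather than writing b'' literals, which keeps the source parseable on 2.5.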

9) In 2.x, long integers have an L suffix. In 3.x, all integers are long. So I just went with avoiding long integer constants as much as possible; 2.x will promote an integer to long as necessary, so this seems to work out fine for most things.
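
A small sketch of item 9: large values are written as expressions instead of suffixed literals. The constant names are illustrative only.

```python
# The 2.x-only spelling would be 1099511627776L; this form needs no suffix.
ONE_TEBIBYTE = 1024 ** 4

# Arithmetic that outgrows a machine word is promoted to long automatically on 2.x.
SQUARED = ONE_TEBIBYTE * ONE_TEBIBYTE

print(ONE_TEBIBYTE)
```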

10) It helps a LOT to have a bunch of Python interpreters around to test with. I built 2.[567] and 3.[0123] and stashed them in /usr/local/cpython-x.y/ for easy testing. I also put some PyPy and Jython builds in /usr/local, again for easy testing. Having a script to automate the CPython builds was rather valuable.
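
With interpreters laid out that way, a test run across all of them could be driven by something like the sketch below. The bin/python path inside each /usr/local/cpython-x.y/ directory is an assumption (3.x installs often name the binary python3), and the function names are hypothetical.

```python
import os
import subprocess

def interpreter_paths(versions, prefix='/usr/local'):
    """Build candidate paths such as /usr/local/cpython-2.7/bin/python."""
    # Assumed layout: <prefix>/cpython-<version>/bin/python
    return ['%s/cpython-%s/bin/python' % (prefix, version) for version in versions]

def run_under_each(interpreters, script):
    """Run script under every interpreter that exists; return {path: exit status}."""
    results = {}
    for interpreter in interpreters:
        if not os.path.exists(interpreter):
            continue  # skip versions that are not installed on this machine
        results[interpreter] = subprocess.call([interpreter, script])
    return results
```

A call such as run_under_each(interpreter_paths(['2.5', '2.6', '2.7', '3.2']), 'run_tests.py') would then report one exit status per installed interpreter, where run_tests.py stands in for whatever script starts the test suite.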

I believe these were all the contortions I required to get a highly portable Python codebase in a nontrivial project. The one big omission from the list I've written above is that I'm not trying to use unicode objects - that's something someone else is probably better qualified to comment on.

HTH