I've been meaning to write this down for a while, but the wider release of PEP404 finally spurred me into getting it done. I spent a bunch of time porting a medium sized (~2.5kLoC) Python library to Python 3, using two different approaches, and there's a few tips that might have saved me time if I'd known them in advance.
2to3
For the first pass (2fecd513a4..cc08d9b081), I followed the recommended method of running 2to3, then fixed up the Python 2 source until the Py3k generated version passed all of the unit tests.
This was mostly straightforward, but tips that might have saved me time in this process are:
- Understanding the old style versus new style import locations would have helped. But then I still don't really understand the wrinkles here.
[2fecd513a4, e92f459593] - Get in the habit of using
self.assertEqual
rather thanself.assertEquals
in unit tests. [c5688b8003] - Doctests don't work well with 2to3. It's possible to get around this by contorting the results, but that rather spoils the point of doctests being simple and clear.
[a4635065da] - I never did figure out a good way to name/mark packages for the Cheese
Shop where you've got a Python 2.x version and a generated Py3k version.
[a39f4239aa]
Dual Source
The original Python 2.x code and the generated Py3k code actually looked pretty similar, so the second pass (e334775b77..c90d133bc0) converted the codebase to a single set of source that could be used in both Python 2.x and Py3k.
This was rather trickier, and involved a few contortions. However, note that several of the contortions are because the minimum Python version I wanted to target is 2.5 – if the target had been >= 2.6, there's more to from __future__ import
.
- Unicode string literals are a pain.
u"a string"
is illegal in Py3k, but"a string"
isn't Unicode in Python 2.x. In the meanwhile, there's au("string that can include \u0101")
function (which might be slow, so the code is tuned to avoid using it at runtime).
(Would be fixed for a target of >= 2.6 withfrom __future__ import unicode_literals
.)
[e334775b77, 1897a0a30c, a49b35ead4, 59f4e2fef0,5e6f1e2447, e019a990aa, b4e6759e67, 9246f92746, b4cb1e9246, 3ff89a806c, 319c33ed67] - Needed a
prnt
function to be used instead ofprint
.
(Would be fixed for a target of >= 2.6 withfrom __future__ import print_function
.)
[c15fd9cb8d, d32f784cb4]
- Needed to use
sys.exc_info
to portably get at a caught exception.
(Would be fixed for a target of >= 2.6 becauseas
is included there.)
[3d8ddca92a] - There's no
sys.maxint
any more, and the suggested alternative
(sys.maxsize
) doesn't work on Python 2.5. So I just went for 65535.
(Would be fixed for a target of >= 2.6 assys.maxsize
is included there.)
[0964219b25] - No
L
suffix for literal longs any more, so need a wrapper to forceint
s tolong
s.
[649b698b43] - Because the library includes auto-generated Python code, the generation tools needed to allow for Python 2.x/Py3k compatibility too. The new
rpr
function (sticking with disemvowelling as a way of naming things) converts generated code to use theu()
utility.
[3ff89a806c]
The net result of all of this is a small module that acts a portability layer.