PyZMQ, Python2.5, and Python3
This describes the early days of pyzmq development, when we supported Python 2.5 and 3.1. Much of this information is wildly outdated now.
PyZMQ is a fairly light, low-level library, so supporting as many versions as is reasonable is our goal. Currently, we support at least Python 2.5-3.1. Making the changes to the codebase required a few tricks, which are documented here for future reference, either by us or by other developers looking to support several versions of Python.
It is far simpler to support 2.6-3.x than to include 2.5. Many of the significant syntax changes have been backported to 2.6, so just writing new-style code would work in many cases. I will try to note these points as they come up.
Many functions we use, primarily involved in converting between C-buffers and Python
objects, are not available on all supported versions of Python. In order to resolve
missing symbols, we added a header
utils/pyversion_compat.h that defines missing
symbols with macros. Some of these macros alias new names to old functions (e.g.
PyBytes_AsString), so that we can call new-style functions on older versions, and some
simply define the function as an empty exception raiser. The important thing is that the
symbols are defined to prevent compiler warnings and linking errors. Everywhere we use
C-API functions that may not be available in a supported version, at the top of the file
is the code:
    cdef extern from "pyversion_compat.h":
        pass
This ensures that the symbols are defined in the Cython generated C-code. Higher level switching logic exists in the code itself, to prevent actually calling unavailable functions, but the symbols must still be defined.
Bytes and Strings
If you are using Python >= 2.6, to prepare your PyZMQ code for Python3 you should use
the b'message' syntax to ensure all your string literal messages will still be
bytes after you make the upgrade.
The most cumbersome part of PyZMQ compatibility from a user’s perspective is the fact
that, since ØMQ uses C-strings, and would like to do so without copying, we must use the
bytes object, which is backported to 2.6. In order to do this in a
Python-version independent way, we added a small utility that unambiguously defines the
string types bytes, unicode, and basestring. This is important because
str means different things on 2.x and 3.x, bytes is
undefined on 2.5, and both unicode and
basestring are undefined on 3.x.
All typechecking in PyZMQ is done against these types.
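As a rough illustration of the idea, the following sketch picks explicit string types for the running interpreter and typechecks against them. The function name `string_types`, the `_t` suffixed names, and the choice of `(str, bytes)` as the 3.x stand-in for basestring are assumptions made for this example, not PyZMQ's actual utility.

```python
import sys


def string_types():
    """Return (bytes type, unicode type, basestring type) for this interpreter.

    Hypothetical helper: the names are only looked up on the branch that
    runs, so the 2.x-only builtins never raise NameError on 3.x.
    """
    if sys.version_info[0] >= 3:
        # On 3.x, str is the unicode type and bytes is a separate type.
        return bytes, str, (str, bytes)
    # On 2.x, str is the bytes type; unicode and basestring are builtins.
    return str, unicode, basestring  # noqa: F821 (2.x only)


bytes_t, unicode_t, basestring_t = string_types()


def is_sendable(obj):
    """Typecheck against the explicit bytes type, never against bare str."""
    return isinstance(obj, bytes_t)
```

With this in place, `is_sendable(b"message")` is true on every version, while a unicode object is rejected until it has been encoded.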
Where we really noticed the issue of bytes vs. strings coming up for
users was in updating the tests to run on every version. Since the
b'bytes literal' syntax was not backported to 2.5, we must call .encode() on
every string in the test suite.
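A minimal sketch of what that looks like in a test, assuming ASCII-only literals (the variable names are illustrative):

```python
# Calling .encode() on a plain literal yields a bytes object on 3.x and a
# native str (which is the bytes type) on 2.x, without needing the b''
# prefix that Python 2.5 cannot parse.
msg = "message".encode()

# The same applies to every literal the test sends or compares against.
topics = [t.encode() for t in ("alpha", "beta")]
```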
See the Unicode discussion for more information on strings/bytes.
The standard C-API for turning a C-string into a Python string was a set of
functions with the prefix PyString_*. However, with the Unicode changes made in
Python3, this was broken into PyBytes_* for bytes objects and PyUnicode_* for
unicode objects. We changed all our PyString_* code to PyBytes_*, which was
backported to 2.6.
Since Python 2.5 doesn’t support the PyBytes_* functions, we had to alias them to
the PyString_* functions in utils/pyversion_compat.h:
    #define PyBytes_FromStringAndSize PyString_FromStringAndSize
    #define PyBytes_FromString PyString_FromString
    #define PyBytes_AsString PyString_AsString
    #define PyBytes_Size PyString_Size
The layer that is most complicated for developers, but shouldn’t trouble users, is the Python C-Buffer APIs. These are the methods for converting between Python objects and C buffers. The reason it is complicated is that it keeps changing.
There are two buffer interfaces for converting an object to a C-buffer, known as new-style
and old-style. Old-style buffers were introduced long ago, but the new-style is only
backported to 2.6. The old-style buffer interface is not available in 3.x. There is also
an old- and new-style interface for creating Python objects that view C-memory. The
old-style object is called a buffer, and the new-style object is
memoryview. Unlike the new-style buffer interface for objects,
memoryview has only been backported to 2.7. This means that the available
buffer-related functions are not the same in any two versions of Python 2.5, 2.6, 2.7, or
3.1.
We have a utils/buffers.pxd file that defines our buffer conversion functions.
utils/buffers.pxd was adapted from mpi4py, and the
frombuffer() functionality was added. These functions
internally switch based on Python version to call the appropriate C-API functions.
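The real switching happens at the C-API level in Cython, but the same idea can be sketched in pure Python. The helper name `view_of` is hypothetical and is not part of PyZMQ; it simply prefers memoryview where it exists and falls back to the old-style buffer object otherwise.

```python
def view_of(data):
    """Return an object that views the memory of `data` without copying.

    Hypothetical helper for illustration only.
    """
    try:
        # New-style view object: available on 2.7 and 3.x.
        return memoryview(data)
    except NameError:
        # Old-style view object: only reached on 2.5 and 2.6.
        return buffer(data)  # noqa: F821 (2.x only)


# The view shares memory with the original object, so a later change to
# the underlying bytearray is visible through the view.
data = bytearray(b"hello")
view = view_of(data)
data[0] = ord("j")
```

Because no copy is made, the view must not outlive the memory it points at, which is the same constraint the C-level functions have to respect.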
str is not the same type on every supported version of Python. There are two places where
we are required to return native str objects, one of which is
Message.__str__(). In both of these cases, the natural return is actually a
bytes object. In the methods, the native str type is checked, and if the
native str is actually unicode, then we decode the bytes into unicode:
    # ...
    b = natural_result()
    if str is unicode:
        return b.decode()
    else:
        return b
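To make the pattern concrete, here is a self-contained sketch. The class `Frame` is a hypothetical stand-in that only holds bytes, not PyZMQ's Message, and `unicode_t` is an assumed alias for whichever type is unicode on the running interpreter.

```python
import sys

# Assumed alias: str on 3.x, the builtin unicode on 2.x. The conditional
# expression only evaluates the branch that applies.
unicode_t = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


class Frame(object):
    """Hypothetical bytes container used to illustrate __str__."""

    def __init__(self, data):
        self.data = data  # always a bytes object

    def __str__(self):
        b = self.data
        if str is unicode_t:
            # Native str is unicode (3.x): decode before returning.
            return b.decode()
        else:
            # Native str is bytes (2.x): return unchanged.
            return b
```

Either way `str(Frame(...))` returns the interpreter's native str, which is what `__str__` is required to produce.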
This section is only relevant for supporting Python 2.5 and 3.x, not for 2.6-3.x.
The syntax for handling exceptions has changed in Python 3. The old syntax:
    try:
        s.send(msg)
    except zmq.ZMQError, e:
        handle(e)
is no longer valid in Python 3. Instead, the new syntax for this is:
    try:
        s.send(msg)
    except zmq.ZMQError as e:
        handle(e)
This new syntax is backported to Python 2.6, but is invalid on 2.5. For 2.6-3.x compatible code, we could just use the new syntax. However, the only method we found to catch an exception for handling on both 2.5 and 3.1 is to get the exception object inside the exception block:
    try:
        s.send(msg)
    except zmq.ZMQError:
        # sys.exc_info() returns (type, value, traceback); the exception is item 1
        e = sys.exc_info()[1]
        handle(e)
This is certainly not as elegant as either the old or new syntax, but it’s the only way we have found to work everywhere.
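The following runnable sketch shows the version-neutral form end to end. `DemoError` and `failing_send` are hypothetical stand-ins for zmq.ZMQError and a failing s.send(msg), so that the example does not need a socket.

```python
import sys


class DemoError(Exception):
    """Hypothetical stand-in for zmq.ZMQError."""


def failing_send():
    """Hypothetical stand-in for a send call that fails."""
    raise DemoError("send failed")


def send_and_capture():
    """Return the caught exception object, or None if nothing was raised."""
    try:
        failing_send()
    except DemoError:
        # No "as e" or ", e" clause, so this parses on 2.5 and 3.x alike.
        # exc_info() gives (type, value, traceback); item 1 is the exception.
        return sys.exc_info()[1]
    return None


caught = send_and_capture()
```

Note that sys.exc_info() is only meaningful while the exception is being handled, so the lookup has to happen inside the except block.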