If you write software using Python and your code has third-party dependencies, you've probably heard of "wheels" before.
Wheels are the current standard for distributing Python packages and are intended to replace eggs. They are part of what makes the magic happen when you pip install something.
Background
Let's say you want to install Flask to build a web application. One of the steps fairly early on in your development cycle is going to be pip install Flask.
When you run pip install Flask, pip queries the Python Package Index (more commonly known as PyPI) and fetches the Flask package to install it for you locally.
The file format that pip downloads is what we're going to talk about in this post. Historically, this has been eggs (because you know, snakes lay eggs). These days though, this format tends to be either a source distribution (which is basically the raw source code of the package) or a "wheel".
If you'd like to get a deeper understanding of what wheels actually look like at the file level, I would recommend reading PEP 427. In this post though, we'll focus on how to build wheels, specifically binary wheels on Linux, macOS, and Windows for Python packages that contain C extensions.
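To make the "file level" a bit more concrete: PEP 427 defines the wheel filename as distribution, version, an optional build tag, Python tag, ABI tag, and platform tag, joined by hyphens. The following is a minimal sketch that splits a filename into those parts. The helper name and the example filenames are made up for illustration; for real projects the packaging library is the better tool.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into the components PEP 427 defines."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # The optional build tag is absent.
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    print(parse_wheel_filename("example_pkg-1.0.0-cp38-cp38-win_amd64.whl"))
    print(parse_wheel_filename("example_pkg-1.0.0-py3-none-any.whl"))
```

A pure-Python wheel carries the platform tag "any", while a wheel with a compiled extension names a specific platform. That difference is what the rest of this post deals with.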
The Problem
I maintain a package called streaming-form-data which allows you to parse multipart/form-data encoded byte chunks. Due to the nature of this package, the performance-critical parts were written using Cython. So whenever I wanted to create a new release on PyPI, I would compile the Cython code to a C extension and include that in the final distribution.
Long story short, this package contains a C extension.
So far, I had just been uploading source distributions to PyPI. This meant that the end-user was responsible for ensuring that they had a working C compiler (something like GCC) on their system. It mostly worked, until someone recently reported that the package was a hassle to install on Windows because the compiler they were using had some issues with the generated C code (or the other way around, one could argue).
At that point I thought about looking into generating platform-specific binary wheels for this package and uploading those to PyPI alongside the source distribution. This meant generating wheels for Linux, macOS, and Windows which users could install directly on their machines without having to compile anything. The advantage for users was huge, in that they didn't have to worry about system-level dependencies at all.
Building platform wheels
This is where things got slightly tricky. To generate a wheel for a given platform, you need to have access to that platform. I run Linux on my personal laptop, which meant I could only generate binary wheels that would work on Linux boxes but not on macOS or Windows. In theory, it might have been possible to generate wheels for the other platforms using virtual machines. But that was a huge "might".
Luckily, I discovered that GitHub Actions actually supports macOS and Windows runners, which made the whole thing so much more convenient.
I was already using GitHub Actions for continuous testing. So building platform wheels was a matter of adding another workflow to the CI setup.
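To see why a binary wheel is tied to the machine that built it, it can help to print the tags your own interpreter would report. This is only a rough sketch using the standard library: the helper names are made up, the Python tag assumes CPython, and the real tag computation used by pip and wheel (in the packaging library) handles more cases than this.

```python
import sys
import sysconfig


def current_platform_tag() -> str:
    """Approximate the wheel platform tag for the running interpreter."""
    # sysconfig.get_platform() returns strings such as "linux-x86_64" or
    # "win-amd64"; wheel filenames use "_" in place of "-" and ".".
    return sysconfig.get_platform().replace("-", "_").replace(".", "_")


def current_python_tag() -> str:
    """Return a CPython-style tag such as "cp38" (assumes CPython)."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


if __name__ == "__main__":
    print(current_python_tag(), current_platform_tag())
```

Running this on a Linux laptop and on a Windows or macOS runner gives different platform tags, which is exactly why one machine is not enough to cover all three.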
Roadblock #2
My first attempt was to build the platform wheels by using the wheel package. If you have wheel installed, you'll have a bdist_wheel command available under python setup.py. Going that route, the Actions workflow definition looked something like the following:
build_wheels:
  runs-on: ${{ matrix.os }}
  strategy:
    matrix:
      os: [ubuntu-latest, macos-latest, windows-latest]
  steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-python@v2
      with:
        python-version: 3.8
    - name: Setup pip
      run: |
        python -m pip install --upgrade pip
        python -m pip install wheel
    - name: Build wheel
      run: python setup.py bdist_wheel
    - uses: actions/upload-artifact@v2
      with:
        path: ./dist/*.whl
This would've been it, except that it only worked for building wheels on macOS and Windows, but failed on Linux. The reason was that building binary wheels on Linux is further complicated by the fact that there are hundreds (if not thousands) of Linux distributions in existence, each treating system packages differently.
There's the manylinux project from the PyPA that aims to make this easier. But running it directly still feels (to me) like a bit of a hassle.
On top of this, a complete setup would have to build wheels for multiple CPU architectures and multiple Python versions on each operating system. The solution described above would've required much more work to be "production ready".
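One visible symptom of the Linux problem is the platform tag in the filename. A wheel built directly on a Linux machine gets a plain linux_* tag (for example linux_x86_64), and PyPI does not accept uploads with that tag; it expects a manylinux tag instead. I can't say for certain this is the only thing that can go wrong, but it is easy to check for. The function name and filenames below are hypothetical.

```python
def is_plain_linux_wheel(filename: str) -> bool:
    """Return True if the wheel carries a bare linux_* platform tag."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename!r}")
    # The platform tag is the last hyphen-separated field of the filename.
    platform_tag = filename[: -len(".whl")].rsplit("-", 1)[-1]
    # A compressed tag set may list several platforms separated by ".".
    return any(tag.startswith("linux_") for tag in platform_tag.split("."))


if __name__ == "__main__":
    # Hypothetical filenames, for illustration only.
    for name in (
        "example_pkg-1.0.0-cp38-cp38-linux_x86_64.whl",
        "example_pkg-1.0.0-cp38-cp38-manylinux1_x86_64.whl",
    ):
        print(name, is_plain_linux_wheel(name))
```

A check like this could run over the dist/ directory before uploading, so a non-portable Linux wheel is caught early rather than at upload time.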
The Final Solution
Luckily, I came across cibuildwheel.
cibuildwheel is an amazing project that helps with exactly this problem. It's a Python package that's meant to run on your CI server (GitHub Actions in this case) and abstracts away all the details of building binary wheels for multiple platforms and different CPU architectures.
So instead of relying on python setup.py bdist_wheel, you use cibuildwheel to build the wheels. Since this looked really promising, I modified the "Build wheel" section in the previous snippet to look like the following:
build_wheels:
  runs-on: ${{ matrix.os }}
  strategy:
    matrix:
      os: [ubuntu-latest, macos-latest, windows-latest]
  steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-python@v2
      with:
        python-version: 3.8
    - name: Setup pip
      run: |
        python -m pip install --upgrade pip
        python -m pip install cibuildwheel==1.6.4
    - name: Build wheel
      run: python -m cibuildwheel --output-dir dist/
      env:
        CIBW_BUILD: cp36-* cp37-* cp38-*
    - uses: actions/upload-artifact@v2
      with:
        path: ./dist/*.whl
So basically, I just replaced the python setup.py bdist_wheel step with cibuildwheel, and the workflow went through without a hitch. At the end of the pipeline, there were 12 binary wheels available for me to upload to PyPI.
Always love it when software "just works". 😅
The only modification I had to make was specifying a value for the CIBW_BUILD environment variable to restrict the Python versions. For this specific case, I limited the Python versions to 3.6, 3.7, and 3.8. Other than that though, I left everything in cibuildwheel at its default value.
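The CIBW_BUILD value is a space-separated list of glob patterns that are matched against build identifiers such as cp38-win_amd64. The sketch below approximates that selection with the standard library's fnmatch; it is not cibuildwheel's actual code, and the identifier list is only an illustrative sample, not the full set the tool knows about.

```python
from fnmatch import fnmatchcase
from typing import List


def select_builds(cibw_build: str, identifiers: List[str]) -> List[str]:
    """Keep the identifiers that match any space-separated glob pattern."""
    patterns = cibw_build.split()
    return [
        ident
        for ident in identifiers
        if any(fnmatchcase(ident, pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    # Illustrative sample of build identifiers (an assumption, not a full list).
    sample = [
        "cp27-manylinux_x86_64",
        "cp35-win32",
        "cp36-manylinux_x86_64",
        "cp37-macosx_x86_64",
        "cp38-win_amd64",
        "pp36-manylinux_x86_64",
    ]
    print(select_builds("cp36-* cp37-* cp38-*", sample))
    # ['cp36-manylinux_x86_64', 'cp37-macosx_x86_64', 'cp38-win_amd64']
```

Because the patterns match on the whole identifier, the same variable can also narrow builds by platform (something like cp38-manylinux_*), though I only needed to restrict the Python versions here.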
Conclusion
Python is an interpreted language, so as soon as any part of your source code needs compilation, things get slightly tricky if you want others to be able to install your packages easily.
Binary wheels are a huge step in that direction, but the underlying complexity of the problem they're trying to solve means that the user experience of building wheels for different platforms is often not 100% smooth.
IMHO though, cibuildwheel eliminates much of that complexity for you, the packager. The solution I described earlier works pretty well for me, and I would highly recommend using it in your own projects if there's the need.