Uploading binary wheels to PyPI from GitHub Actions

Nov 03, 2020 #devops #programming #python

If you write software using Python and your code has third-party dependencies, you've probably heard of "wheels" before.

Wheels are the current standard for distributing Python packages and are intended to replace eggs. They are part of what makes the magic happen when you pip install something.

Background

Let's say you want to install Flask to build a web application. One of the steps fairly early on in your development cycle is going to be pip install Flask. When you run it, pip queries the Python Package Index (more commonly known as PyPI), fetches the Flask package, and installs it for you locally.

The file format that pip downloads is what we're going to talk about in this post. Historically, this has been eggs (because you know, snakes lay eggs). These days though, this format tends to be either a source distribution (which is basically the raw source code of the package) or a "wheel".

If you'd like to get a deeper understanding of what wheels actually look like at a file-level, I would recommend reading PEP 427. In this post though, we'll focus on how to build wheels. Specifically, how to build binary wheels on Linux, macOS, and Windows for Python packages that contain C extensions.
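As a rough illustration of the naming convention PEP 427 describes, a wheel's filename encodes the distribution name, the version, an optional build tag, and three compatibility tags (Python, ABI, and platform). The sketch below splits a filename into those fields. The helper names (parse_wheel_filename, is_pure_python) and the streaming_form_data version number are made up for illustration; real tooling would more likely rely on the packaging library.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into the fields defined by PEP 427.

    Hypothetical helper for illustration only.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the Python tag
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


def is_pure_python(wheel: WheelName) -> bool:
    """A wheel tagged none-any contains no compiled, platform-specific code."""
    return wheel.abi_tag == "none" and wheel.platform_tag == "any"


if __name__ == "__main__":
    for name in (
        "Flask-1.1.2-py2.py3-none-any.whl",
        # Illustrative filename; the version number is an assumption
        "streaming_form_data-1.0.0-cp38-cp38-manylinux1_x86_64.whl",
    ):
        wheel = parse_wheel_filename(name)
        kind = "pure Python" if is_pure_python(wheel) else "binary"
        print(f"{wheel.distribution}: {kind} wheel for {wheel.platform_tag}")
```

The second filename is the kind of wheel this post is about: the cp38-cp38 tags tie it to CPython 3.8, and the platform tag ties it to a specific operating system and CPU architecture.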

The Problem

I maintain a package called streaming-form-data which allows you to parse multipart/form-data encoded byte chunks. Due to the nature of this package, the performance-critical parts were written using Cython. So whenever I wanted to create a new release on PyPI, I would compile the Cython code to a C extension and include that in the final distribution.

Long story short, this package contains a C extension.

So far, I had just been uploading source distributions to PyPI. This meant that end users were responsible for ensuring that they had a working C compiler (something like GCC) on their system. It mostly worked, until recently, when someone reported that installing the package on Windows was a hassle because the compiler they were using had some issues with the generated C code (or the other way around, one could argue).

At that point I thought about looking into generating platform-specific binary wheels for this package and uploading those to PyPI alongside the source distribution. This meant generating wheels for Linux, macOS, and Windows which users could install directly on their machines without having to compile anything. The advantage for users was huge, in that they didn't have to worry about system-level dependencies at all.

Building platform wheels

This is where things got slightly tricky. To generate a wheel for a given platform, you need to have access to that platform. I run Linux on my personal laptop, which meant I could only generate binary wheels that would work on Linux boxes but not on macOS or Windows. In theory, it might have been possible to generate wheels for the other platforms using virtual machines. But that was a huge "might".

Luckily, I discovered that GitHub Actions actually supports macOS and Windows runners, which made the whole thing so much more convenient.

I was already using GitHub Actions for continuous testing, so building platform wheels was a matter of adding another workflow to the CI setup.

Roadblock #2

My first attempt was to build the platform wheels by using the wheel package. If you have wheel installed, you'll have a bdist_wheel command available under python setup.py. Going that route, the Actions workflow definition looked something like the following:

build_wheels:
  # Run the same job once per operating system in the matrix
  runs-on: ${{ matrix.os }}
  strategy:
    matrix:
      os: [ubuntu-latest, macos-latest, windows-latest]
  steps:
    - uses: actions/checkout@v2

    - uses: actions/setup-python@v2
      with:
        # Quoted so YAML treats the version as a string, not a float
        python-version: "3.8"

    - name: Setup pip
      run: |
        python -m pip install --upgrade pip
        python -m pip install wheel

    # Writes the platform-specific wheel to ./dist
    - name: Build wheel
      run: python setup.py bdist_wheel

    - uses: actions/upload-artifact@v2
      with:
        path: ./dist/*.whl

This would've been it, except that it only worked for building wheels on macOS and Windows, but failed on Linux. The reason was that building binary wheels on Linux is further complicated by the fact that there are hundreds (if not thousands) of Linux distributions in existence, each treating system packages differently.

There's the manylinux project from the PyPA that aims to make this easier. But running it directly still feels (to me) like a bit of a hassle.
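To make the Linux situation a little more concrete: the platform tag in a wheel's filename says where the wheel is expected to run. A wheel built with a plain bdist_wheel on Linux typically carries a generic tag such as linux_x86_64, while the manylinux project defines tags such as manylinux1_x86_64 or manylinux2014_x86_64 that promise compatibility across many distributions. The sketch below only classifies tags by their prefix; the function name and the category labels are my own, and it does not check whether a wheel actually satisfies the manylinux policy.

```python
def describe_platform_tag(platform_tag: str) -> str:
    """Roughly classify a wheel platform tag by its prefix.

    Hypothetical helper for illustration; it inspects the tag only,
    not the contents of the wheel.
    """
    if platform_tag == "any":
        return "pure Python, runs anywhere"
    if platform_tag.startswith("manylinux"):
        return "portable Linux wheel (manylinux)"
    if platform_tag.startswith("linux_"):
        return "Linux wheel tied to the build machine"
    if platform_tag.startswith("macosx_"):
        return "macOS wheel"
    if platform_tag in ("win32", "win_amd64"):
        return "Windows wheel"
    return "unrecognized platform tag"


if __name__ == "__main__":
    for tag in ("linux_x86_64", "manylinux1_x86_64", "win_amd64", "any"):
        print(f"{tag}: {describe_platform_tag(tag)}")
```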

On top of this, building binary wheels means accounting for multiple CPU architectures and multiple Python versions on each operating system. The solution described above would've required much more work to be "production ready".

The Final Solution

Luckily, I came across cibuildwheel.

cibuildwheel is an amazing project that helps with exactly this problem. It's a Python package that's meant to run on your CI server (GitHub Actions in this case) and abstracts away all the details of building binary wheels for multiple platforms and CPU architectures.

So instead of relying on python setup.py bdist_wheel, you use cibuildwheel to build the wheels. Since this looked really promising, I modified the "Setup pip" and "Build wheel" steps in the previous snippet to look like the following:

build_wheels:
  runs-on: ${{ matrix.os }}
  strategy:
    matrix:
      os: [ubuntu-latest, macos-latest, windows-latest]
  steps:
    - uses: actions/checkout@v2

    - uses: actions/setup-python@v2
      with:
        # Quoted so YAML treats the version as a string, not a float
        python-version: "3.8"

    - name: Setup pip
      run: |
        python -m pip install --upgrade pip
        python -m pip install cibuildwheel==1.6.4

    - name: Build wheel
      run: python -m cibuildwheel --output-dir dist/
      env:
        # Only build wheels for CPython 3.6, 3.7, and 3.8
        CIBW_BUILD: cp36-* cp37-* cp38-*

    - uses: actions/upload-artifact@v2
      with:
        path: ./dist/*.whl

In other words, I just replaced the python setup.py bdist_wheel step with cibuildwheel, and the workflow went through without a hitch. At the end of the pipeline, there were 12 binary wheels available for me to upload to PyPI.

Always love it when software "just works". 😅

The only modification I had to make was specifying a value for the CIBW_BUILD environment variable to restrict the Python versions. For this specific case, I limited the Python versions to 3.6, 3.7, and 3.8. Other than that, I left all of cibuildwheel's options at their default values.
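To give a feel for how a value like cp36-* cp37-* cp38-* narrows things down, here is a small sketch that applies the same space-separated glob patterns to a handful of build identifiers. The identifiers follow the python_tag-platform_tag shape that cibuildwheel uses, but the list is only a hand-picked sample, and matching with fnmatchcase is an assumption about the behaviour rather than a copy of cibuildwheel's actual code.

```python
from fnmatch import fnmatchcase

# Same value as the CIBW_BUILD environment variable in the workflow
CIBW_BUILD = "cp36-* cp37-* cp38-*"


def is_selected(identifier: str, build_spec: str) -> bool:
    """Return True if the identifier matches any space-separated glob pattern."""
    return any(fnmatchcase(identifier, pattern) for pattern in build_spec.split())


# A hand-picked sample of build identifiers, not the full list
candidates = [
    "cp27-manylinux_x86_64",
    "cp36-manylinux_x86_64",
    "cp37-macosx_x86_64",
    "cp38-win_amd64",
    "cp39-win32",
    "pp36-manylinux_x86_64",
]

selected = [c for c in candidates if is_selected(c, CIBW_BUILD)]

if __name__ == "__main__":
    for identifier in selected:
        print(identifier)
```

With these patterns, the CPython 2.7 and 3.9 builds and the PyPy build in the sample are skipped, leaving only the three CPython versions I wanted.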

Conclusion

Python is an interpreted language, so if any part of your package needs to be compiled, things get slightly tricky when you want others to be able to install it easily.

Binary wheels are a huge step in that direction, but the underlying complexity of the problem they're trying to solve means that the user experience of building wheels for different platforms is often not 100% smooth.

IMHO though, cibuildwheel eliminates much of that complexity for you, the packager. The solution I described earlier works pretty well for me, and I would highly recommend using it in your own projects if you need it.