A containerized toolchain to validate a seed set of Python wheels and build the full transitive dependency closure as native s390x wheels from source. It is designed for environments where prebuilt manylinux wheels are unavailable for IBM Z (s390x), and you need a reproducible way to assemble a complete, architecture-correct wheelhouse.
This project does not “convert” wheels between architectures. Instead, it verifies your seed wheel(s), inspects declared dependencies, and builds those dependencies from source as s390x wheels. It handles complex packages (notably PyArrow) by compiling required native libraries inside the container.
What You Get
- Validated wheelhouse: Every input wheel is installed in a throwaway target to check basic import/install viability without dependencies.
- Full dependency closure: The script reads `Requires-Dist` metadata and breadth-first builds all direct and transitive dependencies as s390x wheels.
- Native builds: Uses `pip wheel --no-binary :all:` to force source builds for correct s390x artifacts.
- PyArrow support: Compiles Apache Arrow C++ once, then builds PyArrow against it.
- Detailed logs: Per‑package build and validation logs for traceability.
The workflow runs inside a purpose-built Ubuntu container and is orchestrated by `build-wheels.sh`. At a high level:
- The `Dockerfile.s390x` provisions a minimal s390x build image:
  - Tooling: `build-essential`, `gcc`/`g++`/`gfortran`, `cmake`, `ninja`, `autoconf`/`automake`/`libtool`, `pkg-config`.
  - Common native deps: `libopenblas-dev`, `liblapack-dev`, `libssl-dev`, `liblz4-dev`, `libsnappy-dev`, `libzstd-dev`, `zlib1g-dev`, `libbz2-dev`.
  - Python toolchain: `python3`, `python3-dev`, `python3-venv`, then upgrades `pip`, `setuptools`, `wheel`, and `Cython` in a venv at `/venv`.
  - Installs `build-wheels.sh` at `/usr/local/bin/build-wheels.sh` and sets the entrypoint to `/bin/bash`.
- The `build-wheels.sh` script runs in four main phases:
  - ENV CHECK: Prints Python/pip versions and ensures Python build tooling (`pip`, `setuptools`, `wheel`, `packaging`) is present.
  - DISCOVERY: Finds `*.whl` in the working directory (`/work` by default) — these are the “seed” wheels you care about.
  - STRUCTURAL VALIDATION: For each seed wheel, installs it with `--no-deps` into a temporary directory to catch obvious packaging/import issues. If the install succeeds, the wheel is copied into the validated wheelhouse.
  - DEPENDENCY RESOLUTION: Reads `Requires-Dist` from the wheel’s `METADATA` (evaluating environment markers) and executes a BFS walk to build all direct and transitive dependencies from source as s390x wheels.
- Artifact detection and idempotency:
  - Produced wheels are detected by diffing the wheelhouse directory before/after each build step, with a fallback that parses `pip wheel` logs (`Saved ... .whl`).
  - The script canonicalizes project names using `packaging` and recognizes both `-` and `_` in filenames, so it won’t rebuild packages already present.
- Special-case handling for PyArrow:
  - PyArrow requires the Arrow C++ libraries. The script compiles Arrow C++ once (using CMake with `-DARROW_*` feature flags) and sets `CMAKE_PREFIX_PATH`/`Arrow_DIR` so PyArrow links against the local install.
  - It exports `PYARROW_BUNDLE_ARROW_CPP=0` to avoid bundling Arrow C++ and keeps build logs under `/work/build_logs`.
- Validates given wheels and builds all dependencies from source into a single wheelhouse directory.
- Targets s390x by building natively within an s390x container base image.
- Focuses on correctness and reproducibility rather than speed. Heavy packages (e.g., Arrow C++) will take significant time and RAM/CPU.
- Does not modify, retag, or “convert” foreign‑arch wheels. If a wheel is for another architecture, it is not repackaged; instead, dependencies are compiled natively.
- Base image: `mirror.gcr.io/library/ubuntu:24.04`
- Build tools: `build-essential`, compilers, CMake, Ninja, autotools, `pkg-config`
- Common numeric/compression libs: OpenBLAS/LAPACK, SSL, Snappy, LZ4, Zstandard, zlib, bzip2
- Python: system `python3`, plus a venv at `/venv` with up-to-date `pip`, `setuptools`, `wheel`, `Cython`
- Entrypoint: `/bin/bash`
- Script: `/usr/local/bin/build-wheels.sh`
`build-wheels.sh` is organized into reusable helpers and phases:
- Environment and tooling
  - Activates the venv and ensures `pip`, `setuptools`, `wheel`, and `packaging` are installed.
  - Configurable via the env vars `PY_BIN` and `PIP_BIN` (defaults: `/venv/bin/python3`, `/venv/bin/pip`).
- Name handling
  - Uses `packaging.requirements.Requirement` and `packaging.utils.canonicalize_name` to derive a project’s canonical base name from a PEP 508 spec. This prevents unnecessary rebuilds (e.g., it recognizes that `numpy>=1.20` and `NumPy` refer to the same project).
  - Detects already-built wheels under both hyphen and underscore forms in filenames.
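The normalization rule itself is small; here is a stdlib-only sketch of the PEP 503 behavior that `packaging.utils.canonicalize_name` implements (the helper names below are illustrative, not taken from the script):

```python
import re

def canonicalize_name(name: str) -> str:
    # PEP 503 normalization: collapse runs of '-', '_', '.' into a
    # single hyphen and lowercase the result.
    return re.sub(r"[-_.]+", "-", name).lower()

def base_name_from_spec(spec: str) -> str:
    # Extract the bare project name from a PEP 508 spec, ignoring
    # extras, version specifiers, and markers (a simplification of
    # what packaging.requirements.Requirement does).
    m = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", spec)
    return canonicalize_name(m.group(1)) if m else ""

print(base_name_from_spec("NumPy>=1.20"))        # numpy
print(base_name_from_spec("typing_extensions"))  # typing-extensions
```

Because every lookup goes through this normalization, `numpy`, `NumPy`, and a `numpy_*.whl` filename all map to the same key in the “already built” check.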
- Validation
  - `validate_wheel_no_deps`: Installs a wheel into a temporary directory with `--no-deps`. On success, the wheel is copied to the output wheelhouse.
- Metadata parsing
  - `requires_from_wheel`: Reads the wheel’s `METADATA` from the `.dist-info/` directory and prints normalized `Requires-Dist` lines, respecting environment markers for the current interpreter/platform.
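Since a wheel is just a zip archive with an RFC 822-style `METADATA` file, the extraction step can be sketched with the stdlib alone (the function name mirrors the helper described above, but this sketch skips the environment-marker evaluation the real helper performs via `packaging`):

```python
import zipfile
from email.parser import Parser

def requires_from_wheel(wheel_path: str) -> list[str]:
    # Locate <name>-<version>.dist-info/METADATA inside the wheel and
    # parse it like an email message: one header per metadata field.
    with zipfile.ZipFile(wheel_path) as zf:
        meta_name = next(n for n in zf.namelist()
                         if n.endswith(".dist-info/METADATA"))
        meta = Parser().parsestr(zf.read(meta_name).decode("utf-8"))
    # Each Requires-Dist header is one PEP 508 dependency spec.
    return meta.get_all("Requires-Dist") or []
```

Each returned line (e.g. `numpy>=1.20; python_version < "3.12"`) is then fed back into the canonical-name and build machinery.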
- Building
  - `build_spec_with_deps`: Runs `pip wheel --no-binary :all:` to force a source build of a PEP 508 spec into the wheelhouse. For PyArrow it precompiles Arrow C++ and sets the required environment variables so PyArrow links against it.
  - Tracks produced artifacts by diffing the wheelhouse state before/after; falls back to parsing pip’s “Saved ... .whl” lines when needed.
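The detection logic reduces to a small pure function; a minimal sketch, assuming `before`/`after` are directory listings of the wheelhouse and `pip_log` is pip’s captured output (the function name is illustrative):

```python
import re
from pathlib import PurePath

def detect_produced_wheels(before: set[str], after: set[str],
                           pip_log: str) -> set[str]:
    # Primary signal: wheel filenames present after the build but not
    # before it.
    produced = after - before
    if not produced:
        # Fallback: pip wheel prints "Saved <path>.whl" for each
        # artifact it wrote; recover the filenames from the log.
        produced = {PurePath(m.group(1)).name
                    for m in re.finditer(r"Saved\s+(\S+\.whl)", pip_log)}
    return produced
```

The fallback matters when a wheel is written somewhere other than the watched directory, or when pip reuses a cached build that the directory diff misses.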
- Dependency closure (BFS)
  - `build_dependency_closure`: Seeds the queue with all direct dependencies of the validated wheel. Each produced wheel’s dependencies are enqueued, skipping anything already present or previously processed, until the closure is exhausted.
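The walk itself is a standard BFS. A minimal sketch, with `deps_of` standing in for “build this spec, then read `Requires-Dist` from the produced wheel” (in the script that callback also performs the actual `pip wheel` build):

```python
from collections import deque
from typing import Callable, Iterable

def dependency_closure(seeds: Iterable[str],
                       deps_of: Callable[[str], list[str]]) -> list[str]:
    # Breadth-first walk: process each project once, enqueue its
    # dependencies, and stop when the queue is exhausted.
    queue = deque(seeds)
    seen: set[str] = set()
    order: list[str] = []
    while queue:
        name = queue.popleft()
        if name in seen:        # already built or already processed
            continue
        seen.add(name)
        order.append(name)      # here the script would build `name`
        queue.extend(d for d in deps_of(name) if d not in seen)
    return order

# Toy graph standing in for real wheel metadata:
graph = {"pyarrow": ["numpy"], "pandas": ["numpy", "python-dateutil"],
         "python-dateutil": ["six"], "numpy": [], "six": []}
print(dependency_closure(["pyarrow", "pandas"], lambda n: graph[n]))
# → ['pyarrow', 'pandas', 'numpy', 'python-dateutil', 'six']
```

Because `seen` is checked before every build, shared dependencies such as `numpy` above are built exactly once no matter how many seeds require them.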
- Logging
  - Pretty, timestamped logging with `ENV CHECK` and `PHASE 1–4` headers.
  - Writes detailed logs per build under `/work/build_logs`, including `arrow_cmake.out`, `arrow_make.out`, and `arrow_install.out` for the Arrow build.
- Build the s390x image:

  ```bash
  docker build -f Dockerfile.s390x -t s390x-wheelhouse:latest .
  ```

- Prepare your seed wheels directory on the host (these are the primary wheels you want to validate and for which you want dependencies built). For example, place `my_package-1.0.0-py3-none-any.whl` into `./wheels`.

- Run the container, mounting:
  - Your seed wheels directory → `/work`
  - An output directory → `/validated_wheels`
  - A logs directory → `/work/build_logs`

  ```bash
  mkdir -p wheelhouse build_logs
  docker run --rm \
    -v "$PWD/wheels:/work" \
    -v "$PWD/wheelhouse:/validated_wheels" \
    -v "$PWD/build_logs:/work/build_logs" \
    s390x-wheelhouse:latest \
    /usr/local/bin/build-wheels.sh
  ```

- Results:
  - Validated seed wheels and all s390x dependency wheels land in `./wheelhouse`.
  - Build/validation logs appear in `./build_logs`.
- Positional arguments: `build-wheels.sh [VALIDATED_DIR] [LOG_DIR]`
  - Defaults: `VALIDATED_DIR=/validated_wheels`, `LOG_DIR=/work/build_logs`

- Environment variables:
  - `PY_BIN`: Python interpreter (default `/venv/bin/python3`)
  - `PIP_BIN`: pip (default `/venv/bin/pip`)
  - Standard build variables like `CFLAGS`, `LDFLAGS`, and `CMAKE_PREFIX_PATH` may help for custom/advanced native builds.
- Building PyArrow explicitly:
  - PyArrow is handled automatically when it appears in the dependency graph. If you want to seed with PyArrow directly, place a PyArrow wheel (or a dependent seed wheel) in `/work`. The script compiles Arrow C++ if not already present and then builds PyArrow.
- Heavy native builds: Arrow C++, NumPy/SciPy (with OpenBLAS/LAPACK), and similar packages can take significant time and resources. Ensure the host has sufficient CPU/RAM and consider constraining container resources appropriately.
- Extra system deps: Some packages may need additional `apt` libraries not preinstalled here. Extend `Dockerfile.s390x` to add those if you encounter missing headers/libraries.
- Idempotent runs: Re-running with the same wheelhouse skips already-present base packages (thanks to canonical name matching and filename normalization).
- Environment markers: Dependencies are evaluated against the running environment (e.g., Python version). Changing the base Python may change resolved dependencies.
- Output wheelhouse: `/validated_wheels` (mount a host directory here to persist results).
- Logs directory: `/work/build_logs`
  - Per-package logs such as `build_<project>.log` and validation logs `validate_<wheel>.log`.
  - Arrow C++ build logs: `arrow_cmake.out`, `arrow_make.out`, `arrow_install.out`.
- Summary: At the end, the script prints a summary with the total wheels present and points to the logs directory.
- Build fails for a dependency
  - Check `build_<project>.log` in `/work/build_logs`.
  - Look for missing system headers/libs; extend `Dockerfile.s390x` to `apt install` what’s needed.
- No wheels produced for a package you expected
  - The script logs `[have]` when it detects that the wheelhouse already contains that base package.
  - If still unclear, examine the `Saved ... .whl` lines parsed from the pip log and verify the wheelhouse diff logic.
- PyArrow build errors
  - Inspect the `arrow_*` logs. Some Arrow features may require additional system libraries. Adjust the CMake flags in `build-wheels.sh` or add libraries in the Dockerfile.
- `Dockerfile.s390x` — s390x Ubuntu 24.04 base with toolchains and a Python venv; installs the build script.
- `build-wheels.sh` — orchestrates validation, dependency parsing, native builds, logging, and artifact detection.
- `LICENSE` — project license.
See LICENSE for details.