
Python

On BriCS supercomputers, Python and Python packages can be installed and managed using tools such as pip and Conda. The recommended approach is to use Conda through Miniforge, which provides a lightweight and flexible way to manage environments and packages, including non-Python dependencies. Whichever package management method you choose, we recommend isolating each project's dependencies in a virtual environment, either a Conda environment or a Python venv.

Conda: Installing and Using Miniforge

Conda is a general-purpose, multi-platform package manager for installing and managing software packages and environments.

conda-forge

Our recommended installation method is Miniforge, which defaults to the community conda-forge channel; using the mainline Anaconda channels requires a licence.

To install the latest version of Conda:

$ cd $HOME
$ curl --location --remote-name "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
$ bash Miniforge3-$(uname)-$(uname -m).sh
$ rm Miniforge3-$(uname)-$(uname -m).sh

We advise against initialising your shell with conda init, to avoid the complications that come with modifying shell startup scripts. Instead, use the activate script provided with Miniforge (see below) to perform shell initialisation as needed.

Then, to activate Conda:

$ source ~/miniforge3/bin/activate

You should not install packages in your base environment; instead, create separate environments to manage your software packages.

E.g. to create an environment named test with an installation of Python 3.10:

(base) $ conda create -n test python=3.10 
(base) $ conda activate test

You can then install further packages in your test environment using conda install. E.g. to install the Python package scipy:

(test) $ conda install scipy

It can be useful to specify the packages for a Conda environment in an environment YAML file. This allows the environment to be created with a single command:

$ conda env create -f environment.yml
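
For example, an environment.yml equivalent to the test environment above might look like the following (the file name and package list are illustrative):

name: test
channels:
  - conda-forge   # the default channel for Miniforge installations
dependencies:
  - python=3.10
  - scipy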

You can list the installed packages in your environment using conda list and you can deactivate your environment using conda deactivate.
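
For example, continuing in the test environment created above:

(test) $ conda list         # show the packages installed in the active environment
(test) $ conda deactivate   # return to the base environment
(base) $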

Finding aarch64 compatible Conda packages

Since Isambard clusters are mainly based on the Linux Arm 64-bit architecture (aarch64), it is important to find packages built for this architecture. To find them, you can search the Anaconda.org website and filter the Platform to linux-aarch64.

Screenshot of Anaconda.org website showing search results and a Platform filtering menu
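
You can also search for a specific platform from the command line. A minimal sketch, using scipy as an illustrative package (the --platform option selects the target platform to search, rather than the platform you are running on):

$ conda search --platform linux-aarch64 --channel conda-forge scipy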

Installing Python Packages

We recommend using Conda environments to install and manage your Python packages, as described above.

Alternatively, Cray Python is available as a pre-installed module. It can be accessed as follows:

$ module avail # list available modules
...
$ module load cray-python
$ which python3
/opt/cray/pe/python/3.11.5/bin/python3
$ python3
Python 3.11.5 (main, Nov 29 2023, 20:19:53) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()

Using Virtual Environments

When using Python we recommend working inside virtual environments created with the standard library venv module. Virtual environments isolate the pip-installed dependencies of each project.

Let's create a venv:

$ mkdir ~/.virtualenvs/ # Create folder for virtual environments
$ python3 -m venv --upgrade-deps ~/.virtualenvs/test2

To activate our test2 environment and install the package scipy:

$ source ~/.virtualenvs/test2/bin/activate
(test2) $ which python3
$HOME/.virtualenvs/test2/bin/python3
(test2) $ python3 -m pip install scipy

To list the installed packages in your environment:

(test2) $ python3 -m pip list

pip vs python3 -m pip

Note that we invoke pip as a module, i.e. python3 -m pip, rather than calling pip directly. This avoids confusion about which Python installation or virtual environment pip is acting on. You can exit your environment using deactivate.
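
For example, to confirm which pip python3 -m pip resolves to and then leave the environment (continuing in the test2 environment from above):

(test2) $ python3 -m pip --version   # reports the pip version and the Python it belongs to
(test2) $ deactivate
$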

Installing Python Packages for Arm 64 on Linux (aarch64)

Isambard supercomputers mainly use the Arm 64 CPU architecture (see Specifications). Many Python packages do not publish wheels (precompiled binaries) for aarch64. If a package is not available through Conda or a container, you will need to build it from source. This section will help you understand the usual Python build systems.
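
One way to check whether a package publishes aarch64 wheels on PyPI is to ask pip to download binary wheels only, for that platform; the command fails if no compatible wheel exists. A minimal sketch, using scipy as an illustrative package and assuming Python 3.11:

$ python3 -m pip download scipy --only-binary=:all: --platform manylinux2014_aarch64 --python-version 3.11 --dest /tmp/aarch64-wheels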

Machine Learning Packages 🤗

Support for aarch64 in popular machine learning packages is listed on our applications page.

Compilers

The default compilers on Isambard supercomputers are older versions that form part of the underlying operating system build toolchain, and they are typically unsuitable for building research software. We recommend setting the CC and CXX environment variables so that pip uses modern C and C++ compilers; this is often enough to allow a package to be installed using pip on Arm.

In the following example, the available GCC C++ compilers are listed and the CC and CXX variables are set to select a particular version for installing a Python package:

user@nid001040> ls /usr/bin/g++*
/usr/bin/g++  /usr/bin/g++-12  /usr/bin/g++-13  /usr/bin/g++-7
user@nid001040> CC=/usr/bin/gcc-12 CXX=/usr/bin/g++-12 pip install <PACKAGE_NAME>

setup.py and pyproject.toml

Looking through a package's setup.py (or pyproject.toml) should be the first port of call. The setup.py file is the traditional build script for Python packages. Check for conditional logic based on sys.platform or the platform module. For example:

import sys
import platform

# enable Arm-specific compiler flags only on 64-bit Arm Linux
if sys.platform == "linux" and platform.uname().machine == "aarch64":
    extra_compile_args = ["-march=armv8-a"]

Modern Python packages often use pyproject.toml instead of setup.py. It is important to check that the [build-system] section specifies a compatible build backend, such as setuptools or Poetry.

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

Understanding sys.platform and the platform module

When building Python packages, the system's platform and architecture are often used to determine compatibility. Python provides sys.platform and the platform module for this purpose:

  • sys.platform: A string that identifies the operating system. Common values include:

    • "linux" for Linux systems.
    • "darwin" for macOS.
    • "win32" for Windows.
  • platform.uname(): Provides detailed system information, including the machine architecture (e.g., x86_64, aarch64).

For example, here is the result of running these on Isambard-AI:

>>> import sys
>>> sys.platform
'linux'
>>> import platform
>>> platform.uname()
uname_result(system='Linux', node='nid001041', release='5.14.21-150500.55.31_13.0.53-cray_shasta_c_64k', 
version='#1 SMP Mon Dec 4 22:56:47 UTC 2023 (03d3f83)', machine='aarch64')
>>> platform.uname().machine
'aarch64'

This is particularly useful when inspecting a package's setup.py or pyproject.toml to ensure compatibility with aarch64.

Dependencies

In setup.py, look for the install_requires argument or references to a requirements.txt file. Ensure all dependencies are available for aarch64.

For pyproject.toml, check for dependencies under the [project] table's dependencies key, [tool.poetry.dependencies], or equivalent sections.
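
For example, a PEP 621-style pyproject.toml declares dependencies like this (the package name and version pins below are illustrative):

[project]
name = "example-package"
dependencies = [
    "numpy>=1.24",   # check each entry has aarch64 wheels or can be built from source
    "scipy",
]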

CI/CD or GitHub Actions

In a Python package's repository, GitHub Actions workflows often indicate which platforms and architectures the package is built for. Built wheels are named with a platform tag, in the form

{PACKAGE_NAME}-{VERSION}-{PYTHON_TAG}-{ABI_TAG}-{PLATFORM_TAG}.whl

so a wheel whose name ends in linux_x86_64.whl will not run on aarch64, whereas one ending in, for example, manylinux2014_aarch64.whl will.

Look for .github/workflows in the repository. For example:

jobs:
  build:
    runs-on: ubuntu-latest        # an x86_64 runner; no aarch64 build job here
    steps:
      - name: Check out source
        uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: "3.8"
      - name: Build wheel
        run: python3 setup.py bdist_wheel

This can help you determine if aarch64 is supported or if modifications are needed.

Useful resources: