  • AIP1 Isambard-AI Phase 1 supported
  • AIP2 Isambard-AI Phase 2 supported
  • I3 Isambard 3 unsupported
  • BC5 BlueCrystal 5 unsupported

AlphaFold

AlphaFold 2

To run AlphaFold 2 on Isambard-AI we recommend ColabFold, as it is the easiest way to access AlphaFold 2's protein structure prediction capabilities.

Key advantages:

  • No AlphaFold installation needed: ColabFold includes its own lightweight implementation of AlphaFold 2's JAX inference pipeline, so you can use the models without installing the full AlphaFold software stack.
  • Flexible MSA generation: ColabFold uses MMseqs2 for multiple sequence alignment (MSA) searches, which can run against remote servers or locally for faster performance.
  • Additional control: ColabFold offers more adjustable parameters than standard AlphaFold 2.

The sections below will guide you through setting up ColabFold and optionally configuring local MMseqs2 databases for improved performance.

AlphaFold 2 Dataset

The full AlphaFold 2 dataset is available on Isambard-AI in the directory /projects/public/brics/data/bio/alphafold.

user.brics@login44:~> ls /projects/public/brics/data/bio/alphafold
bfd  mgnify  params  pdb70  pdb_mmcif  pdb_seqres  uniprot  uniref30  uniref90
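
As a quick sanity check before pointing a pipeline at the shared dataset, the listing above can be verified programmatically. The following is a minimal Python sketch, not part of the official tooling: the helper name `missing_databases` is hypothetical, and the expected subdirectory names are copied from the listing above.

```python
from pathlib import Path

# Subdirectory names copied from the directory listing above.
EXPECTED_DATABASES = [
    "bfd", "mgnify", "params", "pdb70", "pdb_mmcif",
    "pdb_seqres", "uniprot", "uniref30", "uniref90",
]


def missing_databases(root, expected=EXPECTED_DATABASES):
    """Return the expected subdirectories that are absent under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    root = "/projects/public/brics/data/bio/alphafold"
    missing = missing_databases(root)
    if missing:
        print(f"Missing under {root}: {', '.join(missing)}")
    else:
        print(f"All {len(EXPECTED_DATABASES)} expected databases found under {root}")
```

The check only confirms that the directories exist. It does not validate their contents.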

ColabFold

Prerequisites

ColabFold can be installed either with Conda or as a Singularity container. Both options are described below.

Conda

To install ColabFold on Isambard-AI with Conda, first copy the environment definition file:

colabfold-env.yaml
name: colabfold
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.12
  - pip
  - pip:
    - jax[cuda12]
    - tensorflow
    - git+https://github.com/sokrypton/ColabFold.git#egg=colabfold[alphafold]

You can download the file here: colabfold-env.yaml

We can then create the environment on a compute node:

$ srun --gpus 4 -N 1 --pty bash
$ conda env create -f colabfold-env.yaml
$ conda activate colabfold
# Check that JAX is using the GPU
(colabfold) $ python3 -c "import jax; print(jax.default_backend())"

Finally, test colabfold_batch on the small P54025.fasta example file:

(colabfold) $ wget https://github.com/sokrypton/ColabFold/raw/main/test-data/P54025.fasta
(colabfold) $ colabfold_batch P54025.fasta output/ # Test colabfold_batch
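
Before submitting your own sequences, it can help to confirm that the input FASTA file parses as expected. The following is a minimal Python sketch using only the standard library. The function names `read_fasta` and `summarise` are hypothetical helpers and not part of ColabFold.

```python
from pathlib import Path


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def summarise(records):
    """Return one 'header<TAB>length' line per record."""
    return [f"{header}\t{len(seq)}" for header, seq in records]


if __name__ == "__main__":
    path = Path("P54025.fasta")  # the test file downloaded above
    if path.is_file():
        print("\n".join(summarise(read_fasta(path.read_text()))))
    else:
        print(f"{path} not found")
```

An empty summary or a sequence length of 0 usually points to a malformed input file.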

Singularity

To run ColabFold with Singularity, first create a container definition file named colabfold.def with the following contents:

colabfold.def
Bootstrap: docker
From: nvcr.io/nvidia/jax:24.04-py3

%post
    python3 -m pip install tensorflow jax[cuda12]
    python3 -m pip install --no-warn-conflicts 'colabfold[alphafold] @ git+https://github.com/sokrypton/ColabFold'

%environment
    export PATH=${HOME}/.local/bin:$PATH
    unset XLA_FLAGS

%runscript
    conda deactivate >/dev/null 2>&1 || true
    exec /bin/bash -i

Click here to download the file: colabfold.def

Then build and run the container:

$ singularity pull jax2404py3.sif docker://nvcr.io/nvidia/jax:24.04-py3
$ singularity build --fakeroot colabfold.sif colabfold.def
$ srun -N 1 --gpus 4 --pty singularity run --nv colabfold.sif

Finally, test colabfold_batch inside the container on the small P54025.fasta example file:

Singularity> wget https://github.com/sokrypton/ColabFold/raw/main/test-data/P54025.fasta
Singularity> colabfold_batch P54025.fasta output/ # Test colabfold_batch
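
The interactive session above can also be turned into a single non-interactive command, which is convenient for batch scripts. The following Python sketch only assembles and prints such a command. It assumes that `singularity exec` is used in place of `singularity run` and that colabfold_batch is on the PATH inside the container. The helper name `container_command` is hypothetical, and the resource flags mirror the srun line above.

```python
import shlex


def container_command(fasta, outdir, image="colabfold.sif", gpus=4):
    """Build an srun + singularity exec command as an argument list."""
    return [
        "srun", "-N", "1", "--gpus", str(gpus),
        "singularity", "exec", "--nv", image,
        "colabfold_batch", fasta, outdir,
    ]


if __name__ == "__main__":
    cmd = container_command("P54025.fasta", "output/")
    # Print only; pass cmd to subprocess.run(cmd, check=True) to launch it.
    print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems when file names contain spaces.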

Run MSA locally with MMseqs2

Prerequisites

To speed up MSA generation, you can run MMseqs2 locally on Isambard instead of using the remote MMseqs2 server. Follow the installation instructions below. Optionally, you can add mmseqs2 to the colabfold Conda environment created above.

These commands create a Conda environment named mmseq, install mmseqs2, and test it with an example DB.fasta file:

conda create -n mmseq python=3.10
conda activate mmseq
conda install -c conda-forge -c bioconda mmseqs2
wget https://raw.githubusercontent.com/soedinglab/MMseqs2/refs/heads/master/examples/DB.fasta
mmseqs easy-cluster DB.fasta clusterRes tmp --min-seq-id 0.5 -c 0.8 --cov-mode 1
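
To confirm that the test run produced sensible output, the cluster assignments can be inspected. The following is a minimal Python sketch, assuming that easy-cluster wrote a two-column, tab-separated file named clusterRes_cluster.tsv (representative ID, then member ID) in the working directory. The helper name `read_clusters` is hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def read_clusters(lines):
    """Group member IDs by representative ID from two-column TSV lines."""
    clusters = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        representative, member = line.split("\t")[:2]
        clusters[representative].append(member)
    return dict(clusters)


if __name__ == "__main__":
    path = Path("clusterRes_cluster.tsv")  # assumed output file name
    if path.is_file():
        with path.open() as handle:
            clusters = read_clusters(handle)
        largest = max((len(m) for m in clusters.values()), default=0)
        print(f"{len(clusters)} clusters, largest has {largest} members")
    else:
        print(f"{path} not found")
```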