- ✓ AIP1 Isambard-AI Phase 1 supported
- ✓ AIP2 Isambard-AI Phase 2 supported
- ✗ I3 Isambard 3 unsupported
- ✗ BC5 BlueCrystal 5 unsupported
Alphafold
AlphaFold 2
We recommend running AlphaFold 2 through ColabFold on Isambard-AI, as it is the easiest way to access AlphaFold 2's protein structure prediction capabilities.
Key advantages:
- No AlphaFold installation needed: ColabFold includes its own lightweight implementation of the AlphaFold 2 JAX inference pipeline, so you can use the models without installing the full AlphaFold software stack.
- Flexible MSA generation: ColabFold uses MMseqs2 for multiple sequence alignment (MSA) searches, which can run against remote servers or locally for faster performance.
- Additional control: ColabFold offers more adjustable parameters than standard AlphaFold 2.
The sections below will guide you through setting up ColabFold and optionally configuring local MMseqs2 databases for improved performance.
AlphaFold 2 Dataset
The full AlphaFold 2 dataset is available on Isambard-AI in the directory /projects/public/brics/data/bio/alphafold.
user.brics@login44:~> ls /projects/public/brics/data/bio/alphafold
bfd mgnify params pdb70 pdb_mmcif pdb_seqres uniprot uniref30 uniref90
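Before pointing a pipeline at the shared dataset, it can help to confirm that every expected database entry is visible from your session. The sketch below assumes the directory listing shown above is complete; `check_alphafold_data` is a hypothetical helper name, not a tool provided on Isambard-AI.

```shell
# Hypothetical helper: report any expected AlphaFold 2 database entry that is
# missing under the given base directory (defaults to the shared dataset path).
check_alphafold_data() {
    base="${1:-/projects/public/brics/data/bio/alphafold}"
    missing=0
    for name in bfd mgnify params pdb70 pdb_mmcif pdb_seqres uniprot uniref30 uniref90; do
        if [ ! -e "$base/$name" ]; then
            echo "missing: $base/$name"
            missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "$missing"
}
```

Calling `check_alphafold_data` with no arguments checks the shared path; passing a directory checks a private copy instead.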
ColabFold
Prerequisites
- Have followed the instructions to install Conda in your user storage space
- Have followed the container introduction.
ColabFold can be installed either with Conda or as a Singularity container. Both options are described below.
Conda
To install ColabFold on Isambard-AI with Conda, first create the following environment definition file:
name: colabfold
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.12
  - pip
  - pip:
      - jax[cuda12]
      - tensorflow
      - git+https://github.com/sokrypton/ColabFold.git#egg=colabfold[alphafold]
You can download the file here: colabfold-env.yaml
We can then create the environment on a compute node:
$ srun --gpus 4 -N 1 --pty bash
$ conda env create -f colabfold-env.yaml
$ conda activate colabfold
# Check that JAX is using the GPU rather than falling back to the CPU
(colabfold) $ python3 -c "import jax; print(jax.default_backend())"
Finally, test colabfold_batch on the small P54025.fasta example file:
(colabfold) $ wget https://github.com/sokrypton/ColabFold/raw/main/test-data/P54025.fasta
(colabfold) $ colabfold_batch P54025.fasta output/ # Test colabfold_batch
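For repeated runs, the test above can be wrapped in a small guard that fails early when the input is missing or the environment is not active. This is a minimal sketch: `run_colabfold` is a hypothetical helper name, and it only uses the two positional arguments of `colabfold_batch` (input FASTA and output directory) shown above.

```shell
# Hypothetical wrapper around colabfold_batch with basic pre-flight checks.
run_colabfold() {
    fasta="$1"
    outdir="${2:-output}"
    # Refuse to start if the input FASTA is missing or empty.
    if [ ! -s "$fasta" ]; then
        echo "error: input FASTA '$fasta' is missing or empty" >&2
        return 1
    fi
    # colabfold_batch is only on PATH once the colabfold environment is active.
    if ! command -v colabfold_batch >/dev/null 2>&1; then
        echo "error: colabfold_batch not found; activate the colabfold environment first" >&2
        return 2
    fi
    mkdir -p "$outdir"
    colabfold_batch "$fasta" "$outdir"
}
```

With the environment active on a compute node, `run_colabfold P54025.fasta output/` behaves like the direct call, but returns a non-zero status with a message instead of starting a run that cannot succeed.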
Singularity
To run ColabFold with Singularity, first create a container definition file named colabfold.def with the following contents:
Bootstrap: docker
From: nvcr.io/nvidia/jax:24.04-py3
%post
    python3 -m pip install tensorflow jax[cuda12]
    python3 -m pip install --no-warn-conflicts 'colabfold[alphafold] @ git+https://github.com/sokrypton/ColabFold'
%environment
    export PATH=${HOME}/.local/bin:$PATH
    unset XLA_FLAGS
%runscript
    conda deactivate >/dev/null 2>&1 || true
    exec /bin/bash -i
Click here to download the file: colabfold.def
Then build and run the container:
$ singularity pull jax2404py3.sif docker://nvcr.io/nvidia/jax:24.04-py3
$ singularity build --fakeroot colabfold.sif colabfold.def
$ srun -N 1 --gpus 4 --pty singularity run --nv colabfold.sif
Finally, test colabfold_batch inside the container on the small P54025.fasta example file:
Singularity> wget https://github.com/sokrypton/ColabFold/raw/main/test-data/P54025.fasta
Singularity> colabfold_batch P54025.fasta output/ # Test colabfold_batch
Run MSA locally with MMseqs2
Prerequisites
- Have followed the instructions to install Conda in your user storage space
If you would like to reduce run time by running MMseqs2 locally on Isambard instead of using the remote MMseqs2 server, follow the installation instructions below. You can optionally add mmseqs2 to the conda environment above.
These commands create a conda environment named mmseq, install mmseqs2 and test it with an example DB.fasta:
conda create -n mmseq python=3.10
conda activate mmseq
conda install -c conda-forge -c bioconda mmseqs2
wget https://raw.githubusercontent.com/soedinglab/MMseqs2/refs/heads/master/examples/DB.fasta
mmseqs easy-cluster DB.fasta clusterRes tmp --min-seq-id 0.5 -c 0.8 --cov-mode 1
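To confirm the test run produced sensible results, the cluster table can be summarised with standard tools. This sketch assumes that easy-cluster wrote a tab-separated file named clusterRes_cluster.tsv, with the cluster representative in the first column and a member sequence in the second; `count_clusters` and `cluster_sizes` are hypothetical helper names.

```shell
# Count distinct clusters: each unique value in column 1 is one representative.
count_clusters() {
    cut -f1 "$1" | sort -u | wc -l | tr -d ' '
}

# List clusters by size, largest first: "<member count> <representative>".
cluster_sizes() {
    cut -f1 "$1" | sort | uniq -c | sort -rn
}
```

For example, `count_clusters clusterRes_cluster.tsv` prints the number of clusters, and `cluster_sizes clusterRes_cluster.tsv | head` shows the largest ones.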