
Isambard Summit 2026 — Full Programme

Draft programme

This programme is a draft and is subject to change. Speakers, titles, timings, and abstracts may be updated before the event.


Day 1

Session 1: Welcome and Keynote | 10:00–11:15

Simon McIntosh-Smith (Bristol Centre for Supercomputing)

10:00 — Welcome and introduction to Bristol Centre for Supercomputing


UK Government Representative

10:10

Speaker and title to be confirmed.


Fred Manby (Iambic)

10:20 — Building the NeuralPLexer4 Co-Folding Model

We are developing NeuralPLexer4, a next-generation biomolecular co-folding model that predicts the atomic structure of proteins bound to drug-like molecules. We are building on NeuralPLexer3, already a world-leading model, by scaling the model architecture, integrating experimental and non-structural signals such as binding affinity and potency, and incorporating physics-based synthetic data. This effort is enabled by a generous award of time on the Isambard AI facility.


Richard Gilham (Bristol Centre for Supercomputing)

11:05 — Conference tracks and dinner


Session 2: Large Language Models and AI Research | 11:45–13:00

Pontus Stenetorp (University College London)

11:45 — UK-LLM after three years: Reflections and a road map for the future

In 2023, UK-LLM (previously BritLLM) became the first effort to train a large language model (LLM) solely on British computational resources. Since then, we have made three main releases, the latest of which outperforms even large-scale commercial models for Welsh. This has been made possible by a combination of factors: readily available national compute resources, close collaborations, and cutting-edge scientific innovation. In this talk, I will reflect on how we have navigated the fast-moving LLM landscape and lay out a road map for what we hope to accomplish with UK-LLM over the next few years.


Aleksej Zelezniak (King's College London)

12:10 — AI for Synthetic Genome Design

We believe biology is programmable. The central dogma defines an information-processing system that can be modelled, learned, and engineered. If we achieve quantitative control over the flow from DNA to protein, we enable scalable therapeutics and sustainable biomanufacturing.

Our research integrates generative machine learning with synthetic biology to move from sequence analysis to sequence design. We focus on two core challenges: (i) learning sequence-to-expression relationships to generate regulatory DNA with predictable output, and (ii) exploring protein sequence space to engineer improved function.

We develop quantitatively predictive and generative models that capture mechanistic structure in biological sequence landscapes. These models are embedded in experimental validation pipelines, forming closed learning loops where design, testing, and retraining continuously refine performance. In this talk, I will present recent results and outline a roadmap towards programmable, model-driven biological engineering.


Huw Day (University of Bristol)

12:35 — Understanding Partitioned Learning Dynamics

Federated Learning arises when a model must be trained on data that cannot all be stored in one place, as frequently happens in domains with private data such as healthcare or finance. Continual Learning arises when a model is shown different data over time: a model trained to predict the weather from data collected 50 years ago will perform worse today because climate change has shifted the underlying weather. In previous theoretical work, we showed that both learning paradigms fall under the umbrella term of "Partitioned Learning". Thanks to Isambard AI, we are able to simulate these complex systems and compare how their learning dynamics vary with respect to their data distributions.
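The contrast between the two paradigms can be sketched with a toy experiment (hypothetical code, not the speaker's implementation): train a one-parameter linear model on three partitions with different data distributions, once by averaging per-partition gradients every round (federated-style) and once by visiting the partitions sequentially (continual-style).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_partition(slope, n=200):
    # Each partition draws from its own data distribution (different slope).
    x = rng.normal(size=n)
    y = slope * x + 0.1 * rng.normal(size=n)
    return x, y

def grad(w, x, y):
    # Gradient of mean squared error for the 1-D linear model y ~ w * x.
    return 2 * np.mean((w * x - y) * x)

partitions = [make_partition(s) for s in (1.0, 2.0, 3.0)]
lr, steps = 0.1, 100

# Federated-style: average per-partition gradients every round.
w_fed = 0.0
for _ in range(steps):
    w_fed -= lr * np.mean([grad(w_fed, x, y) for x, y in partitions])

# Continual-style: visit partitions one after another.
w_cont = 0.0
for x, y in partitions:
    for _ in range(steps):
        w_cont -= lr * grad(w_cont, x, y)

print(w_fed, w_cont)
```

The federated run settles near the mean of the partition slopes, while the sequential run drifts towards the most recent partition, a simple picture of catastrophic forgetting.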


Session 3: AI for Health | 14:00–15:15

Aldo Faisal (Imperial College London)

14:00 — Nightingale-AI

Nightingale AI is an AIRR flagship effort to build sovereign, open medical foundation models. Unlike language-only models, medical AI must learn across unified multimodal data — imaging, biosignals, genomics, and clinical text — demanding innovations in architecture, scaling, and interpretability. Leveraging exascale compute, Nightingale AI pioneers an "AI factory" approach that is capable of fusing national-scale datasets with immediate healthcare impact. This talk will share an overview of our work to date, and how we have partnered from day one with Isambard AI at an unprecedented scale of compute for academic and medical research teams.


Jon Lees (University of Bristol)

14:25 — AI as a Bridge Across Cellular Scales

Understanding how cells work requires us to connect information across different spatial scales, from molecular interactions to whole-cell states. In this talk, I will present how artificial intelligence can act as a unifying bridge across these cellular scales. We have been using Isambard AI to predict structural interactions of proteins and to improve the inference of macromolecular assemblies; scaling these approaches requires substantial compute. To validate predicted structural models against experimental cryo-EM data, we run the image-processing pipeline workflows efficiently on Isambard AI. Finally, I will highlight AI-driven approaches for classifying cellular states in both healthy and disease contexts.


Gregory Verghese (PharosAI, King's College London)

14:50 — Towards Transparent AI in Computational Pathology: Multimodal Concept Learning for Clinical AI

Artificial intelligence (AI) in pathology promises to transform precision oncology, yet clinical adoption remains limited by the opacity of deep learning systems. Without transparent reasoning, high-performing models risk eroding clinician trust and regulatory approval. Concept-based approaches improve interpretability by structuring predictions around clinically meaningful variables; however, most rely on binary representations that fail to reflect the inherently categorical nature of clinical knowledge.

We propose C²EM, a categorical concept embedding model trained on approximately 3,800 patients across 10 cancer types from The Cancer Genome Atlas (TCGA) using whole-slide histopathology images to predict survival. The model learns mutually exclusive multimodal concepts derived from clinical variables (age, tumour stage, cancer type) and a pan-cancer transcriptomic prognostic biomarker, preventing incompatible co-occurrence and reducing concept–task and inter-concept leakage. Concept-specific attention heads enable per-concept visual interpretability, while a post-concept attention mechanism aggregates concept importance for Cox survival prediction.

C²EM outperforms the baseline CEM on concept prediction while maintaining comparable survival discrimination (C-index ~0.7) and enhancing interpretability through reduced concept–task and inter-concept leakage, yielding more faithful and disentangled representations. Furthermore, we highlight the practical benefits of concept-based reasoning through simulated clinician interventions: correcting erroneous concepts produces monotonic improvements in survival performance, underscoring the effectiveness of our clinician-in-the-loop framework.

By aligning model structure with categorical clinical knowledge, C²EM demonstrates strong predictive performance, interpretability, and robust human-in-the-loop refinement. This framework aligns model interpretability with clinical reasoning, enabling faithful and verifiable concept representations that clinicians can interrogate and correct, supporting trustworthy deployment of AI systems in pathology.
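A worked sketch of the Cox objective behind the survival head may help: the model's risk scores are trained by minimising the negative partial log-likelihood. The function below is a generic illustration in Python, not C²EM's actual implementation.

```python
import numpy as np

def cox_neg_log_partial_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood.

    risk  : model-predicted log-risk scores, shape (n,)
    time  : observed follow-up times, shape (n,)
    event : 1 if the event occurred, 0 if censored, shape (n,)
    """
    risk, time, event = map(np.asarray, (risk, time, event))
    ll = 0.0
    for i in np.flatnonzero(event == 1):
        at_risk = time >= time[i]  # risk set: still under observation at t_i
        ll += risk[i] - np.log(np.sum(np.exp(risk[at_risk])))
    return -ll
```

With two uncensored subjects at equal risk scores, the loss reduces to log 2, the contribution of the earlier event's two-member risk set.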


Session 4: AI for Advanced Materials | 15:45–17:00

Matthew Foulkes (Imperial College London)

15:45 — Neural Wavefunctions for Materials Chemistry and Physics

The dream of bypassing complex and expensive experiments in materials chemistry and physics by solving the quantum mechanical many-particle Schrödinger equation using computers has animated scientists for decades. Although we are sure that this would work in principle, computing exact solutions is difficult and practical methods rely on approximations that cannot easily be tested.

Over the past few years, we have pioneered a new approach to this problem, approximating many-particle wavefunctions as deep neural networks and learning the parameters using the variational principle, without requiring externally generated data. This produces very accurate results and sometimes unveils features of chemistry and physics we had not anticipated. This talk will introduce the approach and describe some of our recent work on superconductors, quantum Hall systems, altermagnets, positron annihilation, and muon spin resonance.
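The variational principle underlying this approach can be illustrated with a classical, non-neural toy: a hydrogen-atom trial wavefunction psi_alpha(r) = exp(-alpha r), sampled with Metropolis Monte Carlo. In the actual method the ansatz is a deep neural network; everything below is a simplified stand-in.

```python
import numpy as np

rng = np.random.default_rng(1)

def local_energy(r, alpha):
    # E_L = (H psi)/psi for psi = exp(-alpha r), hydrogen atom, atomic units.
    return -0.5 * alpha**2 + (alpha - 1.0) / r

def vmc_energy(alpha, n_steps=20000, step=0.5):
    # Metropolis sampling of |psi|^2, then average the local energy.
    pos = np.array([1.0, 0.0, 0.0])
    r = np.linalg.norm(pos)
    energies = []
    for _ in range(n_steps):
        trial = pos + step * rng.uniform(-1, 1, size=3)
        r_trial = np.linalg.norm(trial)
        # Acceptance ratio |psi(trial)/psi(pos)|^2 = exp(-2 alpha (r_trial - r))
        if rng.uniform() < np.exp(-2 * alpha * (r_trial - r)):
            pos, r = trial, r_trial
        energies.append(local_energy(r, alpha))
    return np.mean(energies)
```

At alpha = 1 the trial function is exact: the local energy is constant at -0.5 Hartree and the variance vanishes. For any other alpha the estimated energy lies above -0.5, which is the variational principle in action; a neural ansatz simply makes the family of trial functions vastly more expressive.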


Gabor Csanyi (University of Cambridge)

16:10 — MACE force field models for the periodic table

I will report on our latest efforts to create universally applicable machine learning force fields using the MACE architecture. Large publicly available databases (such as OMAT and OMOL) and large-scale GPU compute allow the construction of force field models that cover most of the periodic table and are suitable out of the box for exploration tasks, and in some cases (e.g. organic molecules) for accurate production-level simulations. Fine-tuning material models with very little effort yields near-DFT accuracy. The latest models include electrostatic interactions with some notion of self-consistency.


Panel: Access to Scale

16:35 — SME focus panel: title to follow

Moderator: Jessica Driscoll (NVIDIA)

Panellists: Richard Gilham (Bristol Centre for Supercomputing), Jamil Appa (Zenotech), Wasil Rezk (BeyondMath), Edward Inns (Cambridge Innovation Capital)

For deep-tech startups, the path from proof-of-concept to production is often paved with prohibitive costs and hardware scarcity. This panel tackles the single biggest hurdle facing UK founders: access to scale. Join us for an informative discussion between Isambard's infrastructure leaders, VCs, and startup pioneers as we learn how to leverage Isambard's subsidised GPU allocations to de-risk your technology, what investors demand before writing the cheque, and how to turn sovereign compute into your competitive advantage.


Day 2

Session 5: Keynote | 09:00–10:15

Welcome Day 2 (Bristol Centre for Supercomputing)

09:00

Speaker to be confirmed.


Jeffrey S. Vetter (Oak Ridge National Laboratory)

09:05 — Keynote: Title TBC

Abstract to follow.


David Topping (University of Manchester)

09:50 — Partnerships at Scale: HPC, AI and the Future of Environmental Decision-Making

High-performance computing is no longer just about simulation; it is becoming the foundation for training environmental intelligence at scale. In this talk, we present our work using Isambard to train NVIDIA's CorDiff model as part of the PolluGen project, demonstrating how diffusion-based AI can transform air pollution modelling and accelerate environmental insight.

But this is only the beginning. We situate this work within a broader research programme exploring how large language models and AI-driven discovery tools can reshape how researchers search, synthesise, and generate environmental knowledge. Together, these efforts point toward a new paradigm: HPC not only as infrastructure for computation, but as an engine for discovery.

Crucially, delivering impact in this space depends on effective partnerships. Bringing together academia, policymakers, research councils, and technology vendors is essential to translate technical capability into societal value. We will reflect on what we have learned about building these collaborations and why they are central to the UK's ambition in AI-enabled environmental science.


Session 6: AI Security | 10:45–12:00

Jason Gwartz (AI Security Institute)

10:45 — An Introduction to the AI Security Institute and AI Safety Research on Isambard

The UK AI Security Institute is one of the leading institutions researching AI safety and risk. Working directly within the UK government, AISI pursues a research agenda covering topics such as cybersecurity, biological weapons, and AI loss of control; this work directly informs the UK government about the near-term and longer-term risks from frontier AI.

Since the earliest Isambard AI pilots, AISI has been using Isambard as a critical part of our AI safety research. In this session, we will highlight some of the recent projects and papers published by AISI that were powered by Isambard AI. We will also discuss AISI's usage patterns of Isambard, ranging from interactive daily coding use to large-scale training and fine-tuning jobs, and how we have accelerated our pace with AI coding agents. This session will serve as an introduction to other AISI talks at Isambard Summit where these topics will be examined in more detail.


Yalli Du (King's College London)

11:10 — Evaluating the cooperative behaviour of systems of generative agents

We study how hundreds of LLM agents behave collectively in social dilemmas. We propose an evaluation framework in which LLMs generate explicit algorithmic strategies, making agent behaviour inspectable before deployment and scalable to large populations. We find that newer models can produce worse societal outcomes than older ones when optimising for individual gain, and simulations of cultural evolution suggest a risk of convergence to poor collective equilibria, especially in larger populations and when cooperation is less rewarding.
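The framework's key idea, that agents emit explicit and inspectable strategies rather than opaque per-turn completions, can be illustrated with an iterated prisoner's dilemma. The strategies below are hand-written stand-ins for LLM-generated code.

```python
import itertools

# Stand-ins for LLM-generated strategies: each maps the opponent's move
# history to a move ("C" cooperate, "D" defect).
def tit_for_tat(opp_history):
    return opp_history[-1] if opp_history else "C"

def always_defect(opp_history):
    return "D"

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def play(strat_a, strat_b, rounds=50):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa; score_b += pb
        hist_a.append(a); hist_b.append(b)
    return score_a, score_b

def social_welfare(population, rounds=50):
    # Total payoff across all pairwise matches: a population-level outcome.
    total = 0
    for s1, s2 in itertools.combinations(population, 2):
        a, b = play(s1, s2, rounds)
        total += a + b
    return total
```

A population of reciprocators sustains far higher total welfare than one of defectors, and because each strategy is ordinary code, such collective outcomes can be audited before deployment and scaled to hundreds of agents.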


Sid Black (AI Security Institute)

11:35 — Auditing games for sandbagging detection

This research tested 10 methods for detecting AI "sandbagging" (deliberate underperformance during evaluations) using a red team vs. blue team game. Overall, the auditing game revealed that current methods may be insufficient to reliably detect sandbagging. No silver bullet exists yet, and more effective methods require deep model access that external evaluators often lack.


Session 7: AI Research and Closing Address | 13:00–15:00

Zilin Wang (University of Oxford)

13:00 — Learning to Drive in New Cities Without Human Demonstrations

A key bottleneck of large-scale deployment of autonomous driving is the need to collect many human demonstrations when adapting driving policies to new cities. In this presentation, I will introduce NO data Map-based self-play for Autonomous Driving (NOMAD), which enables policy adaptation in a simulator constructed based on the target-city map. Using a simple reward function, NOMAD substantially improves both task success rate and trajectory realism in target cities, demonstrating an effective and scalable alternative to data-intensive city-transfer methods.


Eghbal Rahimikia (University of Manchester)

13:25 — Re(Visiting) Time Series Foundation Models in Finance

Financial time series forecasting is critical for trading, portfolio optimisation, and risk management but remains difficult due to noisy and non-stationary data. Time series foundation models (TSFMs) offer a new approach to learning generalisable temporal representations. This paper provides an empirical study of TSFMs in global financial markets using daily excess returns. We compare zero-shot inference, fine-tuning, and pre-training from scratch. Results show that off-the-shelf TSFMs perform poorly, while models pre-trained on financial data deliver significantly better forecasting and economic performance. The paper is available on SSRN.


Bidipta Sarkar (University of Oxford)

13:50 — Evolution Strategies at the Hyperscale

Evolution Strategies (ES) is a class of powerful black-box optimisation methods that are highly parallelisable and can handle non-differentiable and noisy objectives. However, naïve ES becomes prohibitively expensive at scale on GPUs due to the low arithmetic intensity of batched matrix multiplications with unstructured random perturbations. We introduce Evolution Guided GeneRal Optimisation via Low-rank Learning (EGGROLL), which improves arithmetic intensity by structuring individual perturbations as rank-r matrices, resulting in a hundredfold increase in training speed for billion-parameter models at large population sizes, achieving up to 91% of the throughput of pure batch inference. We provide a rigorous theoretical analysis of ES for high-dimensional parameter objectives, investigating the conditions needed for ES updates to converge in high dimensions. Our results reveal a linearising effect and prove consistency between EGGROLL and ES as the parameter dimension increases. Our experiments show that EGGROLL: (1) enables the stable pretraining of nonlinear recurrent language models that operate purely in integer datatypes, (2) is competitive with GRPO for post-training LLMs on reasoning tasks, and (3) does not compromise performance compared to ES in tabula rasa RL settings, despite being faster. Code is available at eshyperscale.github.io.
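The core trick, replacing a dense ES perturbation with a rank-r one so that the perturbed forward pass stays cheap, can be sketched as follows. This is a conceptual illustration; EGGROLL's actual parameterisation and scaling conventions may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r, pop = 512, 512, 4, 8   # layer shape, perturbation rank, population size

W = rng.normal(size=(m, n)) / np.sqrt(n)
x = rng.normal(size=(n,))
sigma = 0.01

# Full-rank ES would sample a dense (m, n) perturbation per population member.
# Rank-r ES samples two thin factors instead: E = A @ B.T / sqrt(r).
A = rng.normal(size=(pop, m, r))
B = rng.normal(size=(pop, n, r))

# Perturbed forward passes without ever materialising the (m, n) perturbations:
# (W + sigma * A B^T / sqrt(r)) x = W x + sigma * A (B^T x) / sqrt(r)
base = W @ x                                                  # shared across pop
low_rank = np.einsum("pmr,pr->pm", A, np.einsum("pnr,n->pr", B, x))
outputs = base + sigma * low_rank / np.sqrt(r)                # shape (pop, m)
```

The dense path costs O(pop * m * n) extra work and memory per layer, whereas the factored path adds only O(pop * (m + n) * r), which is where the arithmetic-intensity gain comes from.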


Eltayeb Ahmed and Anya Sims (University of Oxford)

14:15 — Reinforcement Learning for Mid-Training on Unstructured Text

Reinforcement learning (RL) has been shown to improve reasoning capabilities in LLMs. However, RL for LLMs currently relies heavily on high-quality, curated question-answer datasets. These are prohibitively expensive to produce at scale, thus limiting scalability. To overcome this, we investigate training LLMs with RL on cheap, readily available unstructured text. We do this by producing "fill in the gaps" (FIG)-style questions, in which a random section of text is removed and the model is tasked with using chain-of-thought reasoning to reconstruct the missing content. This effectively transforms readily available arbitrary text into challenging, diverse questions at zero cost. A judge LLM is then used for the comparatively easier task of comparing the model's guess to the ground truth to give a reward for RL training. We find that this RL mid-training substantially boosts performance. Moreover, the RL mid-trained models behave markedly differently in downstream training: RL mid-training improves stability during subsequent RL from verifiable rewards (RLVR), leading to significant improvements in final performance.
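A minimal sketch of the FIG construction described above (illustrative only; the span-selection policy and the "[GAP]" marker are assumptions, not the authors' exact format):

```python
import random

def make_fig_question(text, gap_frac=0.2, seed=None):
    """Turn unstructured text into a 'fill in the gaps' (FIG) question.

    Removes one contiguous span of words; the removed span becomes the
    ground truth that a judge model later compares the answer against.
    """
    rng = random.Random(seed)
    words = text.split()
    gap_len = max(1, int(len(words) * gap_frac))
    start = rng.randrange(len(words) - gap_len + 1)
    gap = " ".join(words[start:start + gap_len])
    prompt = " ".join(words[:start] + ["[GAP]"] + words[start + gap_len:])
    return prompt, gap
```

Because any corpus can be mined this way, question generation itself is essentially free; the judge only has to grade a guess against a known span, a much easier task than writing curated questions.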


Annex: SME Focus in Partnership with NVIDIA

Day 1

Welcome Annex Day 1 (Bristol Centre for Supercomputing)

11:45 — Welcome, Bristol Centre for Supercomputing introduction and aims for the day

Speaker to be confirmed.


Jessica Driscoll (NVIDIA)

12:00 — Accelerating AI Innovation: The NVIDIA Inception Program and the Supercomputing Ecosystem

As the United Kingdom continues to strengthen its position in computational science through the Isambard AI initiative, the convergence of high-performance computing (HPC) and commercial artificial intelligence (AI) development emerges as a critical driver of industrial innovation. This presentation introduces NVIDIA Inception, a global programme dedicated to supporting startups operating at the forefront of AI and deep technology advancement.

The session examines how Inception facilitates the translation of research into commercial applications by equipping members with essential resources for technological and business scaling. By aligning the computational capabilities of the Isambard infrastructure with the technical expertise and market enablement provided by Inception, UK-based startups are uniquely positioned to accelerate progression from proof-of-concept to international deployment. Participants will gain insights into leveraging this integrated ecosystem to optimise AI workloads and catalyse the next wave of breakthroughs in scientific discovery and industrial transformation.


Jamil Appa (Zenotech)

12:20 — Title TBC

Abstract to follow.


Wasil Rezk (BeyondMath)

14:00 — Title TBC

Abstract to follow.


Karin Sevegnani (NVIDIA)

14:25 — From Infrastructure to Impact: Sovereign AI Development on Isambard AI

Strategic investments in AI infrastructure are transforming how nations build sovereign AI capabilities. This talk explores how Isambard AI enables researchers to develop AI systems that reflect local languages, cultures, and regulations using NVIDIA Nemotron models and frameworks.

We will examine key success stories demonstrating infrastructure impact. For example, the UK-LLM project trained the first sovereign Welsh language model, achieving 87% accuracy on Welsh benchmarks while maintaining English performance. Using NVIDIA NeMo and Nemotron Nano 2, the project demonstrated how continuous pre-training enables rapid language adaptation for low-resource languages.

Throughout, we will detail the technical stack: NVIDIA NeMo for training, NeMo Curator for data processing, and production-ready frameworks that transform infrastructure into deployed AI solutions for real-world public services including healthcare, education, and legal resources.


Day 2

Tim Santos (Graphcore)

10:45 — What changes when you go from 1 node to 'lots'

Abstract to follow.


Ian Johnson (HPE)

11:10 — Securing Containerised AI Kubernetes Workloads on Isambard AI with Slingshot

Research supercomputing facilities require secure federation, multi-tenancy, containerisation, and support for AI workloads. Adopting a cloud-native approach to HPC-AI systems means maintaining HPC performance, especially effective use of high-speed interconnects, whilst allowing the deployment practices of cloud workloads. This presentation demonstrates recent work on Isambard AI extending the HPE Slingshot software to support secure, containerised, multi-tenant RDMA access for Kubernetes-based Trusted Research Environments. We illustrate how Kubernetes workloads can achieve network isolation via VNI-based Slingshot capabilities while maintaining the low overhead and performance expected of modern AI-focused supercomputing platforms.


Duncan Roweth (HPE)

11:35 — AI Inference with NVIDIA Dynamo on HPE's Slingshot Network-based Systems

The convergence of HPC and AI creates unprecedented opportunities for infrastructure providers who can deliver portable, high-performance communication middleware. NVIDIA Dynamo is rapidly gaining traction as the next-generation AI inference engine, integrated into vLLM, LMCache, and major serving platforms, and fundamentally requires efficient GPU-to-GPU, GPU-to-storage, and GPU-to-KV-cache communication primitives. NVIDIA NIXL (NVIDIA Inference Xfer Library) has emerged as a standard layer for AI communication workloads. This talk presents the first integration of HPE Slingshot networking with NIXL through a libfabric backend, enabling NVIDIA Dynamo and vLLM to seamlessly leverage HPE's differentiated networking capabilities.


Pili Mayora and Dan Lenton (AI Security Institute)

13:00 — AISI Research Platform: Isambard technical workflows

AISI's Core Technology team is responsible for the AISI Research Platform, our AWS-based internal development platform for AI safety researchers. This talk will cover the technical integrations and workflows built into the platform for Isambard, including automated setup (Ansible) and tunnelling from the platform (VS Code, HTTP, Slurm jobs).


Sadaf Alam (Bristol Centre for Supercomputing)

13:25 — AIRR Status and AI Data Facility Update

Contributors: Simon McIntosh-Smith (Bristol Centre for Supercomputing), Ritchie Somerville (EPCC), Paul Calleja (University of Cambridge)

This session provides an update on the AIRR compute facilities at Bristol and Cambridge and the upcoming AI Data Facilities at Bristol and EPCC. The sites will provide brief updates followed by Q&A from the audience.


David Africa (AI Security Institute)

13:50 — Consistency Training

A large class of methods known as consistency training encourages AI models to give similar answers across different prompts or sampling strategies in order to improve capabilities and reliability. However, we think these methods also affect how aligned or misaligned the models are: if a model that is already misaligned is made more consistent with itself, does that make it even more misaligned? We tested seven consistency methods across four types of AI misbehaviour and found striking differences: these techniques reliably reduce reward hacking (80% of runs) and emergent misalignment (76%), but actually make sycophancy worse (only 12–42% of runs show improvement). The key insight is that some misaligned behaviours, like reward hacking, are fragile and break down when you ask the model in different ways, while others are coherent strategies that stay consistent across perturbations, so consistency training reinforces them. This means practitioners cannot assume consistency methods are safe capability improvements; they need to evaluate alignment effects for their specific use case, since the same technique that fixes one problem may amplify another.
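One common form of consistency training minimises the divergence between a model's output distributions on paraphrased prompts. The sketch below shows that objective in numpy; it is a generic formulation assumed for illustration, not necessarily any of the seven methods tested.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q):
    # KL divergence of each row of p from q.
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1)

def consistency_loss(logits_per_variant):
    """Mean KL of each prompt variant's distribution from the ensemble mean.

    logits_per_variant: (n_variants, vocab) model outputs for paraphrases
    of the same underlying question. Zero iff all variants agree exactly.
    """
    probs = softmax(np.asarray(logits_per_variant))
    mean = probs.mean(axis=0)
    return float(np.mean(kl(probs, mean)))
```

Driving this loss to zero forces agreement across perturbations, which is exactly why it suppresses fragile misbehaviours (whose answers vary with phrasing) but can entrench coherent ones.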