When code is cheap, performance is expensive.
Supercomputing for Artificial Intelligence is a practical, systems-oriented guide to what really happens when AI code runs at scale: the infrastructures, tools, and execution trade-offs involved in scaling deep learning systems, from neural networks to large language models (LLMs). Learn how to train deep learning models and LLMs on GPUs, clusters, and supercomputers, and how to reason about performance, scalability, and cost.
Designed for graduate students, AI researchers, data scientists, and engineers, this book bridges the gap between high-performance computing (HPC) and real-world AI applications. In an era where AI tools can generate entire training pipelines in minutes, the real engineering challenge has shifted from writing code to understanding performance, scalability limits, and cost. Whether you work in academia or industry, or are exploring advanced AI on your own, you'll find clear explanations and hands-on examples built around tools like PyTorch, CUDA, MPI, SLURM, and multi-GPU distributed training.
With over 800 pages of rigorously tested content, this book builds your ability to reason about large-scale AI training through:
The foundations of supercomputing and its role in AI workloads
Practical GPU programming with CUDA and distributed systems
Parallel programming with MPI on modern clusters
Efficient training of neural networks, CNNs, and Transformers
Performance optimization for deep learning at scale
Distributed training with PyTorch DistributedDataParallel (DDP)
Building and scaling LLMs using real biomedical and NLP datasets
Jupyter, Google Colab, and Hugging Face workflows
Deployment and inference strategies for modern LLMs
All source code, configuration files, and job scripts are available in a public GitHub repository. The material is field-tested through years of teaching and research at the Barcelona Supercomputing Center, and can be applied on local GPU setups, cloud platforms, and HPC clusters.
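To give a flavor of the kind of job script found in the repository, here is a minimal SLURM batch file for launching a multi-GPU PyTorch run on a single node. This is an illustrative sketch, not a script from the book: partition defaults, module names, GPU counts, and `train.py` are placeholders that vary by cluster.

```shell
#!/bin/bash
#SBATCH --job-name=ddp-train        # job name shown in the queue
#SBATCH --nodes=1                   # single node
#SBATCH --ntasks-per-node=1         # one launcher task; torchrun spawns the workers
#SBATCH --gres=gpu:4                # request 4 GPUs (adjust to your system)
#SBATCH --cpus-per-task=16          # CPU cores available for data loading
#SBATCH --time=02:00:00             # wall-clock limit

# Load the site's software stack (module names differ between clusters)
module load cuda pytorch

# torchrun starts one process per GPU; train.py stands in for your training script
torchrun --standalone --nproc_per_node=4 train.py
```

Submitted with `sbatch`, a script like this requests resources from the scheduler and launches one worker process per GPU, which is the usual pattern for DistributedDataParallel training on a single HPC node.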
This book is ideal for:
Instructors looking for practical material for AI and HPC courses
Students and professionals wanting to learn how to run AI at scale
Engineers transitioning from standard AI workflows to distributed environments who need system-level judgment
Researchers working on LLMs and interested in reproducible pipelines
Do I need a supercomputer to use this book? Not at all. While some examples run on large systems like MareNostrum, the code is designed to scale from a single GPU to a full HPC node. You'll find guidance for running experiments in Google Colab and containerized environments. The emphasis throughout is not on maximum scale, but on understanding when scaling makes sense, and when it does not.
Whether you're teaching AI, training models at scale, or simply curious about the invisible infrastructure powering today's most powerful AI systems, this book is your companion to understanding and leveraging supercomputing for artificial intelligence.
This is not a book about writing code faster. It is about understanding what happens when that code runs: on GPUs, across nodes, under real resource constraints.
What's inside:
800+ pages of real-world content tested in supercomputing classrooms
Hands-on examples with PyTorch, CUDA, MPI, and SLURM
Full GitHub access with ready-to-run scripts and datasets
Workflows adapted for Google Colab and HPC clusters