Building AI infrastructure isn’t just about racking up GPUs; it’s about integrating every layer of the stack to deliver performance, scalability, and reliability. In this clip, Mirantis CTO Shaun O’Meara and VP Randy Bias unpack how the company’s AI Factory Reference Architecture is designed to manage that complexity.
Supercomputing Roots, Enterprise Realities
Bias explains that GPU-based AI infrastructure resembles high-performance computing (HPC) more than traditional cloud. “You’re aggregating GPUs and memory into what looks like a single system,” he says. That means low-latency, non-blocking networks and scheduler-aware orchestration become essential.
Mirantis applies this model to deliver tightly coupled environments that optimize east-west traffic, locality, and throughput, much as supercomputing workloads require.
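To make "locality" concrete, here is a minimal Python sketch of locality-aware placement: it tries to fit a job under a single leaf switch before spilling across switches. It is an illustration only, not how Mirantis, k0rdent, or Slurm implement scheduling. The function names (`place_job`, `_take`), the `(switch, node)` inventory format, and the switch and node labels are all hypothetical.

```python
from collections import defaultdict


def _take(slots, gpus_needed):
    """Greedily take GPUs from the given slots, fullest nodes first."""
    placement = []
    remaining = gpus_needed
    for switch, node, count in sorted(slots, key=lambda s: -s[2]):
        if remaining == 0:
            break
        taken = min(count, remaining)
        placement.append((switch, node, taken))
        remaining -= taken
    # None signals that the job does not fit in the offered slots.
    return placement if remaining == 0 else None


def place_job(free_gpus, gpus_needed):
    """Place a job, preferring a single leaf switch (hypothetical model).

    free_gpus maps (switch, node) -> number of free GPUs on that node.
    Returns a list of (switch, node, gpus_taken), or None if it cannot fit.
    """
    by_switch = defaultdict(list)
    for (switch, node), count in free_gpus.items():
        if count > 0:
            by_switch[switch].append((switch, node, count))

    # Switches whose free GPUs can hold the whole job, keyed by free capacity.
    fitting = [
        (sum(c for _, _, c in slots), switch)
        for switch, slots in by_switch.items()
        if sum(c for _, _, c in slots) >= gpus_needed
    ]
    if fitting:
        # Best fit: the smallest switch that still holds the job,
        # which leaves larger contiguous pools free for bigger jobs.
        _, best = min(fitting)
        return _take(by_switch[best], gpus_needed)

    # Fallback: span switches, still taking the fullest nodes first.
    all_slots = [s for slots in by_switch.values() for s in slots]
    return _take(all_slots, gpus_needed)


free = {
    ("leaf-a", "n1"): 4,
    ("leaf-a", "n2"): 2,
    ("leaf-b", "n3"): 8,
    ("leaf-b", "n4"): 8,
}
print(place_job(free, 6))   # stays under leaf-a
print(place_job(free, 20))  # has to span both switches
```

A production scheduler would also weigh NVLink domains, rail alignment, and job priority; the sketch only shows why the scheduler needs to know the network topology at all.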
From Complexity to Composability
O’Meara expands on how the architecture addresses day-one operational complexity. “Most teams get racked and stacked systems that can take months to be usable,” he says. Mirantis’ design addresses this with a pre-integrated, code-driven approach built on k0rdent.
The architecture doesn’t just focus on GPUs—it layers in identity, DNS, and multi-tenancy from the start, allowing cloud providers and enterprises to deploy production-ready GPU clusters rapidly.
Why It Matters
With support for GPU grouping and slicing, InfiniBand and NVLink interconnects, and schedulers such as Slurm, Mirantis is making it possible to stand up AI environments quickly without compromising performance or governance.





