
How vCluster Is Redefining Multi-Tenancy and GPU Efficiency for the AI Era: Insights from Simone Morellato


Guest: Simone Morellato (LinkedIn)
Company: vCluster Labs
Show Name: An Eye on AI
Topics: Kubernetes

AI has changed the shape of infrastructure faster than most teams expected. Kubernetes, once the universal abstraction for cloud workloads, is now straining under the rise of GPU-heavy, high-density, multi-tenant environments. In this conversation, Simone Morellato, Customer Success Lead at vCluster, explains why the industry is being forced to rethink the very architecture of Kubernetes — and why virtual clusters have become essential for the next phase of AI operations.

For more than a decade, Morellato has watched Kubernetes evolve from a simple container orchestration platform into the default foundation for enterprise infrastructure. Having been part of VMware’s Tanzu leadership during its formative years, he has seen the recurring themes, debates, and architectural challenges that shaped the ecosystem. But speaking at KubeCon in Atlanta, Morellato notes that something about this moment is fundamentally different. AI is not just a theme layered on top of Kubernetes — it is forcing a deep re-architecture.

He points to a striking contrast at the event: the Kubernetes community is talking about efficiency, sustainability, and shared infrastructure, while AI companies are racing to build enormous, resource-intensive data centers. “A lot of people are talking about being efficient and being green,” he says, “and that is the complete opposite of what I see the AI companies doing.” The tension between these two worlds is what makes the work at vCluster so timely.

The challenge begins with the simple reality that Kubernetes was never designed for GPUs. CPU scheduling and GPU scheduling behave very differently. GPU workloads lock resources for their full duration, and the hardware’s enormous, fixed memory footprint means even idle workloads can block others from running. Morellato explains that the community’s early abstractions — like namespaces — were never intended to provide the isolation, tenancy, or flexibility now required for AI workloads.
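The contrast can be sketched in a few lines of Python. In this toy admission model (names and numbers are illustrative, not Kubernetes internals), CPU is divisible into millicores while a GPU is handed out as a whole device that stays bound to its job until completion — which is why one idle training job can block everyone else:

```python
from dataclasses import dataclass, field

@dataclass
class ToyNode:
    """Illustrative node: CPU is finely divisible, GPUs are whole devices."""
    cpu_millicores: int
    gpus: int
    cpu_used: int = 0
    gpu_holders: list = field(default_factory=list)

    def admit(self, job: str, cpu_m: int = 0, gpus: int = 0) -> bool:
        free_gpus = self.gpus - len(self.gpu_holders)
        if cpu_m > self.cpu_millicores - self.cpu_used or gpus > free_gpus:
            return False
        self.cpu_used += cpu_m
        # A GPU stays bound to the job until it completes, even while idle.
        self.gpu_holders += [job] * gpus
        return True

node = ToyNode(cpu_millicores=8000, gpus=1)
print(node.admit("train", cpu_m=500, gpus=1))   # True: takes the only GPU
print(node.admit("infer", cpu_m=500, gpus=1))   # False: blocked until "train" ends
print(node.admit("web",   cpu_m=500))           # True: CPU slices still available
```

Real Kubernetes behaves the same way at the resource-model level: `nvidia.com/gpu` requests are whole integers and cannot be overcommitted the way CPU can.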

At VMware, his team saw these cracks early. To overcome the limitations of namespaces, they helped introduce Cluster API and TKG (Tanzu Kubernetes Grid) to give each team its own cluster. But that approach created a new problem: cost and inefficiency from replicating full clusters everywhere. “You need to create multiple clusters,” he recalls, “but creating clusters became very expensive.” vCluster builds directly on this experience, delivering the same autonomy without replicating the entire cluster. Instead, each tenant's control plane runs inside a pod while still sharing the underlying resources. It’s lighter, faster, and dramatically more efficient.

AI accelerated everything. With organizations adopting NVIDIA DGX systems faster than Kubernetes could catch up, the industry confronted a new bottleneck: extremely powerful hardware used at extremely low utilization. Morellato describes DGX as “a Ferrari with only one seat.” A GPU workload attaches to the hardware and monopolizes the entire memory footprint until it completes. Without an intelligent multi-tenant architecture, the system sits idle most of the time.

This is where vCluster’s innovations with NVIDIA come in. Their joint work focused on reshaping DGX systems into highly shareable, flexible environments that behave more like the cloud. Internal engineering teams wanted a “click-a-button” experience for GPU clusters, similar to what developers are used to in managed cloud environments. But DGX systems ship in a rigid, fixed configuration. Only a few dozen are produced each year, and customers need to maximize their use across many internal teams.

vCluster enables this by decoupling the control plane, introducing advanced multi-tenancy models, and layering in isolation across compute, networking, and storage. The result is a system where teams can safely share GPUs, dynamically move workloads, and push utilization from around 20% to 80%. For companies dealing with expensive hardware and limited supply, the impact is enormous.
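The economics are easy to check: raising utilization from 20% to 80% cuts the cost of each useful GPU-hour by a factor of four. A quick sketch (the $40/hour rate is a made-up illustrative figure, not a quoted DGX price):

```python
def cost_per_useful_gpu_hour(hourly_rate: float, utilization: float) -> float:
    """Amortize an always-on GPU's hourly rate over the fraction doing real work."""
    return hourly_rate / utilization

RATE = 40.0  # hypothetical $/GPU-hour; substitute your own number
before = cost_per_useful_gpu_hour(RATE, 0.20)   # $200 per useful hour
after  = cost_per_useful_gpu_hour(RATE, 0.80)   # $50 per useful hour
print(f"${before:.0f} -> ${after:.0f} per useful GPU-hour ({before / after:.0f}x cheaper)")
```

The ratio is independent of the rate you plug in, which is why the utilization number, not the sticker price, dominates the business case.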

Morellato walks through the spectrum of new capabilities added over the summer — many of them driven directly by AI demands. Private nodes and auto nodes give teams the ability to provision compute without depending on central IT. Custom VM provisioning allows DGX users to break free from pod count limits, multiplying density by creating multiple virtual machines within a single server. Network isolation now extends down to Layer 2 and Layer 3, ensuring workloads never interfere with each other. Storage and data controls enforce strict boundaries for sovereignty requirements, particularly critical for European and regulated markets.
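The density point about VM provisioning follows from a real Kubernetes limit: the kubelet defaults to a maximum of 110 pods per node. Because each VM registers with the cluster as its own node, carving one physical server into several VMs multiplies that budget (the four-VM split below is an illustrative choice, not a vCluster default):

```python
MAX_PODS_PER_NODE = 110  # the kubelet's default --max-pods value

def pod_capacity(servers: int, vms_per_server: int = 1) -> int:
    """Each VM joins the cluster as its own node with its own pod budget."""
    return servers * vms_per_server * MAX_PODS_PER_NODE

print(pod_capacity(1))                     # 110: one bare-metal node
print(pod_capacity(1, vms_per_server=4))   # 440: same hardware, 4x the pod slots
```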

Security also required rethinking. With container escape vulnerabilities top of mind, vCluster introduced vNode — a protective layer that makes a pod appear as if it is its own node. If an attacker escapes the container, they remain trapped within the virtual boundary, unable to access the real underlying node. It’s a powerful addition for GPU workloads and for multi-tenant clusters where isolation is mandatory.

Morellato notes that many of these innovations were a “natural evolution” of vCluster’s original design. Developers first adopted virtual clusters simply to get around the limits imposed by centralized platform teams. They could create ten clusters instead of one, without provisioning delays. Those same mechanics now underpin enterprise-grade isolation, security, and AI-ready cluster sharing.

Looking ahead, Morellato believes the industry’s next phase will be defined by efficiency rather than scale. While AI companies may feel no pressure to conserve resources today, that will change as costs rise, regulatory demands spread, and multicloud AI architectures mature. When organizations begin optimizing their GPU data centers the same way they optimized cloud spend, virtual clusters will become indispensable.

This evolution is already underway in regions where data sovereignty is non-negotiable. AI workloads bring new complexity to compliance frameworks like GDPR, the AI Act, and country-specific regulations. vCluster’s ability to enforce strict, configurable isolation across all layers positions it well for these requirements. Customers can tune their tenancy model from lightweight sharing to full single-tenant isolation, depending on workload sensitivity and compliance needs.

Morellato’s optimism comes from what he has seen inside the community. “The team is delivering production-grade enterprise software even as a small startup,” he says, noting that vCluster has already become a critical tool inside organizations he once worked for. At VMware, his own team adopted vCluster because it solved problems traditional clusters could not. Years later, he sees that same grassroots adoption happening across the industry.

AI may be forcing Kubernetes to change, but as this conversation shows, the groundwork for the next generation of multi-tenant, GPU-aware infrastructure has already been laid — and vCluster is shaping much of that path.
