Guest: Saiyam Pathak
Company: vCluster Labs
Show Name: KubeStruck
Topics: Kubernetes, Cloud Native
Kubernetes and GPU integration doesn’t just work out of the box. That’s the hard truth facing platform teams trying to build private AI infrastructure on NVIDIA DGX machines. Saiyam Pathak, Head of Developer Relations at vCluster, has watched his company evolve from a virtual cluster platform into a complete solution covering the entire tenancy spectrum, from soft to hard multi-tenancy.
When Saiyam joined vCluster in 2024, the platform was focused on polishing existing features. This year marked a dramatic shift toward innovation: the team launched a new tenancy model, private nodes, auto nodes, and a Netris integration. The most significant development was standalone vCluster mode, a concept that fundamentally changes how virtual clusters operate.
Traditional vCluster deployments required a host cluster with virtual clusters running on top. Standalone vCluster flips this model. The standalone instance becomes the host cluster itself, with virtual clusters created on top of it. This architecture is purpose-built to power NVIDIA DGX machines and private AI infrastructure.
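To make that inversion concrete, here is a sketch of what a standalone deployment could look like in vCluster’s `vcluster.yaml` configuration format. The key names below are illustrative placeholders rather than the shipped schema, so treat this as a picture of the topology, not copy-pasteable config; the vCluster docs define the real syntax.

```yaml
# vcluster.yaml -- an illustrative sketch of the standalone topology.
# Key names here are placeholders, not vCluster's published schema;
# the point is the shape of the deployment, not the exact syntax.
#
# Traditional: a host cluster exists first, and each virtual cluster
#   is deployed into a namespace on top of it.
# Standalone:  this instance boots directly on the machine (for
#   example a DGX box), becomes the host cluster itself, and
#   virtual clusters are then created on top of it.
standalone:
  enabled: true        # no pre-existing host cluster required
controlPlane:
  backingStore:
    etcd:
      embedded:
        enabled: true  # self-contained control-plane state on the node
```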
The GPU integration challenge runs deeper than most teams realize. Saiyam has immersed himself in GPU architectures while co-authoring a book on AI-ready platforms with Daniel from LearnKube. He’s studied the technical differences between L40S, A100, and H100 GPUs, learning which support multi-instance GPU (MIG) partitioning and which don’t. This knowledge directly informed vCluster’s product development.
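To see what MIG looks like from a tenant’s side: when NVIDIA’s Kubernetes device plugin runs in its mixed MIG strategy, each slice carved out of an A100 or H100 appears as its own schedulable extended resource. The profile names below match NVIDIA’s documented A100 40GB profiles; the exact names available depend on the GPU model and how an operator has partitioned it.

```yaml
# Two tenant pods sharing one physical A100 through MIG.
# Resource names follow the NVIDIA device plugin's "mixed" MIG
# strategy; the available profiles depend on how the operator
# partitioned the GPU (these match A100 40GB profiles).
apiVersion: v1
kind: Pod
metadata:
  name: small-inference
spec:
  containers:
    - name: app
      image: nvcr.io/nvidia/pytorch:24.01-py3
      resources:
        limits:
          nvidia.com/mig-1g.5gb: 1   # one 1-compute-unit, 5 GB slice
---
apiVersion: v1
kind: Pod
metadata:
  name: larger-training
spec:
  containers:
    - name: app
      image: nvcr.io/nvidia/pytorch:24.01-py3
      resources:
        limits:
          nvidia.com/mig-3g.20gb: 1  # a larger, still isolated slice
```

An L40S, by contrast, exposes no such resources because the silicon lacks MIG support; sharing one means time-slicing instead, with weaker isolation between tenants.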
Multi-instance GPU support requires significant engineering work. Kubernetes doesn’t handle the complexities of GPU sharing, isolation, and resource management on its own. vCluster addressed this by building features across the entire tenancy spectrum. Early 2025 saw the introduction of the tenancy spectrum concept, spanning soft multi-tenancy to hard multi-tenancy, and throughout the year vCluster systematically covered each point on that spectrum with targeted features.
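To anchor the soft end of that spectrum: the lightest form of Kubernetes multi-tenancy is namespace isolation plus resource quotas, as in the sketch below. Virtual clusters sit further along (each tenant gets its own API server), and private nodes push toward the hard end, where tenants stop sharing machines entirely. The namespace and quota values here are invented for illustration.

```yaml
# Soft multi-tenancy: tenants share one cluster and are separated
# only by namespaces and quotas. Quotas on extended resources such
# as GPUs use the requests.<resource-name> form.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-gpus
  namespace: team-a          # hypothetical tenant namespace
spec:
  hard:
    requests.nvidia.com/gpu: "4"   # at most four GPUs for this tenant
```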
Private nodes were a critical missing piece. Before this year, teams couldn’t deploy them with vCluster; now they can, enabling more secure and isolated workloads. Auto nodes followed, providing the dynamic scaling essential for AI workloads with variable resource demands.
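Auto nodes targets exactly that elasticity. As a sketch of what such a policy looks like, here is a Karpenter-style NodePool that provisions GPU nodes only when pending pods demand them and consolidates them away once idle. Karpenter is a common open-source engine for this pattern; whether vCluster’s auto nodes builds on it, and what its own schema looks like, are assumptions here rather than facts from the episode.

```yaml
# A Karpenter-style NodePool for scale-to-zero GPU capacity.
# Read this as the general shape of an auto-scaling node policy,
# not as vCluster's own auto nodes schema.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-on-demand
spec:
  template:
    spec:
      nodeClassRef:                 # cloud-specific node template (AWS here)
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["p4d.24xlarge"]  # an A100 instance type, as an example
      taints:
        - key: nvidia.com/gpu
          value: "true"
          effect: NoSchedule        # only GPU workloads schedule here
  limits:
    nvidia.com/gpu: 16              # hard cap on provisioned GPUs
  disruption:
    consolidationPolicy: WhenEmpty  # scale back down once pods finish
    consolidateAfter: 5m
```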
The Netris integration adds network automation, a layer of functionality that extends beyond basic cluster management. Combined with standalone mode, these features create a comprehensive platform for enterprise AI infrastructure.
For platform engineering teams evaluating solutions for GPU workloads, vCluster’s evolution represents a maturation of the virtual cluster concept. The platform moved beyond basic isolation to address the specific challenges of AI infrastructure: GPU architecture variations, multi-instance support, dynamic scaling, and flexible tenancy models.
The message is clear: running AI workloads on Kubernetes requires purpose-built solutions. Generic cluster management falls short when dealing with DGX machines, expensive GPU resources, and the complex requirements of enterprise AI platforms. vCluster built those solutions by understanding the technology stack from the GPU firmware up through the Kubernetes API.