Bridging the AI Infrastructure Gap: How Mirantis and Gcore Are Democratizing Enterprise AI Deployment

The artificial intelligence revolution has reached a critical inflection point. While organizations worldwide rush to implement AI solutions, a significant infrastructure gap threatens to derail their ambitious plans. The challenge isn’t just about training models—it’s about deploying them at scale, seamlessly, and in production environments that can handle real-world demands.

In a recent episode of “An Eye On AI”, this infrastructure challenge took center stage as host Swapnil Bhartiya explored the evolving landscape with two industry veterans who have witnessed multiple waves of technological transformation. Alex Freedland, Co-Founder and CEO of Mirantis, and Seva Vayner, Product Director of Edge Cloud and AI at Gcore, shared insights from their strategic partnership that promises to simplify AI infrastructure deployment at enterprise scale.


The Evolution from OpenStack to AI: Lessons from the Past

Freedland emphasized that AI deployment faces the same scalability challenges that OpenStack addressed for cloud computing: “AI has to be delivered at scale and seamlessly, right? That’s the only way AI will be consumed.” The lesson from OpenStack’s evolution is clear—infrastructure complexity must be abstracted away from end users to enable widespread adoption.

The Inference Challenge: From Training to Production

While much of the AI industry's focus remains on training models, the real challenge lies in inference deployment. Gcore, a global AI infrastructure provider, has helped enterprises navigate AI adoption; with demand for inference now rising, its Everywhere Inference platform helps businesses use their resources efficiently when deploying inference workloads, improving time-to-market and ROI on AI projects.

Vayner highlighted a critical gap in the market: “ML engineers are not infrastructure or operations engineers, and they typically don’t focus on Kubernetes, creating Helm charts, or developing custom resources.” This disconnect between AI expertise and infrastructure knowledge creates friction that hinders the rapid deployment and scaling of AI applications.

The solution, according to both executives, lies in creating serverless platforms that abstract away the complexity of Kubernetes management while maintaining enterprise-grade performance and security.
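
To make that abstraction concrete, here is a minimal sketch of what such a serverless deployment call might look like from an ML engineer's side. The endpoint, request fields, and token are invented for illustration and do not describe Gcore's or Mirantis's actual APIs:

import requests

API_URL = "https://api.example-inference.cloud/v1/deployments"  # placeholder, not a real endpoint
HEADERS = {"Authorization": "Bearer <your-token>"}               # placeholder auth

def deploy_model(name, model_uri, gpu_type="l40s", max_replicas=2):
    # Describe the model to run; the platform owns the Helm charts,
    # GPU scheduling, and autoscaling that this request implies.
    spec = {
        "name": name,
        "model": model_uri,  # e.g. a model registry or Hugging Face path
        "gpu_type": gpu_type,
        "autoscaling": {"min_replicas": 1, "max_replicas": max_replicas},
    }
    resp = requests.post(API_URL, json=spec, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()["endpoint"]  # a ready-to-call HTTPS inference URL

# endpoint = deploy_model("summarizer", "mistralai/Mistral-7B-Instruct-v0.3")

The point is the shape of the interface: the user describes a model and its scaling bounds, while the platform handles the Kubernetes machinery underneath.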

The NVIDIA Ecosystem and the Gold Rush Mentality

The discussion revealed interesting insights into NVIDIA’s role in the current AI infrastructure landscape. Freedland noted that NVIDIA recognizes they “can’t teach cloud people how to be cloud people,” which is why they rely on ecosystem partners like Mirantis and Gcore to deliver complete solutions to their customers.

This creates what Freedland called a “gold rush” scenario, where the focus shifts to providing “picks and shovels”—the fundamental infrastructure tools that enable AI deployment rather than just the AI capabilities themselves. The partnership between Mirantis and Gcore represents this approach, combining cloud-native AI infrastructure expertise with global edge deployment capabilities.

The Intelligence Delivery Network: A Global Approach

One of the most significant developments discussed was Gcore’s partnership with Northern Data Group, creating what they call the “Intelligence Delivery Network.” This collaboration delivers edge AI solutions for enterprise clients and model developers, targeting the growing AI inferencing market opportunity.

This partnership brings together Northern Data’s 35,000 GPUs across Europe with Gcore’s global network capabilities, creating a distributed AI infrastructure that can serve models at scale across multiple continents. The approach addresses one of AI’s most challenging requirements: low-latency inference deployment that can serve users regardless of geographic location.
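
A toy sketch illustrates the routing idea behind such a network: each request goes to the nearest point of presence. The region names and coordinates below are hypothetical, and a production system would route on measured latency and load rather than raw distance:

from math import radians, sin, cos, asin, sqrt

REGIONS = {  # region -> (lat, lon) of a hypothetical GPU point of presence
    "eu-frankfurt": (50.11, 8.68),
    "eu-amsterdam": (52.37, 4.90),
    "us-ashburn":   (39.04, -77.49),
    "ap-singapore": (1.35, 103.82),
}

def haversine_km(a, b):
    # Great-circle distance between two (lat, lon) pairs, in kilometres.
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

def nearest_region(user_loc):
    # Route an inference request to the geographically closest region.
    return min(REGIONS, key=lambda r: haversine_km(user_loc, REGIONS[r]))

# nearest_region((48.85, 2.35))  # a user in Paris -> 'eu-amsterdam'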

Democratizing GPU Access and Training

Beyond inference, the partnership aims to democratize access to high-end GPU infrastructure for training and fine-tuning models. Vayner explained their vision as providing “super simplified access to model training on high-end GPUs with virtually unlimited interconnectivity, where customers can deploy their clusters in just a few clicks.”

This approach echoes the evolution of Kubernetes platforms where complex orchestration was simplified through managed services and automated workflows. The goal is to make enterprise-grade AI infrastructure as accessible as public cloud services while maintaining the control and security required for sensitive workloads.

The Path Forward: Standards and Sovereignty

Looking ahead, both executives emphasized the importance of creating industry standards for AI infrastructure deployment. Freedland predicted that “a standard stack could be delivered everywhere and run everywhere” within the next year or two, with support from the Linux Foundation.

This standardization effort addresses the growing demand for sovereign AI infrastructure—systems that organizations and nations can control completely while still benefiting from global scale and efficiency. The approach mirrors successful infrastructure standardization efforts in previous technology waves, from virtualization to containerization.

Implications for Enterprise Strategy

For enterprise technology leaders, the discussion highlights several critical considerations for AI infrastructure planning. The gap between AI development expertise and infrastructure operations requires either significant internal capability building or partnerships with specialized providers who can bridge this divide.

The conversation also underscores the importance of thinking beyond initial AI pilots to production-scale deployment scenarios. Organizations that successfully navigate this transition will likely be those that plan for inference scalability from the beginning rather than treating it as an afterthought.

As AI continues its rapid evolution from experimental technology to business-critical infrastructure, partnerships like the one between Mirantis and Gcore represent a pragmatic approach to managing complexity while maintaining enterprise requirements for security, performance, and control. The future of AI infrastructure may well depend on such collaborative efforts that combine deep technical expertise with global scale and reach.

For more insights on enterprise AI infrastructure and cloud-native technologies, visit TFiR for the latest analysis and expert interviews.


Edited Transcript

Swapnil Bhartiya (00:00): This actually reminds me of the OpenStack days, when T-Mobile in Europe and companies in China were building their own private clouds. And now, especially with the changing political landscape, there are a lot of countries that would like to build their own infrastructure. Everybody wants to leverage AI, but there are a lot of challenges to running it at that scale. So can you also talk about that? While this Netherlands case might be niche, you might see much wider adoption of this approach, maybe even within the US; a lot of organizations do want that capability. And drawing on your OpenStack experience from that era, then Kubernetes, and now AI, it's once again about looking at where the world is heading, not where it is. What do you have to say about that?

Alex Freedland (00:54): I mean, you’re exactly right. AI has to be delivered at scale and seamlessly, right? That’s the only way AI will be consumed. And today, most people just do training, right? They haven’t really figured this out yet. And what we’re seeing is people are trying to run AI as a SaaS app on ChatGPT, right? And they create their own region. And those are early days of how you start. But what’s going to happen when you really kind of lean in and real agentic applications start to hit, and they will have to be connected to the data that lives on-premises, or it lives in other special circumstances or is generated at the edge? Scaling this will become a major, major challenge, right?

Two weeks ago, we spent a week in San Jose at NVIDIA GTC, right? And NVIDIA, of course, is pushing that inferencing is a big deal now, right? Because unless there is adoption of AI, all those infrastructure investments are not going to pan out. And so what NVIDIA is doing, with their software and everything, is giving you endpoints to which you can connect to build and enable the applications, whether it's NVIDIA Cloud Functions or NIMs, as they call them: inference microservices. This is what NVIDIA gives you, and then underneath they give you hardware. What's in the middle is actually cloud-native stuff, and it has to be serverless, and it has to be available.
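
[Editor's note: NIM microservices expose OpenAI-compatible HTTP endpoints, which is what makes them straightforward to consume from applications. A minimal sketch of calling one, with a placeholder base URL and an example model name; the details of any given deployment will differ:]

from openai import OpenAI

# Point the standard OpenAI client at a locally hosted NIM container.
# The base URL and model identifier below are illustrative placeholders.
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed-for-a-local-nim",
)

completion = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",  # example NIM model name
    messages=[{"role": "user", "content": "Summarize this quarter's results."}],
)
print(completion.choices[0].message.content)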

And what NVIDIA is saying is we’re not going to be able to teach cloud people how to be cloud people, so they rely on the ecosystem of people like us to deliver that to their customers. And we’ve had joint conversations with the most massive providers, the likes of the ones you mentioned—I can’t mention names—who are ready to adopt. And then the conversation goes, “Oh, but here we need to have your Kubernetes layer.” And the AI people there who are ready to spend billions of dollars, they’re like, “We don’t know what Kubernetes is,” right?

So what’s going to happen is, there is this gold rush happening, but you have to be able to sell picks and shovels, right? And make sure they actually work and are available everywhere. And this is for us cloud people to be relevant, right? So the best way to bring this to customers is to find things that are already sold and bring them in unusual combinations. So this partnership—we explained this to NVIDIA, and they’ve seen pretty much everything that exists. It’s not something that they’ve seen before. So they’re really moving us forward to bring that into their reference architecture, to their customers, very, very quickly. We can’t speak about it yet, but this inferencing solution that Gcore has is really one of a kind, and you will see it presented in a lot of reference architectures blessed by the largest vendors out there, and of course, the open source communities and foundations.

Seva Vayner (04:33): Yeah, I would like to add a few more points on inferencing. First of all, we're at an early stage right now, because everybody is trying to understand the main use cases: how AI can be adopted to help end businesses and enterprises grow or reduce their time to market. On the other hand, we have new departments, AI excellence centers inside the enterprises, which are trying to create those use cases and bring them to production level.

But one of the points is that ML engineers are not infrastructure or operations engineers, and they really don't focus on Kubernetes: how to create the Helm charts, how to create the custom resources. So there's real complexity in working with GPU infrastructure and managing the whole Kubernetes stack. That's why we came together with Mirantis: Mirantis knows how to build distributed, cloud-native GPU and Kubernetes infrastructure, and our layer provides the serverless platform where you don't need to think about Kubernetes or those Helm charts at all.

You have a native UI, an API, an SDK, or Terraform, where you can describe what kind of model you would like to run. We have open source models which we deliver as an application catalog; these are complex AI applications which can be deployed in a few seconds. Or if a company has trained or fine-tuned its own model and would like to get it up and running immediately at production scale, that's also just a few button clicks to deploy.

So we're trying to simplify how you get enterprise-grade, production-scale inference deployments up and running, without deep knowledge of how to build or package everything inside Kubernetes. That's our work, and we want to provide it in a very simple way. This is the key goal.

We will see the rise of inference workloads, and they will become commodity in a few years. But right now we help customers understand, first of all, how inference can help them. The second point is to build inference at production scale: you build your POC, but you're ready for a production scale that can be very easily integrated into your end business applications. This is our ultimate goal.

Recently, we announced our strategic agreement with Northern Data, a European GPU cloud provider with 35,000 GPUs in Europe and a significant data center presence. One of the key goals we would like to deliver together is what we call the Intelligence Delivery Network: an inference platform that uses our network capabilities, our software stack, and GPUs around the globe for massive scaling for end customers.

Our ultimate goal is also to provide super simplified access to model training on high-end GPUs with InfiniBand interconnects, where customers can bring up their clusters in a few clicks and run their training or fine-tuning jobs. That includes the ability to run managed Kubernetes with GPU worker nodes, so training jobs can be scheduled through the Kubernetes scheduler, and customers can also get Slurm clusters up and running for distributed training jobs, or run any kind of open source MLOps platform on top of our software stack.
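
[Editor's note: "using the Kubernetes scheduler" for training typically means submitting a batch Job that requests GPUs. A minimal sketch with the official kubernetes Python client; the image, script, and namespace are placeholders, not part of any vendor's product:]

from kubernetes import client, config

config.load_kube_config()  # or load_incluster_config() when running in-cluster

# A one-pod training Job that asks the scheduler for a single GPU.
job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="finetune-demo"),
    spec=client.V1JobSpec(
        backoff_limit=2,
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="trainer",
                        image="registry.example.com/finetune:latest",  # placeholder image
                        command=["python", "train.py"],               # placeholder script
                        resources=client.V1ResourceRequirements(
                            limits={"nvidia.com/gpu": "1"}  # one GPU per pod
                        ),
                    )
                ],
            )
        ),
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="default", body=job)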

So, together with Northern Data, we democratize access to GPU infrastructure for training, fine-tuning, and inference. That's the collaboration we have there, in addition to the one with Mirantis. We also want to provide a bridge for enterprise customers who are looking to scale their training jobs but also need to keep their data stored properly, with the ability to run the same capabilities on-premises.

This is also one of our ultimate goals as a company: to deliver our platform and open source standards that make customers' GPU bare-metal environments useful, cost-optimized, and easy to use for their end customers and their ML engineers. The same applies to any other GPU, cloud, or telco provider that wants to offer the same capabilities to its end customers as a service. We would like to fully automate this. Just as OpenStack, some ten years ago, started providing very simplified access to compute, network, and storage resources, we see the same happening in AI: simplifying access to compute, network, and storage for AI specifically.

Alex Freedland (10:07): They have sovereign infrastructure, and we're trying to build a standard stack that could be delivered everywhere and run everywhere. It will be optimized for theirs, but will be available on any infrastructure. Nobody is doing it today, but that's where the standard will come in a year or two, with the help of the Linux Foundation.
