Cloud Native

The Return of Bare Metal: Why AI and Repatriation Are Reshaping Enterprise Infrastructure

0

In a recent episode of TFiR’s Cloud : Evolution, Rob Hirschfeld, CEO and Co-Founder of RackN, sat down with host Swapnil Bhartiya to unpack the latest shifts in cloud computing, infrastructure automation, and the enterprise pivot toward AI.

A major highlight was the 2025 Red Hat Summit, where Red Hat doubled down on AI, OpenShift Virtualization, and Ansible Automation Platform. “OpenShift virtualization was top of mind,” said Hirschfeld, adding that RackN was one of the few vendors at the summit demonstrating a real OpenShift virtualization use case.

AI wasn’t just a buzzword—it was central. Red Hat showcased vLLM, a model-agnostic shim that allows developers to run inference workloads more efficiently. “They’re not trying to become a model provider,” Hirschfeld said. “They’re building the dev tooling and orchestration needed for enterprise AI.” He also highlighted Red Hat’s announcement of an AI router for model distribution—a game-changer for scalability.

On the infrastructure front, Hirschfeld emphasized the growing need for bare metal automation, especially for GPU-intensive AI workloads. “These are hot, failure-prone systems that require constant lifecycle management,” he said. With RackN’s tooling, companies can reduce reset times from days to hours. This speed isn’t just about efficiency; it translates directly into cost savings and improved ROI.

This move toward self-managed infrastructure is mirrored by a recent Broadcom report, which points to rising repatriation trends. While Hirschfeld agreed with the report’s thesis, he cautioned that Broadcom’s credibility—especially post-VMware acquisition—is questionable. Nonetheless, the trend is clear: enterprises are pulling back workloads to gain control, reduce costs, and enforce data sovereignty.

Hirschfeld also addressed the state of Terraform following HashiCorp‘s license change and IBM’s acquisition. “We’re not seeing the same excitement,” he noted. OpenTofu is gaining traction, but enterprises are more focused on supported products than open-source purity. IBM’s priority is clear: enhance their Red Hat ecosystem, particularly around Ansible and OpenShift.

Looking ahead, Hirschfeld sees immutable infrastructure, such as RHEL’s upcoming image-based deploy systems, as another major shift. These technologies demand operational knowledge many enterprises lack—an area where RackN is helping bridge the gap.

In a world increasingly dominated by AI and hybrid cloud complexity, RackN’s mission remains clear: simplify infrastructure by making bare metal as easy as cloud.

For more insights like these, explore TFiR’s full archive.


Edited Transcript

Swapnil Bhartiya: Another bright day here in Virginia, another episode of TFIR Cloud: Evolution. I’m your host, Swapnil Bhartiya. In this episode, we are joined by, once again, Rob Hirschfeld, CEO and co-founder of RackN, a company which is deeply embedded in infrastructure automation and hybrid IT.

Today we are going to dive into some of the biggest shifts happening across the cloud and open source landscape—all the way from Broadcom’s latest report on private cloud, to key takeaways from this year’s Red Hat Summit, including the role of AI and where enterprise is heading. We will also unpack the latest developments around Terraform and HashiCorp, and what does that mean for the future of infrastructure as code and for companies like RackN as well as their customers.

So without further ado, let’s jump right in. Rob, it’s great to have you back on the show.

Rob Hirschfeld: I’m excited to be here. There’s so much to talk about.

Swapnil Bhartiya: First of all, you folks were at Red Hat Summit. So I want to start with how was the summit? Talk a bit about the discussions you had on the show floor, concerns you had, what was the theme, and what was something you hadn’t seen in previous Red Hat summits that is now here. Just give us a total update on the summit.

Rob Hirschfeld: There was a lot going on. Red Hat is very focused on OpenShift. They’re very focused on AI. They’re very focused on Ansible Automation Platform. And those pieces all converge at the summit.

One thing I really liked about the summit is they talked a lot about customer success and customer stories. There were a lot of customers on stage talking about big virtual machine migrations. OpenShift Virtualization is top of mind for the people at the show and what people were talking about. Interesting to me, it’s early—so those things weren’t reflected in as many show sessions strictly about OpenShift Virtualization. I think we were the only vendors on the floor with a real OpenShift Virtualization story from the perspective of how to get you started and how to help with that.

So what we’re seeing is there’s a lot of new, exciting technology coming out of Red Hat right now, and we’re very excited about it. At the same time, how you execute on that technology is still being figured out. AI is a great example. The virtualization pieces are a great example. Ansible, which is the standard in DevOps automation—and Red Hat’s platform is Ansible Automation Platform, often used interchangeably.

I find there’s actually a lot going on with how it gets integrated with Terraform, which IBM had just completed the acquisition before the summit. So all those pieces are coming together at the show in a really interesting way.

Swapnil Bhartiya: Can you talk about HashiCorp? Because when the whole license change happened, you and I had a lot of discussion. OpenTofu was created, but with IBM’s acquisition, what has changed here?

Rob Hirschfeld: Not sure yet that very much has changed. The OpenTofu fork seems to be moving along, but the enthusiasm and excitement around Terraform or OpenTofu isn’t there in the market as much. A lot of the Terraform orchestrator buzz seems to have died down. The companies that are promoting with OpenTofu at the base seem to be doing their work, but it’s strange to me—we just don’t see the same level of enthusiasm for Terraform that there was a couple of years ago. Still very prevalent and very widely used, and there’s still enthusiasm to integrate it into other platforms, but as a standalone component, we’re just not seeing people talk about it as much.

Swapnil Bhartiya: I remember the early days when IBM spent billions of dollars on Linux, and they also created one of the most popular ads for Linux. Now when we look at this acquisition—HashiCorp—what was in there for IBM through this acquisition? Because of the license change, HashiCorp lost a lot of goodwill and image. At the same time, IBM also acquired Red Hat, and Red Hat runs as an independent company where IBM is seen as their client. They are still the open source company, so IBM didn’t touch anything that Red Hat was doing. So are they going to touch anything that HashiCorp is doing to fix the mistake they might have made?

Rob Hirschfeld: I can only give you what my crystal ball tells you on this one. One thing I actually think makes the IBM Red Hat integration work very well is that IBM is not shy about making money from product. So it really provided the “we’re going to sell product” pieces for Red Hat, a place to grow.

HashiCorp, I think, is a very different company. It was really multiple products that were loosely coupled together. So I think IBM’s going to do a great job with Vault. I think Nomad is competitive, so I think that’s a challenge. There’s a couple of bits and pieces that are interesting but self-maintaining, like Vagrant.

The big question becomes: what happens with Terraform? Do they open source it again, or merge it back into OpenTofu, or things like that? But Swapnil, I’ll tell you, I don’t think most enterprise customers care, which is what IBM cares about. They want supported products. They’re going to get supported products if they get Terraform. It’s not really a big open source hand-wringing dilemma from that perspective. It matters very narrowly in the industry—it matters if you’re HashiCorp and you’re competing with other companies that were using Terraform.

But from IBM’s perspective and Red Hat’s perspective, Red Hat makes money selling Enterprise Linux. They sell Kubernetes, badged as OpenShift, with a whole bunch of extra stuff on top. They sell Ansible Automation Platform. To the extent that Terraform can help them sell or make Ansible Automation Platform more powerful, they’re all in. To the extent that it’s integrated into other platforms, I don’t think there’s as much excitement about that.

One of the things I saw at Red Hat Summit is they have a very concrete plan for Ansible Automation Platform. It is the migration engine to move from VMware to OpenShift Virtualization. If you’re going to do that migration using Red Hat tools or with IBM consulting, you’re going to be using Ansible Automation Platform. So they have a lot invested in making that a great platform. Terraform is already integrated to that, and they’re going to extend those integrations.

From an enterprise consumer perspective, from our audience’s perspective, there’s no drama. While that’s not newsworthy—drama is newsworthy—no drama is actually what most customers want.

What does it mean for RackN? We have great integrations. We’re very focused on that bare metal side and providing that cloud of metal infrastructure and making that full lifecycle work. So to the extent that people want to use Ansible Automation to build and automate infrastructure, we provide the missing piece on the hardware lifecycle. It’s very important for us from that perspective. The same thing is true on OpenShift Virtualization, or OpenShift for AI and GPU farms, where it’s all bare metal operations—we’re providing the critical missing hardware lifecycle controls and AI-driven infrastructure that companies need to just make it work. The easy button for bare metal.

From that perspective, we’re filling a great gap. When IBM and Red Hat make those things easier to consume and provide the orchestration layers that customers need to drive a bigger story, every time that helps us, and we’re looking at accelerating the process improvements that you get from using these platforms. It’s stunning, and the ROI from process improvements is actually the real secret for anything bare metal. The process improvement is actually the secret to getting high ROI.

Swapnil Bhartiya: Of course, these days, one of the hottest topics is AI, and Red Hat is betting a lot on that. They are also changing the way their Linux distribution works. Talk a bit about one of the hottest news, not from the perspective of press, but from the lens of Rob. There’s a lot of things that came out because Red Hat Enterprise Linux 10 also came out. There’s some pieces about that that I want to get to.

Rob Hirschfeld: The thing that I really found remarkable is just how much Red Hat and IBM too—because they have the bona fides on AI—are doing to help companies really run the inference side of AI, because that’s really where enterprises are going to be focused. They’re not as worried about building large language models. They’re consuming these models and machine learning for smaller models.

What Red Hat announced at this summit was something they’ve been building for a while, and they’re really pulling together the pieces now. There’s something called vLLM, which I kept hearing as “V-Lim,” but it’s vLLM, which is a consistent shim layer for different models. So what it allows you to do is build your AI-enabled application, use vLLM, and then put in whichever models are appropriate or cost-effective for your system. So it actually gives you a lot of freedom in how you use the system.

I like this approach for Red Hat, because they’re not trying to become a model provider. They’re trying to become a dev tooling, optimizing platform. To the extent that they’re enabling developers to build AI applications, which is what we’re seeing over and over again, that’s a really significant win.

They also announced a way to take—it’s not vLLM, but it’s more like an AI proxy or an AI router—where you can take requests, filter it through the router, and it will find available running models, so that you can actually take and distribute workload across many machines. That’s been a missing piece for a lot of enterprises looking at how to scale up their LLM activity. So this basically lets you create a model farm. That coming technology, I think, is going to be really important for customers who are trying to scale their AI inferencing infrastructure. So really significant for that.

It’s also worth noting this didn’t come up in the show as much, which is surprising. But OpenShift, from our customer experience, is the preferred model OS and platform for AI training. So if people are building training infrastructure, they’re often turning to OpenShift as the supported version for this. So Red Hat’s winning in a lot of cases. The vLLM pieces and the model router pieces, I think, only make it easier for enterprises to adopt. That’s fundamentally a really significant win for people.

I should note vLLM is open source, so it’s a technology that’s widely available. A lot of companies are collaborating. It’s a great example of open source really winning for building a community and having a lot of different players. As new models come out, they’re very quickly plugged into the vLLM ecosystem, and people can take advantage of them from that perspective. So it’s sort of a necessary shim layer in community.

Swapnil Bhartiya: When I look at RackN, you folks deal with infrastructure bare metal. What does AI mean to you? And what do Red Hat’s AI announcements mean for RackN?

Rob Hirschfeld: It’s funny, because we actually see AI surfacing both from bare metal and also from some of the OpenShift Virtualization layers. I’ll explain both.

Anytime people are doing AI inferencing, they’re using very expensive servers because of the AI components. They often want to have many GPUs or GPU cards. They want to have very high-performance systems with complex networking. As the complexity of systems goes up, the need for more complex and integrated bare metal management gets very high.

AI systems still have high failure rates. GPUs often fail. They get run very hot. They need to be reset. They’re being patched with increasing frequency. So you need to be able to apply the correct firmware to your GPU. All of those operations are fundamentally bare metal operations, and the faster you can go through those, the more reliably you can go through those processes, the more ROI you’re going to get out of the servers.

We talk to people in the industry, and sometimes they’ll have servers that take three to five days for a reset because of the automation processes they have to go through. When we can take that down to under an hour, it’s transformative to how quickly they can set up and run workloads. So bare metal and AI are fundamentally linked technologies.

A lot of companies are looking at AI as having a data locality challenge to it, meaning they don’t want to be transiting data that they run through AI systems or building models all over the place. So data sovereignty, owning infrastructure, repatriation—all of those things are absolutely critical in the AI conversation.

So when enterprises talk about AI, they also talk about: Where am I going to run the models? How am I going to protect the data? How am I going to afford to run the infrastructure? How can I save money? Because AI servers are so expensive, and owning your servers is actually a very effective way—if you have a seven-month or longer expectation for running the systems, then buying the systems is actually a very cost-effective way to do it.

Repatriation and cost savings from self-hosting is very, very real, especially with RackN as an overlay, where we’re helping our customers negotiate discounts from OEMs—30% or higher. So pretty significant win from that perspective. That’s a lot of additional hardware you can buy when you get discounts.

One other point I wanted to make on the RHEL Summit—one other technology piece I didn’t fully talk about. So RHEL 10 and some of the work that Red Hat’s doing to drive towards immutable operating systems. CoreOS image deploy technology, what they used to call bootc, but they’re branding now as image deploy. These are technologies where you’re doing immutable boots, or you’re doing image-based deploys, based on container images. It’s incredibly powerful, very exciting technology that is going to be deeply embedded into RHEL going forward.

Customers who are using Red Hat Enterprise Linux today need to be thinking through how they’re going to be moving towards immutable deployments. This is something RackN is really helping customers set up, deploy, figure out, automate, because what we’re finding is that when customers look at OpenShift as a platform, they need to also be using CoreOS as the underlying OS for that cluster—not something that a lot of enterprises have experience with.

So you have to be able to deploy and manage CoreOS, which is managed differently than Red Hat Enterprise Linux. You have to be able to secure that. You have to be able to manage lifecycles for those systems. The image deploy is going to drive it another step further, where the systems actually aren’t built as images. They’re actually built up as containers and then applied to systems.

So there’s some really fascinating ways in which the whole RHEL lifecycle is becoming more developer-friendly. But we know very well from dealing with large enterprise that doesn’t mean the operators and the admins have the same tooling. They don’t always know what to do, especially when it comes to bare metal. So we’re very excited to see the improvements that RHEL 10 is bringing into the ecosystem, because I think they’re great improvements. They’re also things that we really provide competitive advantages for enterprises to adopt. So we make it much easier to adopt those technologies, and we make them much more robust and resilient from an operations perspective. So those are really significant changes coming to Red Hat customers.

Swapnil Bhartiya: Let’s talk about Broadcom. The report around private cloud. Have you seen the report? And if you have, what do you make out of it?

Rob Hirschfeld: I have definitely seen Broadcom’s survey talking about the increase in repatriation and self-managed infrastructure. It made the rounds pretty remarkably. A lot of people resonate with the idea that there’s additional bare metal infrastructure, self-managed infrastructure, coming back—that there is some cloud retreat. I think that our data, our experience in the field, aligns with what that research was saying.

The thing that we found interesting about it is it’s coming from Broadcom, and so it’s very hard to decouple the statements they were making from the source of the statements, because Broadcom has really burned a lot of credibility in market from that perspective. We see an incredibly broad-based retreat from VMware as the hypervisor of choice. So while Broadcom, I think, is correct in saying repatriation and self-managed infrastructure are very real for enterprises, I don’t think that infers that VMware is going to get pulled into that market opportunity.

This is by no way saying cloud is dead. Cloud is not dead. Cloud’s an important part of the whole mix here. But more and more companies are recognizing that they can save money by having their self-managed gear. They can improve security. They can have better data sovereignty. They can actually manage their infrastructure very effectively by using next-generation techniques to manage their infrastructure and revisit decisions that they were making to try and leave their 2000-era data center infrastructure.

Swapnil Bhartiya: Of course, you and I also talk about Kubernetes a lot, and I mean, because that’s one of the biggest open source projects, and one of the technologies that almost everybody uses. Just the way containers we use, these days everybody’s talking about AI, agentic AI. Can you talk about what are some of the—because Red Hat’s market is a very mature market. I mean, it’s been around for ages, 30-40 years since the kernel came out. But what are some of the pain points? As folks are moving to cloud, they don’t run their own infrastructure to some extent as well. But what are some of the pain points that you see—while this is a mature space, these are still some of the problems which also resonate as concerns throughout the industry, irrespective of whether you’re dealing with containers, whether you’re dealing with AI, whether you’re dealing with Kubernetes, edge, or infrastructure in general. Where are you like, these are universal challenges that we have to solve?

Rob Hirschfeld: When we talk to customers and when I listen to the problems that we hear about in the field—when we’re talking to prospects or helping customers improve—we have seen over and over again that bare metal automation, and this is why we specialize in bare metal, is such a foundational layer that if you can improve your process controls and governance at that layer, everything you build on top of it just becomes more effective, easier, faster, and more flexible.

I really do think that companies who look at self-managed, repatriated workloads can easily skip over the foundational layer, the bare metal infrastructure, and jump right to “I need VMware, I need OpenShift, I need Nutanix, I need a platform on top of all this stuff.” Some of those like Nutanix include hardware and lifecycle management, and you create an all-in platform, but you’re very tied into their ecosystem.

What we’re finding is, if you really can deliver best-in-class, bare metal automation that’s multi-vendor, that’s flexible and AI-driven, then you dramatically accelerate your time to execute projects, your flexibility on vendors, your ability to bring in new capabilities.

Something as simple—this is going to sound silly—but something as simple as staying current on the current version of a platform or within one version of their release, transforms the cost models for that infrastructure. It means you’re taking advantage of new features. It means you’re staying in support. It means you’re not fighting bugs that have been fixed. Companies get incredible benefits when they’re able to move more agilely towards the current versions and move forward.

The idea that if you have your own data center infrastructure and you’re going to be using end-of-life software and struggling with every patch and can’t keep it all updated—that’s what we were dealing with 10 years ago, and it’s very expensive, it’s very hard. Your infrastructure is fragile, but it doesn’t have to be like that.

We know that customers can have very fast-moving, very innovative teams, if they do things and take the time to build things right, and then again, OpenShift Virtualization, moving from VMware to OpenShift—those are very fast transitions. They don’t have to take years to plan, but you have to do it by building that foundational layer first, so that you can iterate more quickly at the top layer. If you do it from the top down, you’re going to find that the bottom layers are going to fight you all the time.

Swapnil Bhartiya: Rob, once again, thank you so much for joining me today and giving us an update on the summit and, of course, the Broadcom report survey. Thank you for the great insights. As usual, I look forward to chatting with you again.

Rob Hirschfeld: Thank you. It’s always a pleasure.

ControlTheory’s Bold Play: Reinventing Observability with Intelligent Control Planes

Previous article

Open Source at the Heart of AI: Why Kubernetes, PyTorch & LangChain Matter

Next article