Guest: Yrieix Garnier
Company: Datadog
Show: The Agentic Enterprise
Topic: Observability
Cloud operations teams are bleeding money on two fronts: storage costs spiraling out of control and incident response consuming hours of engineering time. Yrieix Garnier, VP of Product at Datadog, believes autonomous agents can solve both problems without creating new blind spots. In this conversation, he breaks down Datadog’s newly launched Storage Management and Bits AI capabilities—tools designed to optimize cloud spend and investigate production incidents autonomously.
Storage Management: Finding Waste Before It Compounds
Datadog’s Storage Management capability addresses a problem many enterprises face but few actively monitor—cloud storage waste. With AI workloads driving storage consumption through the roof, organizations often pay for terabytes of redundant or forgotten data they never access.
“What we see from observability is we have a lot of metrics and data telemetry about cloud environment usage. Specifically on storage, we’ve seen a lot of waste,” says Yrieix.
The solution allows teams to drill deep into storage buckets, identify cost drivers, and understand usage patterns. It’s not just about visibility—Datadog provides actionable recommendations, suggesting when to move data from hot storage to cheaper cold storage tiers. The goal is proactive optimization, not reactive firefighting.
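To make the hot-to-cold recommendation concrete, here is a minimal sketch of the kind of rule such a recommendation could rest on. It is not Datadog’s implementation: the prices, the 90-day idle threshold, the `BucketPrefix` type, and the sample prefixes are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical per-GB monthly prices; real prices vary by provider and region.
HOT_PRICE_PER_GB = 0.023
COLD_PRICE_PER_GB = 0.004
COLD_AFTER_DAYS = 90  # assumed idle threshold, not a Datadog default


@dataclass
class BucketPrefix:
    name: str
    size_gb: float
    last_accessed: date


def recommend_cold_tier(prefixes, today):
    """Return (name, estimated monthly saving) for data idle past the threshold."""
    recommendations = []
    for p in prefixes:
        idle_days = (today - p.last_accessed).days
        if idle_days >= COLD_AFTER_DAYS:
            saving = p.size_gb * (HOT_PRICE_PER_GB - COLD_PRICE_PER_GB)
            recommendations.append((p.name, round(saving, 2)))
    # Largest savings first, so the biggest cost drivers surface at the top.
    return sorted(recommendations, key=lambda r: r[1], reverse=True)


# Illustrative data only.
prefixes = [
    BucketPrefix("logs/2023/", 5000.0, date(2024, 1, 10)),
    BucketPrefix("models/current/", 800.0, date(2025, 5, 30)),
    BucketPrefix("tmp/exports/", 1200.0, date(2024, 11, 2)),
]
recs = recommend_cold_tier(prefixes, today=date(2025, 6, 1))
for name, saving in recs:
    print(f"{name}: move to cold tier, save about ${saving}/month")
```

The sketch ignores retrieval fees and minimum storage durations, which a real recommendation would have to weigh against the storage saving.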
“You can optimize, reduce cost, and reinvest in other areas. That’s the real value,” Yrieix explains.
Storage Management currently supports Amazon S3, with Google Cloud Storage and Azure Blob Storage support in preview. It tracks monthly spend and flags anomalies in usage patterns. The result: hard dollar savings and better resource allocation.
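The article does not say how anomalies are detected. As one common approach, the sketch below compares each month’s spend with a trailing window and flags large deviations. The function name, window size, threshold, and spend figures are assumptions made for illustration, not details of the product.

```python
from statistics import mean, stdev


def flag_spend_anomalies(monthly_spend, window=6, threshold=3.0):
    """Flag months whose spend deviates from the trailing window
    by more than `threshold` sample standard deviations."""
    anomalies = []
    for i in range(window, len(monthly_spend)):
        history = monthly_spend[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            continue  # flat history gives no basis for a z-score
        z = (monthly_spend[i] - mu) / sigma
        if abs(z) > threshold:
            anomalies.append((i, z))
    return anomalies


# Illustrative monthly storage spend in dollars; month index 6 is a spike.
spend = [1000, 1020, 990, 1010, 1005, 995, 1600, 1015]
anomalies = flag_spend_anomalies(spend)
for month_index, z in anomalies:
    print(f"month {month_index}: spend {spend[month_index]} (z-score {z:.1f})")
```

Note that once the spike enters the trailing window it inflates the standard deviation, so later months are judged more leniently. A more robust variant might use the median and median absolute deviation instead.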
Bits AI: Autonomous Incident Investigation with Transparent Reasoning
When a production incident hits at 2 a.m., every second counts. Datadog’s Bits AI SRE agent autonomously investigates incidents before engineers even open their laptops.
Here’s how it works: when a Datadog monitor triggers an alert, Bits AI automatically launches an investigation. It doesn’t wait for an SRE to ask. Drawing on Datadog’s observability data (logs, metrics, traces, and network telemetry) along with code changes and connected sources such as Confluence docs and GitHub repos, Bits AI builds multiple hypotheses about the root cause.
“It starts to think like an SRE,” says Yrieix. “It builds hypotheses, looks at dependencies, and checks all of them in minutes. What takes a team hours or days, Bits AI does in minutes.”
But speed alone isn’t enough. The agent exposes its reasoning through a tree view, showing every hypothesis it tested, every data point it analyzed, and why it arrived at a specific conclusion. This transparency is critical for building trust.
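One way to picture that tree view is as a nested structure in which each hypothesis carries a verdict and the evidence behind it. The sketch below is purely illustrative: the `Hypothesis` type, the verdict labels, and the sample incident are hypothetical and do not reflect Bits AI’s internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    statement: str
    verdict: str = "untested"  # "confirmed", "rejected", or "untested"
    evidence: list = field(default_factory=list)
    children: list = field(default_factory=list)


def render(node, depth=0):
    """Return the tree as indented lines: verdict, statement, then evidence."""
    indent = "  " * depth
    lines = [f"{indent}[{node.verdict}] {node.statement}"]
    for item in node.evidence:
        lines.append(f"{indent}  - {item}")
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


def confirmed_path(node):
    """Return the chain of confirmed hypotheses from the root down, or []."""
    if node.verdict != "confirmed":
        return []
    for child in node.children:
        path = confirmed_path(child)
        if path:
            return [node.statement] + path
    return [node.statement]


# Hypothetical investigation of a latency alert.
root = Hypothesis(
    "Checkout latency alert has an upstream cause",
    verdict="confirmed",
    children=[
        Hypothesis(
            "Database is saturated",
            verdict="rejected",
            evidence=["DB CPU and connection count flat during the alert"],
        ),
        Hypothesis(
            "Recent deploy introduced a regression",
            verdict="confirmed",
            evidence=["p99 latency rose right after the deploy"],
        ),
    ],
)

print("\n".join(render(root)))
print("Conclusion path:", " -> ".join(confirmed_path(root)))
```

Keeping rejected branches in the tree, rather than discarding them, is what lets a reader see which alternatives were ruled out and why.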
“We don’t keep it as a black box. You see the reasoning, the evidence, the observations. You understand what’s been done and why the conclusion makes sense,” Yrieix notes.
Once Bits AI surfaces the root cause, it can even generate a code fix and create a pull request—though the final deployment decision remains with the engineering team. This human-in-the-loop design ensures control without sacrificing speed.
From Observability to Autonomous Operations
Datadog’s vision extends beyond investigation. Yrieix sees three phases in autonomous operations: detection, investigation, and remediation. Datadog already excels at detection through its monitoring capabilities. Bits AI tackles investigation. The next frontier is automated remediation—agents not just finding the problem and suggesting a fix, but deploying it autonomously.
“That last phase requires an even higher level of trust,” Yrieix says. “But you’ll see more incidents resolved by agentic systems in the future. SREs can focus on architecture and system design while agents handle the operational toil.”
Datadog is also expanding its agent portfolio. Beyond Bits AI SRE, the company is building a code agent that analyzes application code and a security agent that investigates vulnerabilities. These agents will work together, communicating across the platform to deliver end-to-end automation.
Adoption Without Disruption
One concern for platform teams is whether adopting these tools requires ripping and replacing existing workflows. Yrieix is clear: it doesn’t.
“These tools don’t require any changes. We’re building them on your already existing observability practice. You just enable AI to help in investigations. No time, no effort to access it,” he says.
For teams already using Datadog, the activation path is simple. For those evaluating the platform, both Storage Management and Bits AI work out of the box with existing telemetry.
Results That Matter
Early customer feedback shows real impact. Incidents that previously required multiple engineers working for hours are now resolved in minutes. On the storage side, teams see month-over-month cost reductions as they optimize bucket usage and tier data appropriately.
“You reduce a team of multiple people spending hours to just a few minutes. Your system is up and running faster, which directly impacts your business metrics—transactions, checkouts, whatever you measure,” Yrieix explains.
The business case is clear: faster incident resolution means less downtime, fewer revenue losses, and happier customers. Storage optimization means lower cloud bills and smarter resource allocation.
What’s Next
Datadog is doubling down on both agents and optimization. The company is building more autonomous agents for code analysis and security investigations. It’s also expanding its Cloud Cost Management platform with AI-powered recommendations across compute, Kubernetes, and other cloud resources—not just storage.
“The transformation to agents will be in all the different components of Datadog,” says Yrieix. “We’re taking multiple approaches and seeing where the best value is for customers.”
For SRE and platform teams drowning in toil, the message is simple: autonomous investigation is already here, and autonomous remediation is the next step. The question isn’t whether AI will reshape these workflows. It’s how fast teams can adopt the tools to stay ahead.





