The Core Concept: Centralized architectures, siloed operations, and request-response traffic models are structurally incompatible with the real-time, machine-driven demands of AI in 2026.
The Guest: Danielle Cook, Senior Manager at Akamai and CNCF Ambassador
The Bottom Line:
- Centralized architectures will visibly fail under real-time AI at scale — latency and round-trip distance will surface directly in customer experience, just as they did in the early web.
- AI operationally unifies security, observability, GPU management, and data pipelines into a single, interconnected problem — teams that try to address each challenge in isolation will fall behind.
- Agentic, machine-driven interaction loops invalidate traditional request-response architecture assumptions, demanding a fundamental rethink of how AI traffic is modeled and served.
Speaking with TFiR, Danielle Cook of Akamai assessed the current state of enterprise AI infrastructure and outlined the three structural bottlenecks organizations must address before they become critical failures in 2026.
WHAT ARE THE BIGGEST CHALLENGES ORGANIZATIONS WILL FACE RUNNING AI ON KUBERNETES IN 2026?
Cook frames the first challenge as an architectural mismatch with deep historical precedent. Centralized architectures — compute concentrated in a handful of regions or a single data center — will struggle to support real-time AI interactions at scale. “We’re going to see what we saw with the worldwide web. Distance and round trips are going to show up immediately in that customer experience.” The remedy is distribution: moving execution closer to users and data, not as an optimization, but as a foundational design assumption.
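To make the round-trip arithmetic concrete, here is a back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration (the round trips per interaction, the regional round-trip times, the model compute time), not a figure from Cook or Akamai.

```python
# Illustrative only: every figure below is an assumed ballpark number,
# not a measurement from the interview or from any provider.

ROUND_TRIPS_PER_INTERACTION = 4   # e.g. auth, prompt, tool call, final response (assumed)
CENTRALIZED_RTT_MS = 140          # user far from a single central region (assumed)
EDGE_RTT_MS = 20                  # inference served from a nearby metro (assumed)
MODEL_COMPUTE_MS = 300            # model execution time, identical in both cases

def interaction_latency_ms(rtt_ms: float) -> float:
    """User-perceived latency: network round trips plus model compute."""
    return ROUND_TRIPS_PER_INTERACTION * rtt_ms + MODEL_COMPUTE_MS

centralized = interaction_latency_ms(CENTRALIZED_RTT_MS)  # 860 ms
edge = interaction_latency_ms(EDGE_RTT_MS)                # 380 ms

print(f"centralized: {centralized:.0f} ms  |  edge: {edge:.0f} ms")
print(f"network share of centralized latency: {1 - MODEL_COMPUTE_MS / centralized:.0%}")
```

Even with identical model compute, the centralized path spends roughly two thirds of the user-perceived latency on the network, which is the "distance shows up in the customer experience" effect Cook describes.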
The Day-Two Operational Stack Problem
The second challenge is operational convergence. Cook explains that AI doesn’t just introduce new workloads — it compresses the entire day-two operations stack into a single, unified problem. Security, observability, data pipelines, and GPU management — previously addressable as separate concerns — now interlock. “Teams are going to have to solve that in one place. They can’t just address each issue singularly.” Organizations that continue to apply siloed remediation to individual components will find that fixing one element creates instability elsewhere.
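One way to picture that convergence is as a single admission decision spanning all four domains at once. The sketch below is a deliberately toy model, not anything from the interview or from Akamai's tooling; the `WorkloadChange` fields, thresholds, and checks are hypothetical stand-ins for security, observability, GPU capacity, and data-pipeline concerns.

```python
from dataclasses import dataclass

# Hypothetical model of a proposed change to an AI workload. The fields and
# thresholds are invented for illustration, not taken from any real platform.
@dataclass
class WorkloadChange:
    gpus_requested: int
    exposes_training_data: bool
    emits_gpu_metrics: bool
    data_pipeline_versioned: bool

def admit(change: WorkloadChange, free_gpus: int = 8) -> bool:
    """Converged day-two check: one decision spans security, observability,
    GPU capacity, and data concerns, instead of four siloed reviews."""
    checks = {
        "security": not change.exposes_training_data,
        "observability": change.emits_gpu_metrics,
        "gpu_capacity": change.gpus_requested <= free_gpus,
        "data_pipeline": change.data_pipeline_versioned,
    }
    failed = [name for name, ok in checks.items() if not ok]
    if failed:
        print(f"rejected, failing domains: {failed}")
    return not failed

# A change that looks fine to the GPU and security teams in isolation
# still fails because observability was never wired up.
admit(WorkloadChange(gpus_requested=4, exposes_training_data=False,
                     emits_gpu_metrics=False, data_pipeline_versioned=True))
```

The specific checks are beside the point; what matters is that they are evaluated together, so a fix in one domain surfaces its side effects immediately rather than somewhere else in the stack.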
Agentic Systems and the Death of Request-Response
The third challenge is architectural and behavioral. Agentic AI systems generate continuous, machine-driven interaction loops: a traffic pattern that traditional request-response architectures were never designed to handle. "The traditional model of request-response architectures … it's not going to work in this new agentic system." This is not a performance-tuning problem. It is a fundamental mismatch between how existing systems process interactions and how agentic AI actually operates.
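A rough sketch of why the traffic pattern changes: a single user action now fans out into a machine-driven chain of dependent calls. Everything below is hypothetical (the `call_model` and `call_tool` stubs, the latencies, the stop condition); it is only meant to show how call volume and latency compound in a way a one-shot request-response path never had to absorb.

```python
import time

# Hypothetical stubs: call_model and call_tool stand in for an inference
# endpoint and an external tool/API. Latencies are assumed, not measured.
def call_model(context: str) -> str:
    time.sleep(0.3)                          # assumed per-call model latency
    return "DONE" if "step-2" in context else "use-tool"

def call_tool(action: str) -> str:
    time.sleep(0.1)                          # assumed per-call tool latency
    return f"result-of-{action}"

def classic_request_response(prompt: str) -> str:
    # One request in, one response out: a single, bounded unit of work.
    return call_model(prompt)

def agentic_loop(prompt: str, max_steps: int = 5) -> str:
    # One user action fans out into a chain of dependent model and tool
    # calls; the system, not the user, decides when the loop ends.
    context = prompt
    for step in range(1, max_steps + 1):
        if call_model(context) == "DONE":
            break
        context += f" step-{step}:{call_tool('use-tool')}"
    return context

start = time.time()
classic_request_response("summarize this page")
print(f"request-response: 1 model call, {time.time() - start:.1f}s")

start = time.time()
agentic_loop("plan and book my trip")
print(f"agentic loop: several dependent calls, {time.time() - start:.1f}s")
```

Capacity planning, timeouts, and connection handling built around short, user-initiated requests do not map cleanly onto loops like this, which is the mismatch Cook is pointing at.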
Broader Context: Predictions, Opportunities, and Akamai’s 2026 Strategy
These bottlenecks do not exist in isolation — they are the operational counterpart to the architectural shifts Cook outlined across her full TFiR interview. Her 2026 predictions establish that AI inference placement is becoming a primary design choice, and that the distributed cloud will become the default application architecture. The challenges she identifies here are precisely what happens when organizations attempt to run that distributed, inference-heavy future on infrastructure designed for a centralized, request-response past.
On the opportunity side, Cook points to three openings: unlocking distributed cloud as the default architecture for AI workloads, enabling real-time personalization at the edge (already visible in retail and travel), and making Kubernetes operationally invisible through opinionated platforms and internal developer platforms (IDPs). "Your teams can just be deploying AI anywhere." Akamai's execution against this combines its managed Kubernetes service, Linode Kubernetes Engine (LKE), GPU-backed infrastructure, and its distributed edge, a stack designed to let any AI model run wherever users are without the operational drag these three bottlenecks typically introduce.
Watch the full TFiR interview with Danielle Cook here.