AI Infrastructure

How Akamai and NVIDIA Are Bringing Real-Time AI Inference to the Edge | Ari Weil, Akamai

Guest: Ari Weil (LinkedIn)
Company: Akamai
Show Name: An Eye on AI
Topic: Edge Computing

AI is evolving at lightning speed — but traditional cloud architectures are struggling to keep up. The next leap in intelligence won’t happen in centralized data centers; it will happen at the edge. From live sports analytics to personalized shopping and real-time fraud detection, the need for local, distributed AI is reshaping enterprise computing. At NVIDIA GTC, Akamai announced its Inference Cloud — a platform designed to move AI inference closer to data, users, and devices.

For more than two decades, Akamai has powered the distributed internet — optimizing performance, content delivery, and security at global scale. Now, it’s taking that expertise into the world of artificial intelligence. Speaking live from NVIDIA GTC, Ari Weil, VP of Product Marketing at Akamai, explained how the new Akamai Inference Cloud builds on the company’s global edge network and NVIDIA’s Blackwell AI infrastructure to redefine how AI inference happens.

“The biggest change we’ve seen is that content is no longer static,” Weil said. “It’s now generated by compute in real time. We’re extending the intelligence built in centralized AI factories and moving it closer to where machines and users interact with data.”

At the core of this transformation is proximity — reducing latency and cost while increasing intelligence and responsiveness. With inference distributed across Akamai’s global network, enterprises can process data closer to its source, enabling a new class of real-time experiences.

From Content Delivery to Inference Delivery

Akamai’s evolution mirrors that of the internet itself. What started as a content delivery network is now becoming an inference delivery network. The same distributed architecture that once streamed 8K video or secured APIs can now power AI agents and machine learning workloads.

Weil emphasized how Akamai’s edge locations, combined with NVIDIA’s GPUs, enable enterprises to run inference at unprecedented scale and speed. “We’re deploying dense computing so we can ready our application network to become a generative network — powering real-time content generation and distribution,” he said.

New AI Experiences Across Industries

The implications go far beyond media. “In the media industry, this means real-time video intelligence — identifying anomalies or highlights instantly and creating derivative content in seconds,” Weil explained. “In commerce, it’s about understanding users in real time and delivering personalized experiences without delay. In finance, it means instant fraud detection or loan approvals.”

These use cases share one need — immediacy. As AI becomes more agentic and autonomous, every millisecond matters. Bringing inference closer to the user isn’t just a performance gain; it’s a requirement for next-generation digital experiences.

Balancing Cost, Performance, and Security

Enterprises face a familiar dilemma: centralized clouds are powerful but expensive, and edge environments are fast but fragmented. Akamai’s Inference Cloud bridges that gap through intelligent model routing and decision orchestration.

“We can direct workloads based on latency, cost, or security,” Weil explained. “If it’s cheaper to process locally, we’ll do that. If we can cache results or route to a lower-cost model, we’ll make that choice automatically. The goal is to give enterprises the best trade-off for each use case.”
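To make that trade-off concrete, here is a minimal, hypothetical sketch of the kind of routing decision Weil describes. The class names, cost figures, and in-memory cache are invented for illustration and do not reflect Akamai's actual APIs; the point is the order of the checks: serve from cache when possible, filter targets by latency and data-residency constraints, then pick the cheapest remaining option.

```python
# Hypothetical sketch of latency/cost/security-aware inference routing.
# Names and numbers are illustrative only, not Akamai's real interfaces.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Target:
    name: str                   # e.g. an edge PoP or a core data center
    est_latency_ms: float       # expected round-trip latency for this request
    cost_per_1k_tokens: float   # rough serving cost
    region: str                 # where the data would be processed


@dataclass
class Request:
    prompt: str
    max_latency_ms: float       # SLO for the experience (fraud check, personalization, ...)
    allowed_regions: set[str]   # data-residency / security constraint
    cache_key: Optional[str] = None


RESPONSE_CACHE: dict[str, str] = {}  # stand-in for a shared result cache


def route(request: Request, targets: list[Target]) -> str:
    # 1. Reuse a cached result when an identical request was already served.
    if request.cache_key and request.cache_key in RESPONSE_CACHE:
        return f"cache:{request.cache_key}"

    # 2. Keep only targets that satisfy latency and residency constraints.
    eligible = [
        t for t in targets
        if t.est_latency_ms <= request.max_latency_ms
        and t.region in request.allowed_regions
    ]
    if not eligible:
        raise RuntimeError("no target satisfies the latency/residency constraints")

    # 3. Among eligible targets, choose the cheapest one.
    return min(eligible, key=lambda t: t.cost_per_1k_tokens).name


# Example: a latency-sensitive check that must stay in the EU routes to the
# nearby edge location even though the central data center is cheaper.
targets = [
    Target("edge-pop-fra", est_latency_ms=15, cost_per_1k_tokens=0.40, region="eu"),
    Target("core-dc-us-east", est_latency_ms=90, cost_per_1k_tokens=0.25, region="us"),
]
req = Request(prompt="score this transaction", max_latency_ms=50, allowed_regions={"eu"})
print(route(req, targets))  # -> "edge-pop-fra"
```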

The platform leverages Akamai’s open-source-based architecture, providing true multi-cloud portability. That’s critical for enterprises wary of lock-in as AI expands beyond any single hyperscaler’s ecosystem. Akamai’s approach allows them to deploy models wherever it makes sense — at the edge, in data centers, or across clouds.

Developer Enablement and Open Ecosystem

One of the most compelling aspects of the launch is Akamai’s commitment to developers. Built on the Linode Kubernetes Engine, the company’s application platform allows developers to create turnkey AI applications quickly and cost-effectively.

“At KubeCon North America, we’ll be showing how developers can use our app platform for AI workloads without worrying about scalability or lock-in,” Weil said. “It’s built on open source and Kubernetes, which means multi-cloud flexibility and faster time to market.”
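Because the platform is standard Kubernetes underneath, the portability claim is easy to picture: the same Deployment can be applied to Linode Kubernetes Engine or to any other conformant cluster. The sketch below uses the official Kubernetes Python client; the container image, GPU request, and replica count are placeholders chosen for illustration and are not part of Akamai's announcement.

```python
# Minimal sketch: deploy an inference service to whichever cluster your
# kubeconfig points at (LKE or any other conformant Kubernetes cluster).
from kubernetes import client, config

config.load_kube_config()  # uses the cluster selected by the local kubeconfig

# Placeholder image and resource request; adjust for a real workload.
container = client.V1Container(
    name="inference",
    image="registry.example.com/my-inference-service:latest",
    ports=[client.V1ContainerPort(container_port=8000)],
    resources=client.V1ResourceRequirements(limits={"nvidia.com/gpu": "1"}),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="inference"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": "inference"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "inference"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

# The same object works against any cluster the client is pointed at,
# which is the multi-cloud portability the open, Kubernetes-based approach buys.
client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
```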

Akamai’s partnerships within the open source and CNCF ecosystems reinforce that vision. From data fabric integrations with VAST Data to managed database and object storage offerings, the company is building the connective tissue for a distributed, AI-driven web.

The Future: Agentic and Machine-to-Machine AI

As enterprises move from centralized AI factories to edge inference, the next frontier is machine-to-machine interaction — where agents autonomously generate, process, and act on data in real time.

“We’re preparing for a world where inference isn’t just about response time,” Weil noted. “It’s about creating a generative network — where applications and machines interact intelligently, continuously, and globally.”

With thousands of edge locations worldwide and deep AI partnerships, Akamai aims to make that future practical. Its Inference Cloud could become the backbone for the emerging generation of distributed, intelligent systems.

Watch the full conversation with Ari Weil, VP of Product Marketing at Akamai → [YouTube]
