Tetrate, Bloomberg join forces to develop open source AI gateway for enterprise integration

Engineers from Tetrate and Bloomberg have joined forces to develop a community-led set of core AI gateway features for enterprise AI integration. This effort will expand the capabilities of the CNCF’s Envoy Gateway project, one of the Kubernetes Gateway API implementations.

“Historically, when shared problems arise in the software industry, the open source community rallies to solve them, accelerating innovation,” said Varun Talwar, Founder of Tetrate. “Our collaboration with Bloomberg and the CNCF aims to achieve precisely that: designing and delivering a community-led, fully open source AI gateway, powered by the leading contender to replace legacy models for Kubernetes ingress. It’s a solution the market is asking for, and we’re excited to be part of the team of maintainers and contributors creating it.”

AI gateways let organizations integrate AI functionality into workflows and applications. They route requests to multiple AI service providers and models through a single reverse proxy layer, the gateway. Developers interact with one unified API rather than with each provider separately, and the gateway can add functionality such as rate limiting, caching, and observability.

The initial idea for this project arose when Dan Sun, Engineering Team Lead for Bloomberg’s Cloud Native Compute Services – AI Inference team and co-founder/maintainer of the KServe project, came to the Envoy community and outlined his views of the problem space and a potential path forward for solving it. Tetrate, a major upstream contributor to the Envoy project, stepped forward to express interest in helping Sun and Bloomberg turn their vision for the Envoy AI Gateway API into reality.

Envoy Gateway and KServe can be used together to route traffic to both self-hosted and vendor-hosted LLMs. In this setup, the AI gateway sits on top: it routes traffic for open source LLMs to self-hosted endpoints served by KServe, and traffic for vendor-hosted models to AWS Bedrock or similar cloud-based services.
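As a rough illustration of that routing split, the sketch below maps a requested model name to either a self-hosted or a vendor-hosted backend. It is not the Envoy AI Gateway API: the `Backend` type, the `ROUTES` table, the model names, and the URLs are all hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backend:
    name: str
    kind: str  # "self-hosted" or "vendor-hosted"
    url: str   # placeholder endpoint, not a real service address


# Hypothetical routing table: model name -> backend that serves it.
ROUTES = {
    "open-model-a": Backend(
        "kserve", "self-hosted", "http://kserve.example.internal/v1"
    ),
    "vendor-model-b": Backend(
        "cloud-llm", "vendor-hosted", "https://llm.example.com/v1"
    ),
}


def select_backend(model: str) -> Backend:
    """Return the backend configured for the requested model."""
    try:
        return ROUTES[model]
    except KeyError:
        raise ValueError(f"no backend configured for model {model!r}") from None


if __name__ == "__main__":
    for model in ("open-model-a", "vendor-model-b"):
        backend = select_backend(model)
        print(f"{model} -> {backend.name} ({backend.kind})")
```

Callers name only a model; where that model runs is decided in one place, which is the property the gateway layer is meant to provide.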

The first features to be included in Envoy AI Gateway will provide:

  • application traffic management to LLM providers with high-availability routing strategies;
  • LLM usage monitoring and control at the application, organization, and enterprise levels, to help users manage costs; and
  • a unified interface for LLM requests through which the gateway handles back-end connectivity to various LLM providers.

The open source Envoy Gateway extensions and enhancements will offer usage control for applications that are integrated with multiple LLM providers and models, robust authorization mechanisms, and intelligent fallback options to ensure continued operation even when cloud providers are unavailable or too expensive.
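One way to picture usage control combined with fallback is the minimal sketch below. It assumes a per-application budget counted in abstract integer cost units and tries providers in priority order, skipping any that are unavailable or would exceed the remaining budget. The `Gateway`, `Provider`, and `ProviderUnavailable` names are invented for this example and do not come from the project.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class ProviderUnavailable(Exception):
    """Raised by a provider call when the provider cannot serve the request."""


@dataclass
class Provider:
    name: str
    cost_per_request: int        # abstract cost units (an assumption)
    call: Callable[[str], str]   # stand-in for a real provider client


@dataclass
class Gateway:
    providers: List[Provider]    # ordered by preference
    budget_per_app: int          # maximum cost units each application may spend
    spent: Dict[str, int] = field(default_factory=dict)

    def complete(self, app: str, prompt: str) -> str:
        """Send the prompt to the first provider that is affordable and available."""
        for provider in self.providers:
            used = self.spent.get(app, 0)
            if used + provider.cost_per_request > self.budget_per_app:
                continue  # too expensive for this app's remaining budget
            try:
                reply = provider.call(prompt)
            except ProviderUnavailable:
                continue  # provider is down, fall back to the next one
            self.spent[app] = used + provider.cost_per_request
            return reply
        raise RuntimeError(f"no provider available within budget for {app!r}")


if __name__ == "__main__":
    def primary(prompt: str) -> str:
        raise ProviderUnavailable("primary is down")

    def backup(prompt: str) -> str:
        return f"backup reply to: {prompt}"

    gateway = Gateway(
        providers=[Provider("primary", 5, primary), Provider("backup", 1, backup)],
        budget_per_app=10,
    )
    print(gateway.complete("app-a", "hello"))
    print(gateway.spent)
```

Only successful calls are charged against the budget here; a real gateway would more likely meter by tokens and enforce limits at several levels, as the feature list describes.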

“The Envoy project continues to impress with its flexibility to support new and valuable use cases,” said Chris Aniszczyk, CTO of the CNCF. “Bloomberg and Tetrate have done exactly what our community is designed to do: bring people and organizations together to solve a common problem. That they’re doing it with Envoy Gateway only validates the power and extensibility of the project.”

Join the “Enabling AI Adoption at Scale – The AI Platform with Envoy AI Gateway” webinar, an online panel discussion with Bloomberg and Tetrate engineers, hosted by the CNCF on Thursday, October 17, 2024, at 1 PM EDT. Erica Hughberg of Tetrate will be joined by Dan Sun and Yuzhui Liu of Bloomberg and other contributors from Tetrate and the CNCF community for a conversation about the project and ways for others to get involved and contribute.
