Datadog Boosts Network Observability with eBPF, Cuts CPU Usage by 35%

Datadog has improved the performance and accuracy of its Cloud Network Monitoring (CNM) by adopting eBPF, removing limitations imposed by its legacy Netlink and API polling methods. The move cut CPU usage by 35% and made the attribution of network connections to containers and services more accurate.

Previously, CNM relied on Netlink and container runtime APIs to track NAT and container data, which led to data loss and performance bottlenecks, especially on hosts with high connection churn. Datadog addressed these problems with eBPF-based kprobes and process event streams, which track network and process data in real time.
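
To make the contrast with polling concrete, here is a minimal Python sketch of event-driven NAT tracking. It is not Datadog's implementation: it assumes a hypothetical stream of connection-tracking insert and delete events (of the kind an eBPF kprobe could deliver to user space), and the names `ConnTuple`, `NatTable`, `on_insert`, `on_delete`, and `resolve` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConnTuple:
    """Connection identity: addresses, ports, and protocol (hypothetical shape)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str = "tcp"


class NatTable:
    """User-space map from an original connection tuple to its NAT-translated tuple."""

    def __init__(self) -> None:
        self._translations: Dict[ConnTuple, ConnTuple] = {}

    def on_insert(self, original: ConnTuple, translated: ConnTuple) -> None:
        # Called once per insertion event; store only entries that NAT actually rewrote.
        if original != translated:
            self._translations[original] = translated

    def on_delete(self, original: ConnTuple) -> None:
        # Drop the entry when the connection is removed from tracking.
        self._translations.pop(original, None)

    def resolve(self, conn: ConnTuple) -> ConnTuple:
        # Return the translated tuple if one is known, else the tuple unchanged.
        return self._translations.get(conn, conn)
```

Because the table is updated as each event arrives, a lookup does not depend on when the last poll ran, which is the property that matters on hosts with high connection churn.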

By hooking into connection tracking table insertions and caching process-to-container relationships in user space, Datadog eliminated the need to throttle updates and reduced missed attributions. The change improved visibility into short-lived processes and made queries more reliable for end users.
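
The user-space cache described above can be sketched roughly as follows. This is an illustrative Python example under stated assumptions, not Datadog's code: it assumes a hypothetical process event stream that reports exec and exit events with a PID and, where applicable, a container ID. The class and method names are invented for the example.

```python
from typing import Dict, Optional


class ProcessContainerCache:
    """Maps live PIDs to container IDs, kept current by process events."""

    def __init__(self) -> None:
        self._pid_to_container: Dict[int, str] = {}

    def on_exec(self, pid: int, container_id: Optional[str]) -> None:
        # Record the mapping when a process starts; host processes have no container.
        if container_id is not None:
            self._pid_to_container[pid] = container_id
        else:
            self._pid_to_container.pop(pid, None)

    def on_exit(self, pid: int) -> None:
        # Evict on exit so a reused PID is not attributed to the old container.
        self._pid_to_container.pop(pid, None)

    def attribute(self, pid: int) -> Optional[str]:
        # Look up the container for a connection's owning PID without calling a runtime API.
        return self._pid_to_container.get(pid)


cache = ProcessContainerCache()
cache.on_exec(4242, "container-abc")      # short-lived process starts
owner = cache.attribute(4242)             # connection attributed while it is alive
cache.on_exit(4242)                       # process exits; entry is evicted
```

Attribution happens at the moment the connection is seen, so a process that exits before any periodic poll would have run can still be matched to its container. A production cache would likely need a grace period before eviction to handle connections reported after exit; that detail is omitted here.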

Real-time eBPF tracking did add a small amount of CPU overhead of its own, which Datadog deemed an acceptable tradeoff for the accuracy gained. The company plans to expand its use of eBPF, enabling deeper correlation of connection data and more advanced observability features in future releases.
