With the increased adoption of cloud-native technologies, containers and Kubernetes have become the backbone of modern application deployments. Microservices-based container workloads are easier to scale, more portable, and resource-efficient. With Kubernetes managing these workloads, organizations can deploy advanced AI and machine learning applications across diverse compute resources, significantly improving operational productivity at scale. With this evolution of application architecture comes a strong need for built-in granular security controls and deep observability. However, the ephemeral nature of containers makes this challenging. That’s where Azure Advanced Container Networking Services comes in.
We’re excited to announce the general availability of Advanced Container Networking Services for Azure Kubernetes Service (AKS), a cloud-native, purpose-built solution to enhance security and observability for Kubernetes and containerized environments. Advanced Container Networking Services focuses on delivering a seamless and integrated experience that allows you to maintain robust security postures and gain deep insights into your network traffic and application performance. This ensures that your containerized applications are not only secure but also meet your performance and reliability goals, allowing you to confidently manage and scale your infrastructure.
Let’s take a look at the container network security and observability features of this release.
Container Network Observability
While Kubernetes excels at orchestrating and managing containerized workloads, one critical challenge remains: how do we gain meaningful visibility into how these services interact? Observing the network traffic of microservices, monitoring performance, and understanding dependencies between components are essential for ensuring both reliability and security. Without this level of insight, performance issues, outages, and even potential security risks can go undetected.
To truly understand how well your microservices are functioning, you need more than just basic cluster-level metrics and virtual network logs. Comprehensive network observability requires granular network metrics, including node-level, pod-level, and Domain Name System (DNS)-level insights. These metrics allow teams to identify bottlenecks, troubleshoot issues, and monitor the health of each service in the cluster.
To address these challenges, Advanced Container Networking Services delivers powerful observability features tailored specifically for Kubernetes and containerized environments. Advanced Container Networking Services provides real-time and detailed insights across node-level, pod-level, and both Transmission Control Protocol (TCP) and DNS-level metrics, ensuring that no aspect of your network goes unnoticed. These metrics are crucial in identifying performance bottlenecks and resolving network issues before they impact the workloads.
Advanced Container Networking Services network observability features include:
- Node-level metrics: These metrics provide insights into traffic volume, dropped packets, number of connections, etc. by node. The metrics are stored in Prometheus format and can be viewed in Grafana.
- Hubble metrics, DNS, and pod-level metrics: Advanced Container Networking Services uses Hubble to collect metrics that include Kubernetes context, such as source and destination pod name and namespace information, allowing network-related issues to be pinpointed at a more granular level. Metrics cover traffic volume, dropped packets, TCP resets, L4/L7 packet flows, and more. There are also DNS metrics, covering DNS errors and unanswered DNS requests.
- Hubble flow logs: Flow logs provide visibility into workload communication, helping you understand how microservices communicate with one another. Flow logs also help answer questions such as: Did the server receive the client’s request? What is the round-trip latency between the client’s request and the server’s response?
- Service dependency map: Hubble UI visualizes this traffic flow. It creates a service-connection graph based on flow logs and displays flow logs for the selected namespace.
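To make the link between flow logs and a service-connection graph concrete, here is a minimal sketch in Python. It assumes a simplified flow record with only `source`, `destination`, and `verdict` fields; the sample pods and namespaces are hypothetical, and real Hubble flows carry many more fields than are modeled here.

```python
from collections import Counter

# Hypothetical, simplified flow records for illustration only. Real Hubble
# flows carry many more fields; this sketch models just the ones it uses.
FLOWS = [
    {"source": {"namespace": "shop", "pod_name": "frontend-1"},
     "destination": {"namespace": "shop", "pod_name": "cart-1"},
     "verdict": "FORWARDED"},
    {"source": {"namespace": "shop", "pod_name": "frontend-1"},
     "destination": {"namespace": "shop", "pod_name": "cart-1"},
     "verdict": "FORWARDED"},
    {"source": {"namespace": "shop", "pod_name": "frontend-1"},
     "destination": {"namespace": "shop", "pod_name": "payments-1"},
     "verdict": "DROPPED"},
    {"source": {"namespace": "shop", "pod_name": "cart-1"},
     "destination": {"namespace": "data", "pod_name": "db-0"},
     "verdict": "FORWARDED"},
]


def endpoint_name(endpoint):
    """Render an endpoint as 'namespace/pod'."""
    return f"{endpoint['namespace']}/{endpoint['pod_name']}"


def build_service_graph(flows):
    """Count flows per (source, destination, verdict) edge."""
    edges = Counter()
    for flow in flows:
        key = (endpoint_name(flow["source"]),
               endpoint_name(flow["destination"]),
               flow["verdict"])
        edges[key] += 1
    return edges


def dropped_edges(edges):
    """Return sorted (source, destination) pairs that saw dropped flows."""
    return sorted({(src, dst) for (src, dst, verdict) in edges
                   if verdict == "DROPPED"})


if __name__ == "__main__":
    graph = build_service_graph(FLOWS)
    for (src, dst, verdict), count in sorted(graph.items()):
        print(f"{src} -> {dst} [{verdict}] x{count}")
```

Grouping by verdict as well as by endpoint pair keeps dropped traffic visible as its own edge, which is the kind of signal that helps narrow down a connectivity problem to a specific pair of workloads.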
Container Network Security
One of the key challenges with container security stems from the fact that Kubernetes, by default, allows all communication between endpoints, which introduces high security risks. Advanced Container Networking Services with Azure CNI powered by Cilium enables advanced, fine-grained network policies that use Kubernetes identities to allow only permitted traffic and secure endpoints.
While traditional network policies rely on IP-based rules for external traffic control, external services frequently change their IP addresses. This makes it difficult to enforce consistent security for workloads communicating beyond the cluster. With Advanced Container Networking Services’ fully qualified domain name (FQDN) filtering and security agent DNS proxy, network policies can be insulated from IP address changes.
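As an illustration of what an FQDN-based rule can look like, the following Python sketch builds a CiliumNetworkPolicy manifest with a `toFQDNs` egress rule and prints it as JSON. The policy name, namespace, pod label, and domain are hypothetical, and the DNS rule assumes the cluster DNS pods run in `kube-system` with the `k8s-app: kube-dns` label; check your own cluster before relying on it.

```python
import json


def fqdn_egress_policy(name, namespace, pod_labels, fqdns):
    """Build a CiliumNetworkPolicy manifest that allows egress only to `fqdns`."""
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods the policy applies to (labels are hypothetical).
            "endpointSelector": {"matchLabels": dict(pod_labels)},
            "egress": [
                {
                    # Allow DNS lookups to the cluster DNS service so that
                    # name resolutions can be observed (assumed labels).
                    "toEndpoints": [{"matchLabels": {
                        "k8s:io.kubernetes.pod.namespace": "kube-system",
                        "k8s:k8s-app": "kube-dns",
                    }}],
                    "toPorts": [{
                        "ports": [{"port": "53", "protocol": "ANY"}],
                        "rules": {"dns": [{"matchPattern": "*"}]},
                    }],
                },
                {
                    # Allow traffic by domain name rather than by IP address.
                    "toFQDNs": [{"matchName": fqdn} for fqdn in fqdns],
                },
            ],
        },
    }


if __name__ == "__main__":
    policy = fqdn_egress_policy(
        "allow-example-api", "demo", {"app": "checkout"}, ["api.example.com"]
    )
    print(json.dumps(policy, indent=2))
```

Because the rule names a domain instead of an address, the manifest itself does not need to change when the external service moves to a new IP.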
In the following section, we’ll dig deeper into how FQDN filtering can transform the way you secure Kubernetes networking.
FQDN filtering and security agent DNS proxy
The solution consists of two main components: the Cilium Agent and the security agent DNS proxy. Combined, they seamlessly integrate FQDN filtering into Kubernetes clusters, allowing for more efficient and manageable control over external communications.
Cilium Agent
The Cilium Agent is a critical networking component that runs as a DaemonSet within clusters using Azure CNI powered by Cilium. The agent handles networking, load balancing, and network policies for pods in the cluster. For pods with enforced FQDN policies, the Cilium Agent redirects packets to the DNS Proxy for name resolution and updates the network policy using the FQDN:IP mappings obtained from the DNS Proxy.
Security Agent DNS Proxy
The DNS proxy that is part of the security agent runs as a DaemonSet in clusters that use Azure CNI powered by Cilium with Advanced Container Networking Services enabled. It handles DNS resolution for pods and, on successful DNS resolution, updates the Cilium Agent with FQDN-to-IP mappings.
Running the security agent DNS proxy in a separate DaemonSet (acns-security-agent) alongside the Cilium Agent ensures that pods continue to have DNS resolution even if the Cilium Agent is down or undergoing an upgrade. With Kubernetes’ maxSurge upgrade feature, the DNS proxy remains operational during upgrades. This design guarantees that network connectivity for essential customer workloads is not disrupted due to DNS resolution issues.
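The interaction between the two components can be sketched with a toy model in Python. This is not how the Cilium Agent or the security agent is implemented; the class names, the in-memory `upstream` resolver, and the sample addresses are all hypothetical, and details such as DNS TTL handling are left out.

```python
class ToyPolicyAgent:
    """Stands in for the policy side: tracks which IPs an FQDN policy allows."""

    def __init__(self, allowed_fqdns):
        self.allowed_fqdns = set(allowed_fqdns)
        self.fqdn_ips = {}  # fqdn -> set of IPs from the latest resolution

    def update_mapping(self, fqdn, ips):
        # Only names covered by the policy contribute allowed IPs.
        if fqdn in self.allowed_fqdns:
            self.fqdn_ips[fqdn] = set(ips)

    def egress_allowed(self, ip):
        return any(ip in ips for ips in self.fqdn_ips.values())


class ToyDnsProxy:
    """Stands in for the DNS proxy: resolves names and reports the mappings."""

    def __init__(self, upstream, agent):
        self.upstream = upstream  # hypothetical resolver: fqdn -> list of IPs
        self.agent = agent

    def resolve(self, fqdn):
        ips = list(self.upstream.get(fqdn, []))
        if ips:
            # On successful resolution, pass the FQDN-to-IP mapping along.
            self.agent.update_mapping(fqdn, ips)
        return ips


if __name__ == "__main__":
    upstream = {"api.example.com": ["203.0.113.10"]}
    agent = ToyPolicyAgent(["api.example.com"])
    proxy = ToyDnsProxy(upstream, agent)

    proxy.resolve("api.example.com")
    print(agent.egress_allowed("203.0.113.10"))

    # The external service moves to a new address; the policy is unchanged.
    upstream["api.example.com"] = ["203.0.113.20"]
    proxy.resolve("api.example.com")
    print(agent.egress_allowed("203.0.113.20"))
```

The toy replaces the stored addresses on every resolution to keep the example short. A real system would more likely age entries out based on DNS TTLs and existing connections.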
Customer adoption and scenarios
Advanced Container Networking Services was deployed by many internal and external customers even during its preview for the following use cases:
- Troubleshooting application degradation and DNS resolution timeouts using DNS errors and metrics.
- Debugging intermittent loss of connectivity from applications and pods to other pods or external endpoints. Pod metrics show cluster admins dropped packet counts, TCP errors, and retransmissions to help debug connectivity issues faster.
- Using flow logs to debug network connectivity issues.
- Securing clusters and making policies more resilient to IP address changes. Setting Cilium network policies using FQDNs instead of IP addresses greatly simplifies policy management.
“At H&M Group, platform engineering is a core practice, supported by our cloud-native internal developer platform, which enables autonomous product teams to build and host microservices. Deep network observability and robust security are key to our success, and the Advanced Container Networking Service features help us achieve this. Real-time flow logs accelerate our ability to troubleshoot connectivity issues, while FQDN filtering ensures secure communication with trusted external domains.” — Magnus Welson, Engineering manager, container platform, H&M Group
“The advanced observability offered by Advanced Container Networking Services helped us tremendously when we were investigating a high-impact problem in one of Japan Tobacco International AKS clusters. With the insights provided by Advanced Container Networking Services we were able to pinpoint the issue to DNS performance and then confirm that the remediation we applied was successful” — Andrew Wytyczak-Partyka, CEO CodeWave, Alexandru Popovici, DevOps & Security Manager, JT International
“At Ferrovial, on our corporate Kubernetes platform (called Kubecore), we use the Advanced Container Networking Service to debug connectivity issues in our applications, using real-time network flow tools, bringing us full details. Additionally, DNS errors and metrics available at the workload level give us deep network visibility to troubleshoot application degradation faster.” — Victor Fernandez, Senior Cloud Architect, Ferrovial
Conclusion
As you continue your journey in the cloud-native space, the importance of integrating security and observability into every layer of your infrastructure cannot be overstated. With the right tools in place, you can move faster, innovate more, and do so with confidence that your workloads are both visible and protected.
Learn more about Advanced Container Networking Services in Azure
- Read more in the Advanced Container Networking Services documentation and try it out on your clusters today.
- Enable Container Network Observability with Prometheus and Grafana.
- Enable FQDN filtering with HA DNS Proxy.
- Learn more about Azure Kubernetes Service.
- Discover more about Azure CNI powered by Cilium.
We would love to hear from you! Please take a minute and give us some feedback.