Kubernetes runs containerized applications on a cluster of machines and keeps them in the state you describe. It does this by placing work on the right machines, routing traffic to the right places, and watching for failures and changes.
The basic flow
1. You describe what you want to run
Most Kubernetes workloads start as a declared “desired state” (what should be running, how many copies, and how they should be exposed). Kubernetes is built around declarative configuration and automation.
2. Kubernetes decides where it should run
Kubernetes schedules containers onto machines in the cluster based on available compute resources and what each container needs. Containers run inside Pods; a Pod is the unit Kubernetes places on a machine.
3. Kubernetes keeps checking reality vs. your desired state
Controllers watch the cluster and work to move the current state closer to the desired state, using the API server to make changes.
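The steps above can be made concrete with a manifest. A minimal sketch of a Deployment expressing desired state (the name `web`, the label, and the image `nginx:1.27` are illustrative, not from the original text):

```yaml
# Declarative desired state: "run 3 copies of this container image".
# Name, labels, and image are illustrative examples.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3            # how many Pods should be running
  selector:
    matchLabels:
      app: web
  template:              # the Pod template each replica is created from
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27
```

Applying this manifest states the target; the Deployment controller then creates, replaces, or removes Pods until reality matches it.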
Container scheduling and day-to-day management
Scheduling is the “where should this run?” decision.
1. Pods are scheduled, not individual containers
Kubernetes groups containers into Pods and then places those Pods on machines.
2. The scheduler assigns Pods to a suitable node
The kube-scheduler looks for Pods that aren’t assigned yet and selects a node for them.
3. Node agents keep the Pods running
On each node, the kubelet ensures that the Pods assigned to that node, and the containers inside them, are actually running.
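The scheduler's "where should this run?" decision is driven by resource requests declared in the Pod spec. A sketch, with illustrative values:

```yaml
# The scheduler will only place this Pod on a node that has at least
# the requested CPU and memory available (all values are illustrative).
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: nginx:1.27
    resources:
      requests:          # what the scheduler uses for placement
        cpu: "250m"      # a quarter of a CPU core
        memory: "128Mi"
      limits:            # hard caps enforced at runtime
        cpu: "500m"
        memory: "256Mi"
```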
Load balancing and service discovery
Containers and Pods can be created, moved, or replaced, so applications need stable ways to find each other.
Service discovery and load balancing are built-in behaviors
Kubernetes provides service discovery and load balancing out of the box, so traffic keeps reaching the right back ends even as individual Pods come and go.
Services provide a stable address for a changing set of Pods
The Service API provides a stable IP address or host name for a service backed by one or more Pods, and Kubernetes tracks the backing Pods through EndpointSlice objects.
Traffic routing updates as Pods change
When Pods behind a service change, the service routing adapts so traffic continues to reach current back ends.
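A Service ties this together with a label selector rather than a list of specific Pods. A sketch (names and ports are illustrative):

```yaml
# The Service gets a stable virtual IP and DNS name. Kubernetes keeps
# the backing EndpointSlices in sync with whichever Pods currently
# match the selector, so routing adapts as Pods change.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web          # matches Pod labels, not specific Pods
  ports:
  - port: 80          # port the Service exposes
    targetPort: 8080  # port the container listens on
```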
Scaling applications (and why “desired state” matters)
Kubernetes scales workloads toward the state you declare, and it can also scale automatically based on compute utilization (for example, CPU usage).
Common scaling ideas include:
More replicas (more Pods) to handle higher demand.
Fewer replicas when demand drops.
Resource tracking so placement decisions reflect CPU and memory needs.
This ties back to the “desired state” model: you specify the target, and controllers keep working toward it.
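Automatic scaling based on compute utilization is itself expressed as desired state. A sketch of a HorizontalPodAutoscaler targeting a hypothetical `web` Deployment (all numbers are illustrative):

```yaml
# Adjust the Deployment's replica count, within bounds, to keep
# average CPU utilization near the target (values are illustrative).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # scale up above ~70% average CPU
```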
Self-healing: What happens when something breaks
Kubernetes includes self-healing behaviors that aim to maintain workload health and availability. These include:
Restarting containers that fail.
Replacing failed Pods to maintain the requested number of replicas.
Rescheduling workloads when nodes become unavailable.
Removing unhealthy Pods from Service endpoints so traffic reaches only healthy back ends.
These behaviors rely on health checks: Kubernetes probes container health and restarts or replaces containers when problems are detected.
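The health checks behind these behaviors are declared as probes on the container. A sketch, with illustrative paths and ports:

```yaml
# livenessProbe: if the check keeps failing, the kubelet restarts
#   the container.
# readinessProbe: while the check fails, the Pod is removed from
#   Service endpoints so it receives no traffic.
# Paths, ports, and timings are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
  - name: web
    image: nginx:1.27
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
      periodSeconds: 10
      failureThreshold: 3   # restart after 3 consecutive failures
    readinessProbe:
      httpGet:
        path: /ready
        port: 8080
      periodSeconds: 5
```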
The role of Kubernetes KPIs
Key performance indicators (KPIs, or metrics) are used to understand cluster health and workload behavior.
Where KPIs come from
Kubernetes system components emit metrics in the Prometheus exposition format, which makes them easy to use in dashboards and alerts.
Metrics are typically available on a component’s /metrics HTTP endpoint, including components such as kube-apiserver, kube-scheduler, kubelet, kube-proxy, and kube-controller-manager.
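One common way to collect these /metrics endpoints is a Prometheus scrape configuration. A minimal sketch for scraping the kube-apiserver, assuming Prometheus runs in-cluster with a service account (job name and relabeling details are illustrative):

```yaml
# Prometheus scrape-config sketch for the apiserver's /metrics
# endpoint; assumes in-cluster service-account credentials.
scrape_configs:
- job_name: kubernetes-apiservers        # illustrative job name
  scheme: https
  kubernetes_sd_configs:
  - role: endpoints                      # discover endpoints via the API
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_name]
    regex: kubernetes                    # keep only the apiserver service
    action: keep
```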
Examples of what KPIs help you spot
Cluster health signals (component-level metrics and error patterns)
Workload stability (for example, frequent restarts or replacements)
Capacity pressure (resource allocation vs. demand, tied to scaling decisions)
Why this matters in day-to-day operations
Monitoring gives teams a fuller view of cluster resources, the Kubernetes API, containers, and logs, which shortens the feedback loop between detecting an issue and fixing it.