
Kubernetes

An open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications across clusters of machines.

Kubernetes (often abbreviated K8s) provides a declarative way to run and manage containerized workloads. You describe the desired state of your application (for example, the number of replicas, resource limits, and networking rules), and Kubernetes continuously works to maintain that state by restarting crashed containers, adjusting the number of replicas, and distributing workloads across nodes.
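The idea of continuously converging on a desired state can be illustrated with a toy reconciliation loop. This is a minimal sketch only: `ToyCluster`, its fields, and the `pod-N` names are hypothetical and exist purely for illustration. It does not call any Kubernetes API, and real controllers are considerably more involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Hypothetical in-memory stand-in for a cluster; not a Kubernetes API."""
    desired_replicas: int
    running: list = field(default_factory=list)
    _next_id: int = 0

    def reconcile(self) -> None:
        """Bring the observed state in line with the desired state."""
        # Start replacements while there are too few replicas.
        while len(self.running) < self.desired_replicas:
            self.running.append(f"pod-{self._next_id}")
            self._next_id += 1
        # Stop surplus replicas while there are too many.
        while len(self.running) > self.desired_replicas:
            self.running.pop()


cluster = ToyCluster(desired_replicas=3)
cluster.reconcile()               # initial rollout: three replicas start
cluster.running.remove("pod-1")   # simulate a crashed container
cluster.reconcile()               # self-healing: a replacement is started
cluster.desired_replicas = 2      # the user edits the desired state
cluster.reconcile()               # scale down to match
print(cluster.running)
```

The caller only ever changes `desired_replicas` or observes a failure; the loop decides which actions close the gap. That split between declaring the state and acting on it is the pattern the paragraph above describes.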

Core Kubernetes concepts include Pods (the smallest deployable unit), Deployments (which manage ReplicaSets to roll out and scale Pods), Services (stable networking endpoints), ConfigMaps and Secrets (configuration and sensitive data), and Ingress (external traffic routing). The platform handles service discovery, load balancing, rolling updates, and self-healing out of the box.
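To show how a Deployment and a Service relate, the sketch below builds both objects as Python dictionaries and serializes them to JSON, which Kubernetes accepts as an alternative to YAML. The name `web`, the `app: web` label, the `nginx:1.27` image, and the resource figures are illustrative assumptions, not values from any particular setup.

```python
import json

# Hypothetical label that ties the Service to the Deployment's Pods.
labels = {"app": "web"}

deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,  # desired number of Pods
        "selector": {"matchLabels": labels},
        "template": {
            "metadata": {"labels": labels},
            "spec": {
                "containers": [{
                    "name": "web",
                    "image": "nginx:1.27",  # assumed example image
                    "ports": [{"containerPort": 80}],
                    "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}},
                }]
            },
        },
    },
}

service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web"},
    "spec": {
        # Traffic is routed to any Pod carrying these labels.
        "selector": labels,
        "ports": [{"port": 80, "targetPort": 80}],
    },
}

# Wrap both objects in a List so they can be applied together.
manifest = json.dumps(
    {"apiVersion": "v1", "kind": "List", "items": [deployment, service]},
    indent=2,
)
print(manifest)
```

Assuming the script is saved as `manifest.py` (a hypothetical filename), its output could be piped to `kubectl apply -f -`. The Service finds its Pods through the shared label selector rather than through fixed addresses, which is what keeps the endpoint stable as Pods are replaced.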

While Kubernetes adds operational complexity, it is well suited to teams running diverse workloads at scale. AI teams benefit from its ability to manage heterogeneous infrastructure, scheduling GPU workloads for model training alongside CPU workloads for API serving. Managed Kubernetes services such as Amazon EKS, Google Kubernetes Engine (GKE), and Azure Kubernetes Service (AKS) reduce the operational burden while retaining the platform's flexibility.
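As a rough illustration of mixing GPU and CPU workloads, the sketch below generates two Pod manifests that differ only in their resource limits. It assumes the cluster exposes GPUs under the `nvidia.com/gpu` resource name, which requires the NVIDIA device plugin to be installed. The helper `pod_manifest`, the Pod names, and the image references are hypothetical.

```python
import json


def pod_manifest(name, image, cpu, memory, gpus=0):
    """Build a minimal Pod manifest (hypothetical helper for illustration)."""
    limits = {"cpu": cpu, "memory": memory}
    if gpus:
        # Extended resource name; assumes the NVIDIA device plugin is installed.
        limits["nvidia.com/gpu"] = gpus
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                "resources": {"limits": limits},
            }]
        },
    }


# Image names below are placeholders, not real registries.
trainer = pod_manifest("trainer", "example.com/train:latest", "4", "16Gi", gpus=1)
api = pod_manifest("api", "example.com/serve:latest", "500m", "512Mi")

print(json.dumps([trainer, api], indent=2))
```

With manifests like these, the scheduler would place the `trainer` Pod only on a node that advertises a free GPU, while the `api` Pod can run on any node with enough CPU and memory.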

Related Terms