Rolling Deployment
A deployment strategy that gradually replaces instances of the previous application version with the new version, maintaining availability by ensuring some instances are always running throughout the process.
Rolling deployments update application instances incrementally. If you have 10 instances running version 1, a rolling deployment might update 2 at a time: take 2 instances out of the load balancer, update them to version 2, verify they are healthy, add them back, then repeat for the next batch. Because only one batch is out of service at any moment, most of the fleet (8 of 10 instances in this example) keeps serving traffic throughout the deployment.
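The batch-by-batch process described above can be sketched in a few lines of Python. This is a minimal simulation, not how any particular orchestrator implements it: the function name `rolling_deploy`, the `is_healthy` callback, and the choice to revert a failed batch and halt are all assumptions made for illustration.

```python
from typing import Callable, List


def rolling_deploy(
    instances: List[str],
    new_version: str,
    batch_size: int,
    is_healthy: Callable[[int, str], bool],
) -> List[str]:
    """Update `instances` in place, one batch at a time.

    `instances[i]` holds the version string running on instance i.
    `is_healthy(i, version)` is a hypothetical health check supplied by the caller.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    for start in range(0, len(instances), batch_size):
        batch = range(start, min(start + batch_size, len(instances)))
        previous = {i: instances[i] for i in batch}

        # The batch is treated as out of the load balancer while it updates.
        for i in batch:
            instances[i] = new_version

        # Gate: every instance in the batch must pass before the rollout continues.
        if not all(is_healthy(i, new_version) for i in batch):
            # Assumed policy: restore the failed batch and stop the rollout.
            for i, old in previous.items():
                instances[i] = old
            raise RuntimeError(
                f"batch starting at instance {start} failed health checks; rollout halted"
            )

    return instances


fleet = ["v1"] * 10
rolling_deploy(fleet, "v2", batch_size=2, is_healthy=lambda i, version: True)
```

With a health check that always passes, all 10 instances end up on the new version after five batches. If a batch fails, the earlier batches stay on the new version and the remaining instances stay on the old one, which is why a halted rollout usually needs an explicit roll-forward or rollback decision.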
The key parameters are batch size (how many instances update simultaneously) and health check criteria (what conditions must be met before proceeding). Conservative settings (small batches, strict health checks) are slower but safer. Aggressive settings (large batches, minimal checks) are faster but riskier.
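The speed-versus-safety trade-off can be made concrete with a little arithmetic. The sketch below assumes a fixed-size fleet with no extra (surge) instances added during the rollout; `rollout_plan` is a hypothetical helper, not part of any platform's API.

```python
import math
from typing import Tuple


def rollout_plan(total_instances: int, batch_size: int) -> Tuple[int, float]:
    """Return (number of batches, lowest fraction of instances in service).

    Assumes one batch is out of service at a time and no surge capacity is added.
    """
    if not 1 <= batch_size <= total_instances:
        raise ValueError("batch_size must be between 1 and total_instances")
    batches = math.ceil(total_instances / batch_size)
    min_in_service = (total_instances - batch_size) / total_instances
    return batches, min_in_service


conservative = rollout_plan(10, 2)  # 5 batches, 80% of instances in service
aggressive = rollout_plan(10, 5)    # 2 batches, 50% of instances in service
```

Smaller batches mean more health-check gates and a smaller dip in capacity, at the cost of a longer rollout.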
Rolling deployments are the default strategy for Kubernetes Deployments (the RollingUpdate strategy) and are widely supported by other container orchestration platforms. For AI model deployments, rolling updates allow a gradual transition to a new model version while maintaining serving capacity. However, during the rollout window, both old and new model versions serve traffic simultaneously, so two identical requests can receive different responses depending on which instance handles them. This can be a concern if the versions produce meaningfully different outputs. Canary releases offer more explicit control over how much traffic reaches the new version, which helps when model version consistency matters.
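To see how long both versions coexist, the following sketch estimates the share of instances running the new version after each batch completes. It assumes the load balancer spreads traffic evenly across instances, so the instance share approximates the traffic share; `new_version_share` is a hypothetical helper introduced only for this illustration.

```python
from typing import List


def new_version_share(total_instances: int, batch_size: int) -> List[float]:
    """Fraction of instances on the new version after each completed batch."""
    if not 1 <= batch_size <= total_instances:
        raise ValueError("batch_size must be between 1 and total_instances")
    shares = []
    updated = 0
    while updated < total_instances:
        updated = min(updated + batch_size, total_instances)
        shares.append(updated / total_instances)
    return shares


shares = new_version_share(10, 2)
```

For 10 instances updated 2 at a time, the new version's share steps through 20%, 40%, 60%, and 80% before reaching 100%, so mixed-version serving lasts for every batch except the last.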
Related Terms
A/B Testing
A controlled experiment comparing two or more variants to determine which performs better on a defined metric, using statistical methods to ensure reliable results.
Feature Flag
A software mechanism that enables or disables features at runtime without deploying new code, used for gradual rollouts, A/B testing, and targeting specific user segments.
MLOps
The set of practices combining machine learning, DevOps, and data engineering to reliably deploy, monitor, and maintain ML models in production.
Model Serving
The infrastructure and systems that host trained ML models and handle inference requests in production, optimizing for latency, throughput, and cost.
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords, typically powered by embedding-based similarity comparison.
CI/CD (Continuous Integration / Continuous Deployment)
An automated software practice where code changes are continuously integrated into a shared repository, tested, and deployed to production, reducing manual intervention and accelerating delivery cycles.