Health Check
An endpoint or probe that reports whether a service instance is functioning correctly, used by load balancers, orchestrators, and monitoring systems to detect and route around unhealthy instances.
Health checks are the foundation of automated reliability. Load balancers use them to remove unhealthy instances from the traffic pool. Kubernetes uses liveness probes to restart crashed containers and readiness probes to control when instances receive traffic. Monitoring systems use health checks to trigger alerts and incident response.
A basic health check returns a 200 status code if the service is running. More sophisticated health checks verify that critical dependencies are accessible (database connections, external APIs, model files), that the service can perform its core function (execute a test query), and that resource utilization is within acceptable bounds (memory, disk, GPU).
For AI serving systems, health checks should verify that models are loaded and inference is functional. A model serving health check might execute a test prediction with known input and verify the output matches expectations. This catches issues like corrupted model files, GPU memory errors, or dependency failures that would cause the instance to return errors despite being technically running.
Related Terms
A/B Testing
A controlled experiment comparing two or more variants to determine which performs better on a defined metric, using statistical methods to ensure reliable results.
Feature Flag
A software mechanism that enables or disables features at runtime without deploying new code, used for gradual rollouts, A/B testing, and targeting specific user segments.
MLOps
The set of practices combining machine learning, DevOps, and data engineering to reliably deploy, monitor, and maintain ML models in production.
Model Serving
The infrastructure and systems that host trained ML models and handle inference requests in production, optimizing for latency, throughput, and cost.
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords, typically powered by embedding-based similarity comparison.
CI/CD (Continuous Integration / Continuous Deployment)
An automated software practice where code changes are continuously integrated into a shared repository, tested, and deployed to production, reducing manual intervention and accelerating delivery cycles.