Service Mesh
A dedicated infrastructure layer that handles service-to-service communication in a microservices architecture, providing traffic management, security, and observability without changing application code.
In its most common design, a service mesh deploys a lightweight proxy (a sidecar) alongside each service instance. All inter-service traffic flows through these proxies, which handle cross-cutting concerns transparently: mutual TLS encryption, load balancing, circuit breaking, retries, timeouts, and distributed tracing. The application code makes plain HTTP or gRPC calls; the mesh handles the operational complexity.
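As a rough illustration of what a proxy does on the application's behalf, the Python sketch below wraps an outbound call with bounded retries and a simple consecutive-failure circuit breaker. The class name SidecarSketch, its parameters, and the thresholds are hypothetical and chosen only for illustration; real mesh proxies apply this logic at the network level and are configured through the mesh, not in application code.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and calls are rejected without being sent."""


class SidecarSketch:
    """Hypothetical stand-in for a sidecar proxy wrapping outbound calls."""

    def __init__(self, max_retries=2, failure_threshold=3,
                 reset_after=30.0, clock=time.monotonic):
        self.max_retries = max_retries              # extra attempts per request
        self.failure_threshold = failure_threshold  # consecutive failures that open the breaker
        self.reset_after = reset_after              # seconds to wait before trying again
        self.clock = clock                          # injectable so tests can control time
        self.consecutive_failures = 0
        self.opened_at = None                       # None means the breaker is closed

    def call(self, upstream, *args):
        # While the breaker is open, fail fast instead of hitting the upstream.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("upstream temporarily ejected")
            # Cool-down elapsed: close the breaker and allow traffic again.
            self.opened_at = None
            self.consecutive_failures = 0

        last_error = None
        for _ in range(1 + self.max_retries):
            try:
                result = upstream(*args)
            except ConnectionError as exc:
                last_error = exc
                self.consecutive_failures += 1
                if self.consecutive_failures >= self.failure_threshold:
                    # Too many failures in a row: open the breaker and stop retrying.
                    self.opened_at = self.clock()
                    break
            else:
                self.consecutive_failures = 0
                return result
        raise last_error
```

The caller only sees proxy.call(some_function), which mirrors the point above: the retry and circuit-breaking policy lives outside the business logic.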
Popular service mesh implementations include Istio (feature-rich, Kubernetes-native), Linkerd (lightweight, simpler), and Consul Connect (HashiCorp ecosystem). Each provides a control plane for configuring traffic policies and a data plane of sidecar proxies that enforce those policies at runtime.
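The control plane and data plane split can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the API of Istio, Linkerd, or Consul: the names TrafficPolicy, Proxy, and ControlPlane are hypothetical, and a real control plane distributes configuration over the network rather than by direct assignment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrafficPolicy:
    """Hypothetical policy an operator declares once, centrally."""
    timeout_seconds: float
    max_retries: int


@dataclass
class Proxy:
    """Data plane: one proxy per service instance, enforcing the current policy."""
    service: str
    policy: Optional[TrafficPolicy] = None


class ControlPlane:
    """Control plane: stores policies and pushes them to every matching proxy."""

    def __init__(self):
        self.proxies = []
        self.policies = {}

    def register(self, proxy):
        # A newly started proxy receives whatever policy already exists for its service.
        self.proxies.append(proxy)
        proxy.policy = self.policies.get(proxy.service, proxy.policy)

    def set_policy(self, service, policy):
        # Changing a policy updates all running proxies for that service.
        self.policies[service] = policy
        for proxy in self.proxies:
            if proxy.service == service:
                proxy.policy = policy
```

In this model the operator calls set_policy once per service, and every proxy instance, including ones registered later, ends up with the same rules without any application redeploy.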
For AI microservices architectures, a service mesh moves several operational concerns out of application code: traffic splitting for canary deployments of new models, mutual TLS between services without certificate management code, automatic retries with circuit breaking for flaky model APIs, and detailed latency metrics between services. The trade-off is added resource overhead, since every proxy consumes CPU and memory and adds a hop to each request, plus the operational complexity of managing the mesh itself.
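Traffic splitting for a canary model rollout amounts to weighted random routing. The Python sketch below shows the idea under simple assumptions: the function name pick_version and the version labels are hypothetical, and a real mesh expresses the same weights declaratively in its routing configuration rather than in code.

```python
import random


def pick_version(weights, rng=random):
    """Choose a backend version in proportion to its weight.

    `weights` maps a version label to a non-negative number, for example
    {"model-v1": 90, "model-v2-canary": 10}. `rng` can be a seeded
    random.Random instance to make the choice reproducible.
    """
    if not weights or sum(weights.values()) <= 0:
        raise ValueError("at least one version needs a positive weight")
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


def route_many(weights, n, seed=0):
    """Route n simulated requests and count how many each version received."""
    rng = random.Random(seed)
    counts = {version: 0 for version in weights}
    for _ in range(n):
        counts[pick_version(weights, rng)] += 1
    return counts
```

Calling route_many({"model-v1": 90, "model-v2-canary": 10}, 10_000) sends roughly a tenth of the simulated requests to the canary; raising the canary weight step by step is how a gradual rollout proceeds, and setting it to 0 rolls the change back.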
Related Terms
A/B Testing
A controlled experiment comparing two or more variants to determine which performs better on a defined metric, using statistical methods to ensure reliable results.
Feature Flag
A software mechanism that enables or disables features at runtime without deploying new code, used for gradual rollouts, A/B testing, and targeting specific user segments.
MLOps
The set of practices combining machine learning, DevOps, and data engineering to reliably deploy, monitor, and maintain ML models in production.
Model Serving
The infrastructure and systems that host trained ML models and handle inference requests in production, optimizing for latency, throughput, and cost.
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords, typically powered by embedding-based similarity comparison.
CI/CD (Continuous Integration / Continuous Deployment)
An automated software practice where code changes are continuously integrated into a shared repository, tested, and deployed to production, reducing manual intervention and accelerating delivery cycles.