Spot Instances
Cloud compute instances available at steep discounts, typically 60-90% below on-demand pricing, in exchange for the risk that the cloud provider reclaims them on short notice when it needs the capacity back. Spot instances are ideal for fault-tolerant and flexible workloads.
Spot instances sell unused cloud capacity at variable, market-based prices. The trade-off is clear: large cost savings in exchange for accepting that instances can be interrupted with as little as two minutes' notice. Workloads that run on spot instances must be designed to handle interruptions gracefully through checkpointing, distributed processing, and automatic re-scheduling.
For AI product teams, spot instances are a game-changer for training workloads. Model training is inherently parallelizable and can checkpoint progress, making it a natural fit. A training job that costs $10,000 on-demand might cost $2,000 on spot instances, with only a modest increase in wall-clock time from occasional interruptions. Growth teams benefit indirectly: lower AI infrastructure costs make it feasible to run more experiments and train more model variants. Spot instances are not suitable for user-facing inference, because interruptions cause downtime, but they work well for batch inference jobs, data preprocessing pipelines, and model evaluation tasks where brief delays are acceptable.
Related Terms
Content Delivery Network
A geographically distributed network of proxy servers that caches and delivers content from locations closest to end users. CDNs reduce latency, improve load times, and absorb traffic spikes by serving content from edge nodes rather than a single origin server.
Edge Computing
A distributed computing paradigm that processes data closer to the source of generation rather than in a centralized data center. Edge computing reduces latency, conserves bandwidth, and enables real-time processing for latency-sensitive applications.
Serverless Computing
A cloud execution model where the provider dynamically manages server allocation and scaling. Developers deploy functions or containers without provisioning infrastructure, paying only for actual compute time consumed rather than reserved capacity.
Function as a Service
A serverless computing category where developers deploy individual functions that execute in response to events. FaaS platforms like AWS Lambda, Google Cloud Functions, and Azure Functions handle all infrastructure management, scaling each function independently.
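The "individual functions that execute in response to events" shape can be sketched with a Lambda-style handler. The event fields here are hypothetical; real event payloads depend on the triggering service (an HTTP gateway, a queue, a storage notification, and so on).

```python
# A minimal Lambda-style handler sketch. The platform invokes this function
# with an event payload and a runtime context object; it never runs a server.
def handler(event: dict, context: object) -> dict:
    # "name" is a made-up field for illustration.
    name = event.get("name", "world")
    return {"statusCode": 200, "body": f"hello, {name}"}
```

The platform scales by invoking this function as many times in parallel as there are pending events, and billing covers only the milliseconds each invocation runs.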
Platform as a Service
A cloud computing model that provides a complete development and deployment environment without managing underlying infrastructure. PaaS offerings like Heroku, Vercel, and Google App Engine handle servers, storage, networking, and runtime configuration.
Infrastructure as a Service
A cloud computing model that provides virtualized computing resources over the internet. IaaS offerings like AWS EC2, Google Compute Engine, and Azure Virtual Machines give teams full control over servers, storage, and networking without owning physical hardware.