Infrastructure as a Service

A cloud computing model that provides virtualized computing resources over the internet. IaaS offerings like AWS EC2, Google Compute Engine, and Azure Virtual Machines give teams full control over servers, storage, and networking without owning physical hardware.

IaaS provides the most control and flexibility of the cloud service models. Teams provision virtual machines, configure networking, attach storage volumes, and manage the full software stack from the operating system up. This control comes with corresponding responsibility for security patching, scaling, monitoring, and capacity planning.

For AI product teams, IaaS is often necessary for GPU-intensive workloads like model training, fine-tuning, and high-throughput inference. The ability to select specific GPU types, configure memory allocation, and optimize the software stack for model performance makes IaaS the right choice for computationally demanding AI operations.

Growth teams generally interact with IaaS indirectly through the AI services that run on it, but understanding the cost structure is important: GPU instances are expensive, and inefficient utilization directly impacts the unit economics of AI features. Teams should implement auto-scaling policies, use spot instances for fault-tolerant workloads, and monitor GPU utilization to keep AI infrastructure costs aligned with the business value generated.
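As a rough illustration of those unit economics, the sketch below converts GPU utilization into an effective cost per 1,000 inferences and compares on-demand against spot pricing. Every number in it (the hourly price, the spot discount, the throughput figure) is an illustrative assumption, not a quote from any provider.

```python
# Illustrative cost model for GPU-backed inference. All prices and
# throughput figures below are made-up assumptions for the example.

def cost_per_1k_inferences(hourly_price: float,
                           inferences_per_hour: float,
                           utilization: float) -> float:
    """Effective cost per 1,000 inferences at a given GPU utilization.

    Idle capacity still bills by the hour, so halving utilization
    doubles the effective cost of each inference actually served.
    """
    effective_throughput = inferences_per_hour * utilization
    return hourly_price / effective_throughput * 1000


on_demand = 4.00           # assumed $/hour for a GPU instance
spot = on_demand * 0.30    # assumed ~70% spot discount
throughput = 10_000        # assumed inferences/hour at full utilization

print(cost_per_1k_inferences(on_demand, throughput, 0.9))  # well utilized
print(cost_per_1k_inferences(on_demand, throughput, 0.3))  # mostly idle
print(cost_per_1k_inferences(spot, throughput, 0.9))       # spot, fault-tolerant work
```

The same arithmetic works for any instance type; the point is that utilization appears in the denominator, which is why monitoring it matters as much as negotiating the hourly rate.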

Related Terms

Content Delivery Network

A geographically distributed network of proxy servers that caches and delivers content from locations closest to end users. CDNs reduce latency, improve load times, and absorb traffic spikes by serving content from edge nodes rather than a single origin server.

Edge Computing

A distributed computing paradigm that processes data closer to the source of generation rather than in a centralized data center. Edge computing reduces latency, conserves bandwidth, and enables real-time processing for latency-sensitive applications.

Serverless Computing

A cloud execution model where the provider dynamically manages server allocation and scaling. Developers deploy functions or containers without provisioning infrastructure, paying only for actual compute time consumed rather than reserved capacity.
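The "pay only for compute consumed" model can be made concrete with a back-of-envelope calculation. The sketch below bills per request plus per GB-second of execution, mirroring the shape (though not the actual rates) of common serverless pricing; both prices and the workload figures are illustrative assumptions.

```python
# Back-of-envelope serverless cost: a per-request charge plus a charge
# per GB-second of execution time. The rates here are made-up
# assumptions chosen only to show the structure of the pricing model.

def monthly_serverless_cost(requests: int,
                            avg_duration_s: float,
                            memory_gb: float,
                            price_per_request: float = 0.2e-6,
                            price_per_gb_s: float = 16.7e-6) -> float:
    """Total monthly bill: request charges plus compute-time charges."""
    gb_seconds = requests * avg_duration_s * memory_gb
    return requests * price_per_request + gb_seconds * price_per_gb_s


# Example: 5M requests/month, 200 ms each, at 512 MB of memory.
print(round(monthly_serverless_cost(5_000_000, 0.2, 0.5), 2))
```

Note that an idle month costs nothing, which is the key contrast with reserved-capacity models like IaaS, where the meter runs whether or not traffic arrives.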

Function as a Service

A serverless computing category where developers deploy individual functions that execute in response to events. FaaS platforms like AWS Lambda, Google Cloud Functions, and Azure Functions handle all infrastructure management, scaling each function independently.
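A minimal sketch of the FaaS programming model, using the (event, context) handler signature that AWS Lambda's Python runtime expects. The event shape and field names here are made-up examples; real events depend on the triggering service.

```python
import json

def handler(event, context):
    """Entry point the platform invokes once per event.

    No server, process, or scaling logic lives here: the provider
    runs this function in response to each event and scales copies
    of it independently of any other function.
    """
    # "name" is a hypothetical field for this example.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }


# Locally you can invoke it the way the platform would (context unused here):
print(handler({"name": "faas"}, None))
```

Everything outside the function body (provisioning, patching, concurrency) is the platform's problem, which is what distinguishes FaaS from managing the same code on an IaaS virtual machine.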

Platform as a Service

A cloud computing model that provides a complete development and deployment environment without managing underlying infrastructure. PaaS offerings like Heroku, Vercel, and Google App Engine handle servers, storage, networking, and runtime configuration.

Container Orchestration

The automated management of containerized applications across a cluster of machines, handling deployment, scaling, networking, and health monitoring. Kubernetes is the dominant orchestration platform, providing declarative configuration for complex distributed systems.
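"Declarative configuration" means describing the desired end state and letting the orchestrator converge on it. A minimal Kubernetes Deployment manifest illustrates the idea; the names, labels, and image tag below are placeholder examples.

```yaml
# Declares a desired state: three replicas of one container image.
# Kubernetes continuously reconciles the cluster toward this state,
# restarting or rescheduling pods as needed. Names and image are
# illustrative placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
          ports:
            - containerPort: 80
```

If a node fails and a pod dies, nothing in this file changes; the orchestrator notices the gap between declared and actual state and schedules a replacement.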