Foundation Model
A large-scale AI model trained on broad data that can be adapted to a wide range of downstream tasks through fine-tuning or prompting, serving as a general-purpose base for specialized applications.
Foundation models represent a paradigm shift in AI development. Instead of training a separate model for each task, you train one massive model on diverse data and then adapt it. GPT-4, Claude, Gemini, and Llama are foundation models for language; CLIP and DALL-E are foundation models for vision-language tasks such as image understanding and image generation. These models encode general knowledge that transfers to specific applications.
The term was coined by Stanford's Center for Research on Foundation Models to emphasize both the power and the risks of this approach. The power is efficiency: a single foundation model can be adapted to thousands of tasks with minimal additional training. The risk is concentration: biases, errors, or vulnerabilities in the foundation model propagate to every downstream application built on it.
For product teams, foundation models are the starting point for most AI features today. The strategic decisions involve which foundation model to build on (balancing capability, cost, and vendor lock-in), how much to customize it (prompting versus fine-tuning), and how to mitigate the risks of depending on a model you do not control. Multi-model architectures that use different foundation models for different tasks reduce single-provider dependency.
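The multi-model idea above can be sketched as a simple routing layer. This is a minimal illustration, not a production pattern: the task types and model identifiers are hypothetical placeholders, not real provider endpoints.

```python
# Sketch of a multi-model architecture: route each task type to a different
# foundation model to reduce single-provider dependency.
# All model names below are illustrative placeholders.

TASK_MODEL_ROUTES = {
    "summarization": "provider_a/general-model",
    "code_generation": "provider_b/code-model",
    "classification": "provider_c/small-cheap-model",
}

def route(task_type: str, default: str = "provider_a/general-model") -> str:
    """Pick a model for a task, falling back to a default provider."""
    return TASK_MODEL_ROUTES.get(task_type, default)
```

In practice the routing table would also encode cost and latency constraints, and the fallback gives a degradation path if a specialized provider is unavailable.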
Related Terms
RAG (Retrieval-Augmented Generation)
A technique that grounds LLM responses in external data by retrieving relevant documents at query time and injecting them into the prompt context.
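The retrieve-then-inject flow can be shown with a toy sketch. For self-containment it scores relevance by keyword overlap; a real RAG system would use embedding similarity for retrieval and pass the assembled prompt to an LLM, both omitted here.

```python
# Toy RAG sketch: retrieve the most relevant documents for a query and
# inject them into the prompt context. Keyword overlap stands in for
# embedding-based retrieval, which a real system would use.

def _overlap(query: str, doc: str) -> int:
    """Crude relevance score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents with the highest overlap score."""
    return sorted(docs, key=lambda d: _overlap(query, d), reverse=True)[:k]

def build_rag_prompt(query: str, docs: list[str]) -> str:
    """Assemble a prompt that grounds the answer in retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```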
Embeddings
Dense vector representations of text, images, or other data that capture semantic meaning in a high-dimensional space, enabling similarity search and clustering.
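The "similarity search" part of this definition usually means cosine similarity between embedding vectors; a minimal sketch (using toy hand-written vectors rather than real model embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction,
    0.0 = orthogonal (unrelated), -1.0 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Real embeddings have hundreds or thousands of dimensions, but the comparison works the same way: nearby vectors indicate semantically similar inputs.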
Vector Database
A specialized database optimized for storing, indexing, and querying high-dimensional vector embeddings, typically using approximate nearest-neighbor indexes for low-latency similarity search.
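Conceptually, a vector database maps keys to vectors and answers top-k similarity queries. A minimal in-memory sketch using brute-force exact search (real systems use approximate indexes such as HNSW to scale):

```python
import math

class VectorStore:
    """Toy in-memory vector store: exact brute-force top-k search.
    Illustrative only; not how production vector databases are built."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add(self, key: str, vector: list[float]) -> None:
        self._items.append((key, vector))

    def query(self, vector: list[float], k: int = 1) -> list[str]:
        """Return keys of the k stored vectors most similar to `vector`."""
        def sim(v: list[float]) -> float:
            dot = sum(x * y for x, y in zip(vector, v))
            norm = (math.sqrt(sum(x * x for x in vector))
                    * math.sqrt(sum(y * y for y in v)))
            return dot / norm if norm else 0.0
        ranked = sorted(self._items, key=lambda kv: sim(kv[1]), reverse=True)
        return [key for key, _ in ranked[:k]]
```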
LLM (Large Language Model)
A neural network trained on massive text corpora that can generate, understand, and transform natural language for tasks like summarization, classification, and conversation.
Fine-Tuning
The process of further training a pre-trained LLM on a domain-specific dataset to specialize its behavior, style, or knowledge for a particular task.
Prompt Engineering
The practice of designing and iterating on LLM input instructions to reliably produce desired outputs for a specific task.
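One common prompt-engineering pattern is a reusable template with few-shot examples. A minimal sketch, where the classification task, labels, and examples are all hypothetical:

```python
# Sketch of prompt engineering as code: a fixed instruction plus few-shot
# examples, assembled into a prompt for each new input.
# The task and examples below are illustrative, not from a real system.

FEW_SHOT_EXAMPLES = [
    ("The app crashes when I tap login.", "bug"),
    ("Please add a dark mode option.", "feature_request"),
]

def build_ticket_prompt(ticket: str) -> str:
    """Build a few-shot classification prompt for a support ticket."""
    examples = "\n\n".join(
        f"Ticket: {text}\nLabel: {label}" for text, label in FEW_SHOT_EXAMPLES
    )
    return (
        "Classify each support ticket as 'bug' or 'feature_request'.\n\n"
        f"{examples}\n\nTicket: {ticket}\nLabel:"
    )
```

Keeping the template in code makes it easy to iterate: examples and instructions can be versioned, A/B tested, and adjusted without touching the rest of the application.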