Gradient Boosting
An ensemble technique that builds models sequentially, with each new model specifically trained to correct the errors of the previous ones, producing a powerful combined predictor through iterative refinement.
Gradient boosting constructs an additive model in stages. The first model makes predictions, and each subsequent model is trained on the current ensemble's errors; more precisely, it fits the negative gradient of the loss function, which for squared-error loss is simply the residuals (hence the name). Each new model focuses on the hardest examples, gradually chipping away at the remaining error. The final prediction is the sum of all models' contributions, each scaled by a learning rate that controls its impact.
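The staged procedure above can be sketched in a few lines using shallow regression trees as the base learners. This is a minimal illustration for squared-error loss, not a production implementation; the data and hyperparameters (`n_rounds`, `learning_rate`, `max_depth`) are arbitrary choices for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy regression data (synthetic, for illustration only)
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, size=200)

learning_rate = 0.1
n_rounds = 100

# Stage 0: a constant prediction (the mean of y)
f0 = y.mean()
pred = np.full_like(y, f0)
trees = []

for _ in range(n_rounds):
    residuals = y - pred  # negative gradient of squared-error loss
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    trees.append(tree)
    # Shrunken additive update: each tree's contribution is scaled down
    pred += learning_rate * tree.predict(X)

def predict(X_new):
    """Sum the base prediction and every tree's shrunken contribution."""
    out = np.full(len(X_new), f0)
    for tree in trees:
        out += learning_rate * tree.predict(X_new)
    return out
```

Each round fits a small tree to what the ensemble currently gets wrong, so training error shrinks monotonically; the learning rate trades per-round progress against overfitting.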
The dominant implementations are XGBoost, LightGBM, and CatBoost. XGBoost popularized regularized boosting and remains widely used. LightGBM introduced histogram-based splitting for faster training on large datasets. CatBoost handles categorical features natively and reduces overfitting through ordered boosting. All three consistently top leaderboards for tabular data problems.
For production ML on structured data, gradient boosting is the go-to algorithm. It achieves state-of-the-art performance on most tabular datasets, trains efficiently on CPUs (no GPU required), produces interpretable models through feature importance and SHAP values, and integrates easily with production pipelines. For growth applications like propensity modeling, customer scoring, and demand forecasting, gradient boosting typically outperforms both simpler methods and deep learning on structured data.
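For a propensity-modeling use case like the ones above, a typical workflow is to train a boosted classifier, score customers with predicted probabilities, and inspect feature importances. The sketch below uses scikit-learn's `GradientBoostingClassifier` on synthetic data; the feature names in the comments are hypothetical, and in practice you would use XGBoost, LightGBM, or CatBoost with real customer features.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for customer data; features are invented for the example.
rng = np.random.default_rng(42)
n = 1000
X = np.column_stack([
    rng.normal(size=n),  # e.g. sessions in last 30 days (hypothetical)
    rng.normal(size=n),  # e.g. days since signup (hypothetical)
    rng.normal(size=n),  # pure noise feature
])
# Target (e.g. "converted") driven mostly by the first feature.
y = (X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 0.5, size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(
    n_estimators=200, learning_rate=0.05, max_depth=3
).fit(X_tr, y_tr)

scores = model.predict_proba(X_te)[:, 1]  # propensity scores in [0, 1]
importances = model.feature_importances_  # split-based feature importance
```

The probability output doubles as a ranking score for targeting, and `feature_importances_` (or SHAP values, via the `shap` library) gives the interpretability the paragraph above refers to.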
Related Terms
RAG (Retrieval-Augmented Generation)
A technique that grounds LLM responses in external data by retrieving relevant documents at query time and injecting them into the prompt context.
Embeddings
Dense vector representations of text, images, or other data that capture semantic meaning in a high-dimensional space, enabling similarity search and clustering.
Vector Database
A specialized database optimized for storing, indexing, and querying high-dimensional vector embeddings with sub-millisecond similarity search.
LLM (Large Language Model)
A neural network trained on massive text corpora that can generate, understand, and transform natural language for tasks like summarization, classification, and conversation.
Fine-Tuning
The process of further training a pre-trained LLM on a domain-specific dataset to specialize its behavior, style, or knowledge for a particular task.
Prompt Engineering
The practice of designing and iterating on LLM input instructions to reliably produce desired outputs for a specific task.