Precision and Recall
Complementary classification metrics: precision measures the fraction of positive predictions that are correct, while recall measures the fraction of actual positives that are detected.
Precision and recall capture different types of errors in classification. Precision answers "Of all the items I flagged as positive, how many actually are?" while recall answers "Of all the truly positive items, how many did I find?" A spam filter with 99% precision rarely marks legitimate email as spam, but if recall is 50%, half the spam gets through.
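The two definitions reduce to simple ratios over the confusion-matrix counts. A minimal sketch (the spam-filter counts below are illustrative, chosen to match the 99% precision / 50% recall example above):

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); Recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical spam filter: 99 spam emails caught, 1 legitimate email
# wrongly flagged (false positive), 99 spam emails missed (false negatives).
p, r = precision_recall(tp=99, fp=1, fn=99)
print(p, r)  # 0.99 precision, 0.5 recall
```

Note that true negatives (legitimate email correctly passed through) appear in neither formula, which is why both metrics are preferred over plain accuracy on imbalanced data.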
The tension between precision and recall is fundamental. Increasing one typically decreases the other. Lowering the classification threshold catches more true positives (higher recall) but also more false positives (lower precision). The right balance depends entirely on the business context: fraud detection prioritizes recall (missing a fraud is costly), while content recommendation prioritizes precision (showing irrelevant content hurts engagement).
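The threshold effect can be seen directly by sweeping a cutoff over model scores. The scores and labels below are toy data invented for illustration:

```python
# Toy model scores and ground-truth labels (1 = positive), sorted by score.
scores = [0.95, 0.9, 0.8, 0.6, 0.55, 0.5, 0.3, 0.28, 0.2, 0.1]
labels = [1,    1,   1,   1,   0,    1,   1,   0,    0,   0]

def precision_recall_at(threshold):
    """Classify as positive when score >= threshold, then compute both metrics."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn)
    return precision, recall

for t in (0.75, 0.5, 0.25):
    p, r = precision_recall_at(t)
    print(f"threshold={t}: precision={p:.2f} recall={r:.2f}")
```

On this data, lowering the threshold from 0.75 to 0.25 raises recall from 0.50 to 1.00 while precision falls from 1.00 to 0.75: the trade-off described above.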
For growth teams using ML models, choosing between precision and recall has direct business impact. A churn prediction model with high recall catches nearly every at-risk customer but may waste outreach resources on false alarms. A lead scoring model with high precision ensures sales teams only contact likely converters but may miss some viable leads. The optimal trade-off is determined by the relative costs of false positives versus false negatives in your specific application.
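One way to operationalize "relative costs of false positives versus false negatives" is to pick the threshold that minimizes expected cost. A sketch under assumed, illustrative costs (the dollar figures, scores, and labels are hypothetical):

```python
# Assumed per-error costs for a churn-prediction scenario (illustrative only).
COST_FALSE_NEGATIVE = 50.0  # e.g. revenue lost when an at-risk customer is missed
COST_FALSE_POSITIVE = 5.0   # e.g. cost of one unnecessary outreach

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1,   1,   0,   1,   0,   1,   0,   0]  # 1 = customer actually churned

def expected_cost(threshold):
    """Total cost of the errors made when flagging scores >= threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp * COST_FALSE_POSITIVE + fn * COST_FALSE_NEGATIVE

best = min((0.15, 0.35, 0.55, 0.75), key=expected_cost)
print(best, expected_cost(best))
```

Because a missed churner costs 10x a wasted outreach here, the search selects the lowest threshold (high recall, lower precision); flipping the cost ratio would push the optimum toward a high-precision threshold instead.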
Related Terms
RAG (Retrieval-Augmented Generation)
A technique that grounds LLM responses in external data by retrieving relevant documents at query time and injecting them into the prompt context.
Embeddings
Dense vector representations of text, images, or other data that capture semantic meaning in a high-dimensional space, enabling similarity search and clustering.
Vector Database
A specialized database optimized for storing, indexing, and querying high-dimensional vector embeddings with sub-millisecond similarity search.
LLM (Large Language Model)
A neural network trained on massive text corpora that can generate, understand, and transform natural language for tasks like summarization, classification, and conversation.
Fine-Tuning
The process of further training a pre-trained LLM on a domain-specific dataset to specialize its behavior, style, or knowledge for a particular task.
Prompt Engineering
The practice of designing and iterating on LLM input instructions to reliably produce desired outputs for a specific task.