Bayesian Inference
A statistical framework that updates probability estimates as new evidence becomes available, combining prior beliefs with observed data to produce posterior probability distributions over hypotheses.
Bayesian inference starts with a prior distribution representing your belief before seeing data, then updates that belief through Bayes' theorem: the posterior is proportional to the likelihood of the observed data times the prior. Unlike frequentist methods that provide point estimates and p-values, Bayesian methods provide full probability distributions that answer questions like "what is the probability that variant B is better than variant A?"
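To make the update concrete, here is a minimal sketch for a conversion-rate setting, where a Beta prior and binomial data give a closed-form posterior (the Beta-Binomial conjugate pair). The prior parameters and the 12-in-100 counts are illustrative assumptions, not data from the text:

```python
# Conjugate Beta-Binomial update: with a Beta(alpha, beta) prior on a
# conversion rate and binomial data, the posterior is also Beta --
# Bayes' theorem reduces to adding the observed counts to the prior.

def update_beta(alpha_prior: float, beta_prior: float,
                successes: int, failures: int) -> tuple[float, float]:
    """Return posterior Beta parameters after observing the data."""
    return alpha_prior + successes, beta_prior + failures

# Uniform Beta(1, 1) prior, then observe 12 conversions in 100 visits.
alpha, beta = update_beta(1, 1, successes=12, failures=88)
posterior_mean = alpha / (alpha + beta)
print(alpha, beta, round(posterior_mean, 3))  # 13 89 0.127
```

As more data arrive, the counts simply keep accumulating in the posterior parameters, which is why the distribution can be updated continuously.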
The Bayesian approach is particularly natural for A/B testing. Instead of waiting for a fixed sample size and then declaring significance, Bayesian methods continuously update the probability that each variant is best. You can check results at any time without inflating error rates, and the output is intuitive: "there is a 94% probability that variant B improves conversion by 2-5%."
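A statement like "probability that B beats A" can be estimated by sampling from each variant's posterior, as in this stdlib-only sketch. The visitor counts and the uniform Beta(1, 1) prior are illustrative assumptions:

```python
import random

# Monte Carlo estimate of P(variant B beats variant A) for a Bayesian
# A/B test on conversion rates, using Beta posteriors from a Beta(1, 1)
# prior. The counts below are made up for illustration.
random.seed(0)

a_conv, a_total = 120, 1000   # variant A: conversions, visitors
b_conv, b_total = 145, 1000   # variant B: conversions, visitors

def prob_b_beats_a(samples: int = 100_000) -> float:
    wins = 0
    for _ in range(samples):
        # Draw one plausible conversion rate per variant from its posterior.
        p_a = random.betavariate(1 + a_conv, 1 + a_total - a_conv)
        p_b = random.betavariate(1 + b_conv, 1 + b_total - b_conv)
        wins += p_b > p_a
    return wins / samples

print(f"P(B > A) ≈ {prob_b_beats_a():.2f}")
```

Because the posterior is a full distribution, the same samples can also answer "how much better?" by looking at the distribution of p_b - p_a.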
For growth teams, Bayesian methods offer practical advantages. Priors regularize estimates when samples are small, the probabilistic outputs map directly onto business decisions, and the framework supports more flexible experimental designs. Platforms like Optimizely and VWO now offer Bayesian analysis alongside traditional frequentist statistics.
Related Terms
Cosine Similarity
A measure of similarity between two vectors based on the cosine of the angle between them, ranging from -1 (opposite) to 1 (identical), commonly used to compare embeddings.
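The definition above maps directly to a short formula: the dot product of the two vectors divided by the product of their magnitudes. A minimal sketch, with made-up example vectors:

```python
import math

# Cosine similarity: dot(a, b) / (|a| * |b|).
# 1 for parallel vectors, -1 for opposite vectors, 0 for orthogonal ones.
def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # parallel: close to 1
print(cosine_similarity([1, 0], [-1, 0]))       # opposite: -1.0
```

Because the angle ignores vector length, cosine similarity compares the direction of embeddings rather than their magnitude.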
Dimensionality Reduction
Techniques that reduce the number of dimensions in high-dimensional data while preserving meaningful structure, used for visualization, compression, and noise removal.
Batch Inference
Processing multiple ML predictions as a group at scheduled intervals rather than one at a time on demand, optimizing for throughput and cost over latency.
Real-Time Inference
Generating ML predictions on-demand as requests arrive, typically with latency requirements under 200ms for user-facing features.
Data Pipeline
An automated sequence of data processing steps that moves data from source systems through transformations to destination systems, enabling reliable and repeatable data flows across an organization.
ETL (Extract, Transform, Load)
A data integration pattern that extracts data from source systems, transforms it into a structured format suitable for analysis, and loads it into a target data warehouse or database.