
Statistical Power

The probability that an experiment will correctly detect a real effect when one exists, determined by sample size, effect size, and significance level. Higher power reduces the risk of missing genuine improvements.

Statistical power is the probability of rejecting a false null hypothesis, meaning the likelihood your experiment will detect a real difference if one exists. An experiment with 80% power has a 20% chance of missing a genuine effect and incorrectly concluding there is no difference. Power depends on three factors: sample size, minimum detectable effect size, and significance level.
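To make the definition concrete, the sketch below computes the power of a two-sided two-proportion z-test using Python's statsmodels library (one common choice; the library and all numbers here are illustrative assumptions, not figures from this glossary):

```python
# Minimal sketch: power of a two-proportion z-test, given a sample size.
# Baseline rate, lift, and sample size are hypothetical values.
from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.050     # assumed control conversion rate
treatment = 0.055    # assumed treatment rate: a 10% relative lift
n_per_arm = 10_000   # assumed users in each experiment arm
alpha = 0.05         # significance level (two-sided)

# Cohen's h: standardized effect size for a difference in proportions.
effect_size = proportion_effectsize(treatment, baseline)

# With power left unspecified, solve_power solves for it.
power = NormalIndPower().solve_power(
    effect_size=effect_size,
    nobs1=n_per_arm,
    alpha=alpha,
    ratio=1.0,  # equal-sized control and treatment arms
    alternative="two-sided",
)
print(f"Power: {power:.1%}")  # roughly 35% here, far below an 80% target
```

A result like this is exactly the warning the term encodes: with these assumed numbers, the experiment would miss a genuine 10% relative lift about two times in three.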

For growth teams, understanding statistical power prevents the common mistake of running underpowered experiments that waste traffic and produce inconclusive results. AI-powered experimentation platforms automate power analysis and sample size calculations, but growth engineers should still understand the underlying trade-off: holding the significance level fixed, increasing power requires either more users, which means longer experiments, or a larger minimum detectable effect, which means missing small but potentially valuable improvements.

Teams should conduct a pre-experiment power analysis to determine how long each test needs to run, and establish clear stopping criteria before launch. Running underpowered experiments is worse than not experimenting at all, because a non-significant result from an underpowered test is easily mistaken for evidence that a change does not work. The practical recommendation is to target at least 80% power for every experiment and to design each test around the minimum effect size that would justify shipping the change.
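As a sketch of the pre-experiment analysis recommended above, the same statsmodels calculation can be run in reverse: fix power at 80% and solve for the sample size, then convert it into a run time. All rates and traffic figures below are hypothetical:

```python
# Minimal sketch: pre-experiment sample size and duration estimate.
# Fix power at 80% and solve for users per arm; all inputs are hypothetical.
import math

from statsmodels.stats.power import NormalIndPower
from statsmodels.stats.proportion import proportion_effectsize

baseline = 0.050      # assumed control conversion rate
mde_relative = 0.10   # assumed minimum detectable effect: 10% relative lift
treatment = baseline * (1 + mde_relative)

effect_size = proportion_effectsize(treatment, baseline)

# With nobs1 left unspecified, solve_power solves for the sample size.
n_per_arm = math.ceil(NormalIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,
    power=0.80,       # the 80% target recommended above
    ratio=1.0,
    alternative="two-sided",
))
print(f"Need ~{n_per_arm:,} users per arm")  # roughly 31,000 for these inputs

# Convert sample size into a run time, given assumed daily traffic.
daily_users_per_arm = 2_000
days = math.ceil(n_per_arm / daily_users_per_arm)
print(f"Run for ~{days} days before evaluating significance")
```

The output of a calculation like this is the stopping criterion: the test runs until each arm reaches the computed number of users, rather than until a dashboard first shows significance.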
