Variance measures how spread out a set of numbers is from their mean. It's one of the most important concepts in statistics — used in finance to measure investment risk, in science to assess experimental consistency, and in everyday data analysis.
What Is Variance?
Variance is the average of the squared differences from the mean. A low variance means the data points cluster tightly around the average. A high variance means they're widely spread.
There are two types:
- Population variance (σ²) — used when you have data for the entire population
- Sample variance (s²) — used when your data is a sample from a larger population
In practice, you'll almost always use sample variance, because most datasets cover only part of the group you actually care about.
The Variance Formula
Population Variance
σ² = Σ(xᵢ − μ)² / N
Where:
- xᵢ = each data point
- μ = the population mean
- N = number of data points
Sample Variance
s² = Σ(xᵢ − x̄)² / (n − 1)
Where:
- xᵢ = each data point in the sample
- x̄ = the sample mean
- n = number of data points in the sample
- n − 1 = degrees of freedom (Bessel's correction)
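Dividing by n − 1 instead of n corrects a bias. The deviations are measured from the sample mean x̄, which by construction sits closer to the sample's own values than the true population mean μ usually does. The squared deviations therefore come out slightly too small, and dividing by n would, on average, underestimate the population variance.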
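The two formulas above translate almost directly into code. The following is a minimal sketch in plain Python; the function names `population_variance` and `sample_variance` and the four-value list are illustrative choices, not part of any library.

```python
def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Arbitrary illustrative values: mean 5, squared deviations 9 + 1 + 1 + 9 = 20
values = [2, 4, 6, 8]
print(round(population_variance(values), 2))  # 5.0
print(round(sample_variance(values), 2))      # 6.67
```

The only difference between the two functions is the divisor, which is why the sample variance always comes out a little larger for the same data.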
Step-by-Step Example
Dataset: 4, 8, 6, 5, 3, 2, 8, 9, 2, 5
Step 1: Calculate the mean
Mean = (4 + 8 + 6 + 5 + 3 + 2 + 8 + 9 + 2 + 5) / 10
= 52 / 10
= 5.2
Step 2: Subtract the mean from each value and square the result
| Value | Value − Mean | (Value − Mean)² |
|---|---|---|
| 4 | 4 − 5.2 = −1.2 | 1.44 |
| 8 | 8 − 5.2 = 2.8 | 7.84 |
| 6 | 6 − 5.2 = 0.8 | 0.64 |
| 5 | 5 − 5.2 = −0.2 | 0.04 |
| 3 | 3 − 5.2 = −2.2 | 4.84 |
| 2 | 2 − 5.2 = −3.2 | 10.24 |
| 8 | 8 − 5.2 = 2.8 | 7.84 |
| 9 | 9 − 5.2 = 3.8 | 14.44 |
| 2 | 2 − 5.2 = −3.2 | 10.24 |
| 5 | 5 − 5.2 = −0.2 | 0.04 |
Step 3: Sum the squared differences
Σ(xᵢ − x̄)² = 1.44 + 7.84 + 0.64 + 0.04 + 4.84 + 10.24 + 7.84 + 14.44 + 10.24 + 0.04
= 57.6
Step 4: Divide by n − 1 (sample variance)
s² = 57.6 / (10 − 1) = 57.6 / 9 = 6.4
The sample variance is 6.4.
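If you'd like to check the arithmetic, the four steps can be reproduced in a few lines of Python. This is a sketch that mirrors the hand calculation, with variable names chosen for illustration, followed by a cross-check against `statistics.variance` from the standard library.

```python
import statistics

data = [4, 8, 6, 5, 3, 2, 8, 9, 2, 5]

# Step 1: calculate the mean
mean = sum(data) / len(data)

# Step 2: subtract the mean from each value and square the result
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 3: sum the squared differences
sum_sq = sum(squared_diffs)

# Step 4: divide by n - 1 (sample variance)
s2 = sum_sq / (len(data) - 1)

print(mean)              # 5.2
print(round(sum_sq, 2))  # 57.6
print(round(s2, 2))      # 6.4

# Cross-check against the standard library's sample variance
print(round(statistics.variance(data), 2))  # 6.4
```

The rounding is there only because floating-point arithmetic can leave tiny errors in the last decimal places.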
Variance vs Standard Deviation
Standard deviation is simply the square root of variance:
s = √s² = √6.4 ≈ 2.53
Standard deviation is expressed in the same units as the original data, making it easier to interpret. If your data is in kilograms, standard deviation is in kilograms. Variance is in kilograms². This is why standard deviation is more commonly reported — but variance is used in many statistical calculations.
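As a quick illustration, here is the same relationship in Python, using the dataset from the worked example. Taking the square root by hand and calling `statistics.stdev` should agree.

```python
import math
import statistics

data = [4, 8, 6, 5, 3, 2, 8, 9, 2, 5]

s2 = statistics.variance(data)  # sample variance, in squared units
s = math.sqrt(s2)               # standard deviation, in the data's own units

print(round(s, 2))                       # 2.53
print(round(statistics.stdev(data), 2))  # 2.53
```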
Population vs Sample: When to Use Each
| Situation | Use |
|---|---|
| You have data for every member of the group | Population variance (÷ N) |
| Your data is a sample from a larger group | Sample variance (÷ n − 1) |
| Using the variance as input to other statistical tests | Usually sample variance |
| Your dataset is the complete picture | Population variance |
When in doubt, use sample variance. Most real-world datasets are samples.
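In Python's standard library the choice shows up as two separate functions: `statistics.pvariance` divides by N and `statistics.variance` divides by n − 1. A short sketch on the dataset from the worked example shows how much the choice changes the result for ten values.

```python
import statistics

data = [4, 8, 6, 5, 3, 2, 8, 9, 2, 5]

# Treat the ten values as the entire population: divide by N
pop_var = statistics.pvariance(data)

# Treat the ten values as a sample from a larger group: divide by n - 1
samp_var = statistics.variance(data)

print(round(pop_var, 2))   # 5.76
print(round(samp_var, 2))  # 6.4
```

The gap between the two shrinks as the dataset grows, because N and n − 1 become nearly the same number.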
Why We Square the Differences
You might wonder: why not just average the raw differences from the mean?
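The problem is that positive and negative deviations cancel out. Values above the mean produce positive deviations and values below it produce negative ones, and because the mean is the balance point of the data, the two groups offset each other exactly. Add up the raw deviations without squaring and you get zero for any dataset, no matter how spread out it is.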
Squaring removes the negative signs, so all deviations contribute positively to the total spread.
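You can see the cancellation for yourself with a short sketch on the dataset from the worked example. The comparison against a small tolerance, rather than exact zero, is there because floating-point subtraction can leave a tiny residue.

```python
data = [4, 8, 6, 5, 3, 2, 8, 9, 2, 5]
mean = sum(data) / len(data)

deviations = [x - mean for x in data]

# Positive and negative deviations cancel, up to floating-point rounding
raw_total = sum(deviations)
print(abs(raw_total) < 1e-9)  # True

# Squared deviations are never negative, so they accumulate instead
squared_total = sum(d ** 2 for d in deviations)
print(round(squared_total, 2))  # 57.6
```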
Practical Applications
Finance: Portfolio variance measures investment risk. A portfolio whose returns have a variance of 0.04 (standard deviation 0.2) is less volatile than one with a variance of 0.16 (standard deviation 0.4), even if both have the same expected return.
Quality control: A manufacturing process with low variance produces more consistent output. High variance means unpredictable results.
Science: In experiments, high variance between repeated measurements suggests measurement error or uncontrolled variables.
Sports analytics: Player performance variance tells you whether a player is consistent (low variance) or streaky (high variance).
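To make the consistent-versus-streaky idea concrete, here is a small sketch comparing two hypothetical players. The scores are invented purely for illustration and were picked so that both players have the same average.

```python
import statistics

# Invented points-per-game figures for two hypothetical players
player_a = [20, 21, 19, 20, 20]  # scores stay close to the average
player_b = [5, 35, 10, 30, 20]   # scores swing widely from game to game

# Both players average the same number of points
print(statistics.mean(player_a) == statistics.mean(player_b))  # True

var_a = statistics.variance(player_a)
var_b = statistics.variance(player_b)

print(var_a)  # 0.5
print(var_b)  # 162.5
```

The averages alone cannot tell the two players apart. The variances can.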
Common Mistakes
Using N instead of n − 1 for samples: Dividing by N underestimates the population variance on average. Use n − 1 whenever you are estimating a population's variance from a sample.
Forgetting to square: A common error is averaging the raw differences rather than the squared differences. The raw differences always sum to zero, so the result says nothing about spread.
Confusing variance with range: Range is simply the maximum minus the minimum. Variance accounts for all data points, not just the extremes.
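The first mistake is easier to appreciate with a small experiment. The sketch below takes a tiny made-up population, lists every possible sample of size 2 drawn with replacement, and averages each estimator over all of those samples. The population values and variable names are assumptions chosen so the arithmetic comes out exact.

```python
import itertools
import statistics

# A tiny made-up population
population = [2.0, 4.0, 6.0, 8.0]
true_variance = statistics.pvariance(population)  # the real spread, divided by N

# Every possible ordered sample of size 2, drawn with replacement (16 in total)
samples = list(itertools.product(population, repeat=2))

# Average each estimator over all possible samples
avg_divide_by_n = sum(statistics.pvariance(s) for s in samples) / len(samples)
avg_divide_by_n_minus_1 = sum(statistics.variance(s) for s in samples) / len(samples)

print(true_variance)            # 5.0
print(avg_divide_by_n)          # 2.5
print(avg_divide_by_n_minus_1)  # 5.0
```

Averaged over every possible sample, dividing by n lands on half the true variance in this setup, while dividing by n − 1 recovers it. Individual samples can still be off in either direction. The correction only removes the systematic bias.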
Quick Reference
| Formula | When to use |
|---|---|
| σ² = Σ(xᵢ − μ)² / N | Full population |
| s² = Σ(xᵢ − x̄)² / (n − 1) | Sample from population |
| s = √s² | To get standard deviation |