How to Calculate Linear Regression

Linear regression finds the best-fitting straight line through a set of data points. It's one of the most important tools in statistics and data science, used to predict outcomes, identify trends, and understand relationships between variables.

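The goal is to find the line y = mx + b, where m is the slope and b is the y-intercept, that minimizes the sum of the squared vertical distances between the data points and the line. This is known as the least-squares criterion.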
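To make the criterion concrete, here is a minimal Python sketch that scores a few candidate lines by their sum of squared vertical distances. It uses the five data points from the worked example in this article. The function name sum_squared_errors and the candidate lines other than (0.6, 2.2) are hypothetical, chosen only for illustration.

```python
def sum_squared_errors(points, m, b):
    """Sum of squared vertical distances from each point to y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in points)

# The five data points from the worked example.
points = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]

# A few candidate (slope, intercept) pairs; only the first is the fitted line,
# the others are arbitrary alternatives for comparison.
candidates = [(0.6, 2.2), (0.5, 2.5), (1.0, 1.0), (0.0, 4.0)]

for m, b in candidates:
    sse = sum_squared_errors(points, m, b)
    print(f"m={m}, b={b}: sum of squared errors = {sse:.2f}")
```

Of these candidates, the line with m = 0.6 and b = 2.2 gives the smallest total, which is what "best-fitting" means here.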

The Formulas

Slope:

m = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²)

Y-intercept:

b = (Σy − mΣx) / n
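The two formulas translate almost directly into code. The sketch below is one possible Python version, where n is the number of data points and each Σ is a sum over all points. The function name fit_line, the input checks, and the sample data are assumptions made for this illustration.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    sum_x = sum(xs)                               # Σx
    sum_y = sum(ys)                               # Σy
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # Σxy
    sum_x2 = sum(x * x for x in xs)               # Σx²

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        # Happens when every x is the same: the line would be vertical.
        raise ValueError("all x values are identical; slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - m * sum_x) / n
    return m, b

# Hypothetical data lying exactly on y = 2x + 1.
m, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b)  # 2.0 1.0
```

The denominator check matters because the slope formula divides by nΣx² − (Σx)², which is zero when all x values are equal.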

Step-by-Step Example

Data: (1,2), (2,4), (3,5), (4,4), (5,5)

x      y      xy     x²
1      2      2      1
2      4      8      4
3      5      15     9
4      4      16     16
5      5      25     25
Σ=15   Σ=20   Σ=66   Σ=55

n = 5

m = (5×66 − 15×20) / (5×55 − 15²) = (330 − 300) / (275 − 225) = 30 / 50 = 0.6

b = (20 − 0.6×15) / 5 = (20 − 9) / 5 = 2.2

Regression line: y = 0.6x + 2.2
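The arithmetic above can be checked with a short Python sketch that rebuilds the table columns, sums them, and plugs the sums into the two formulas. Variable names such as rows and sum_x2 are illustrative choices, not part of any library.

```python
data = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]

# One table row per point: x, y, x*y, x squared.
rows = [(x, y, x * y, x * x) for x, y in data]
for row in rows:
    print(*row, sep="\t")

# Column sums, matching the Σ row of the table.
sum_x, sum_y, sum_xy, sum_x2 = (sum(col) for col in zip(*rows))
print(sum_x, sum_y, sum_xy, sum_x2, sep="\t")

n = len(data)
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = (sum_y - m * sum_x) / n
print(f"y = {m:.1f}x + {b:.1f}")
```

Swapping in a different list for data repeats the same steps for another dataset, provided the x values are not all identical.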

Interpreting the Results

  • Slope (m = 0.6): For each 1-unit increase in x, y increases by 0.6 on average
  • Intercept (b = 2.2): When x = 0, the predicted y is 2.2
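  • R² (coefficient of determination): The proportion of the variation in y that is explained by x. It ranges from 0 (the line explains none of the variation) to 1 (every point lies exactly on the line) and is often quoted as a percentage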
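The article does not give a formula for R². A common definition, assumed here, is R² = 1 − SS_res / SS_tot, where SS_res is the sum of squared residuals around the fitted line and SS_tot is the sum of squared deviations of y from its mean. The Python sketch below applies that definition to the worked example.

```python
data = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]
m, b = 0.6, 2.2  # fitted line from the worked example

mean_y = sum(y for _, y in data) / len(data)

# Variation left unexplained by the line.
ss_res = sum((y - (m * x + b)) ** 2 for x, y in data)

# Total variation of y around its mean.
ss_tot = sum((y - mean_y) ** 2 for _, y in data)

r_squared = 1 - ss_res / ss_tot
print(f"R^2 = {r_squared:.2f}")  # R^2 = 0.60
```

For this dataset the line accounts for about 60% of the variation in y. That the value matches the slope of 0.6 is a coincidence of these particular numbers.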

Use our linear regression calculator for any dataset.