You use a linear model when your points lie approximately on a straight line. Each set of points has its own best-fit line. Because there are infinitely many ways to arrange points that can be modeled by a linear expression, there are infinitely many graphs that fit the expression below. These graphs differ only in the values of the slope $a$ and the constant term $b$.
Theory
A linear function (straight line) is written like this:
$$f(x)=ax+b$$
In this expression, $a$ is the slope and $b$ is where the graph intersects the $y$-axis.
When a quantity increases or decreases by the same amount for each equal step in $x$, you have linear growth.
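As a small illustration of linear growth, the Python sketch below evaluates a linear function at evenly spaced $x$-values. The values $a=2$ and $b=3$ are made up for the example and do not come from the text.

```python
def f(x, a=2.0, b=3.0):
    """Linear function f(x) = a*x + b (a and b are example values)."""
    return a * x + b

xs = [0, 1, 2, 3, 4]
ys = [f(x) for x in xs]

# Change in f between neighbouring x-values that are one unit apart.
differences = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]

print("f(0) =", f(0))            # the constant term b
print("values:", ys)
print("differences:", differences)  # every entry equals the slope a
```

Each step of one unit in $x$ changes the function value by the same amount $a$, and the value at $x=0$ is $b$, the point where the graph crosses the $y$-axis.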
Linear regression means finding the straight line $f(x)=ax+b$ that best fits a set of points. You will use digital tools for this. A plot for a linear regression will look like this:
For linear regression, you use the correlation coefficient $r$ as a measure of how well the function fits the points. The value of $r$ ranges from $-1$ to $1$, where
$r=1$:
The line fits the points perfectly, and the function increases as $x$ increases.
$r=0$:
No linear correlation between the variables.
$r=-1$:
The line fits the points perfectly, and the function decreases as $x$ increases.
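The three cases can be checked numerically. The sketch below assumes that $r$ is the Pearson correlation coefficient, which is the usual choice for linear regression. The helper name `pearson_r` and the three small data sets are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

xs = [1, 2, 3, 4, 5]
rising = [3, 5, 7, 9, 11]   # points on y = 2x + 1
falling = [9, 7, 5, 3, 1]   # points on y = -2x + 11
curved = [4, 1, 0, 1, 4]    # points on y = (x - 3)^2, no linear trend

r_rising = pearson_r(xs, rising)
r_falling = pearson_r(xs, falling)
r_curved = pearson_r(xs, curved)

print(r_rising, r_falling, r_curved)
```

The third data set follows a clear pattern, yet its $r$ is zero, because $r$ only measures how well a straight line describes the points.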
Squaring $r$ gives ${r}^{2}$, which ranges from $0$ to $1$. This means that if ${r}^{2}=1$, the regression line matches the points perfectly, and if ${r}^{2}=0$, there is no linear correlation between the variables. The larger ${r}^{2}$ is, the less the points deviate from the line. You therefore want ${r}^{2}$ to be as large as possible, but its value is determined by the data, not by you.
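To make the link between ${r}^{2}$ and the deviations from the line concrete, here is a minimal sketch. It assumes that "best fit" means ordinary least squares, which is what most digital tools use. The function names `fit_line` and `r_squared` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope a and constant term b for f(x) = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    a = s_xy / s_xx
    b = mean_y - a * mean_x
    return a, b

def r_squared(xs, ys):
    """r^2 computed as 1 - (squared deviations from the line) / (total variation)."""
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [1, 2, 3, 4, 5]
on_line = [3, 5, 7, 9, 11]     # lies exactly on y = 2x + 1
scattered = [2, 9, 4, 10, 5]   # made-up points far from any line

a, b = fit_line(xs, on_line)
r2_on_line = r_squared(xs, on_line)
r2_scattered = r_squared(xs, scattered)

print("fitted line:", a, b)
print("r^2 on line:", r2_on_line)
print("r^2 scattered:", r2_scattered)
```

The points that lie exactly on a line have no deviations, so ${r}^{2}=1$. The scattered points deviate a lot from their best-fit line, which pushes ${r}^{2}$ close to $0$.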