7  Statistics

| Bias | Error |
| --- | --- |
| Produces prejudiced results | Results in inaccurate outcomes |
| Identified manually or through software packages | Identified through calculations |
| Occurs systematically | Occurs randomly |

\[ \operatorname{Var}(X) = E[X^2] - (E[X])^2 \]
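As a quick numerical check of the identity above, the following sketch compares the shortcut form with the definition of variance as the mean squared deviation from the mean. The sample (normal, mean 2, standard deviation 3) and the variable names are illustrative assumptions, not part of the original text.

```python
import numpy as np

# Illustrative sample; the distribution and its parameters are assumptions.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=100_000)

# Shortcut form: Var(X) = E[X^2] - (E[X])^2
var_shortcut = np.mean(x**2) - np.mean(x) ** 2

# Definition: Var(X) = E[(X - E[X])^2]
var_definition = np.mean((x - np.mean(x)) ** 2)

# NumPy's population variance (ddof=0 is the default)
var_numpy = np.var(x)

print(var_shortcut, var_definition, var_numpy)
```

All three values agree up to floating-point rounding. Note that `np.var` divides by `n` by default; passing `ddof=1` gives the unbiased sample variance instead, which will differ slightly from the shortcut form.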

| Signs of a high-bias ML model | Signs of a high-variance ML model |
| --- | --- |
| Failure to capture data trends | Noise in data set |
| Underfitting | Overfitting |
| Overly simplified | Complexity |
| High error rate | Forcing data points together |
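The contrast in the table can be illustrated with a small experiment: fit polynomials of increasing degree to noisy data and compare the error on the training points with the error on fresh points. This is only a sketch under assumed settings (a sine-shaped target, noise level 0.2, degrees 1, 3 and 12); none of these choices come from the original text.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(42)

def make_data(n):
    """Noisy samples of an assumed target function, sin(2*pi*x)."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=n)
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

train_mse, test_mse = {}, {}
for degree in (1, 3, 12):
    # Least-squares polynomial fit of the given degree
    model = Polynomial.fit(x_train, y_train, deg=degree)
    train_mse[degree] = np.mean((model(x_train) - y_train) ** 2)
    test_mse[degree] = np.mean((model(x_test) - y_test) ** 2)
    print(f"degree={degree:2d}  train MSE={train_mse[degree]:.4f}  "
          f"test MSE={test_mse[degree]:.4f}")
```

The degree-1 model is too simple to follow the curve, so its error is high even on the training data (high bias, underfitting). The degree-12 model always achieves a training error at least as low as the degree-3 model, but its test error is typically worse because it also fits the noise (high variance, overfitting).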

\[\begin{aligned}
R^{2} &= 1 - \frac{\text{Residual variance}}{\text{Total variance}} \\
&= \frac{\text{Total variance} - \text{Residual variance}}{\text{Total variance}} \\
&= \frac{\text{Explained variance}}{\text{Total variance}} \\
&= \text{Fraction of total variance explained}
\end{aligned}\]
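The chain of equalities can be checked numerically. The step from "total minus residual" to "explained" relies on the model being a least-squares fit with an intercept, which is what the sketch below assumes. The synthetic data (slope 3, intercept 2, noise standard deviation 2) are illustrative assumptions.

```python
import numpy as np

# Synthetic data for illustration only
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(scale=2.0, size=200)

# Ordinary least-squares line with intercept
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

total_var = np.var(y)             # total variance of the response
residual_var = np.var(y - y_hat)  # variance left unexplained
explained_var = np.var(y_hat)     # variance of the fitted values

r2 = 1 - residual_var / total_var
r2_alt = explained_var / total_var

# For simple linear regression, R^2 also equals the squared correlation.
r2_corr = np.corrcoef(x, y)[0, 1] ** 2

print(r2, r2_alt, r2_corr)
```

The three values coincide up to rounding. For models that are not least-squares fits with an intercept, the decomposition into explained and residual variance need not hold, and only the first line of the formula should be used as the definition.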