2.15 Bias, Variance, and Bias–Variance Tradeoff
In machine learning, when a model is trained using data, it may make errors while predicting new data.
Two major sources of prediction error are:
- Bias
- Variance
Understanding these helps us design models that generalize well to unseen data.
1. Bias
- Bias refers to the difference between the predicted values of a model and the actual values.
- It measures how far the model’s predictions are from the true values.
- If a model makes strong assumptions about the data, it may not capture the real patterns properly.
Characteristics of Bias
- High Bias
  - Model is too simple
  - Fails to learn patterns in the data
  - Leads to underfitting
- Low Bias
  - Model fits training data better
  - Captures patterns more accurately
Example
Suppose we want to predict house prices.
Actual relationship between variables may be non-linear.
If we use a simple straight line model (linear regression):
Price = a + b × Size
The model may miss complex patterns. This causes high bias.
Some algorithms make strong assumptions about the data and therefore tend to have higher bias:
- Linear Regression
- Logistic Regression
- Naive Bayes
These models may become too simple for complex problems.
Bias and Underfitting
High bias usually leads to underfitting.
Underfitting occurs when:
- Model cannot capture patterns
- Training error is high
- Test error is also high
Example:
Trying to fit a straight line to data that actually follows a curve.
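This example can be sketched in a few lines (a minimal illustration assuming NumPy; the data is invented): a straight line fitted to quadratic data leaves a large error that a quadratic fit does not.

```python
import numpy as np

# Invented data: the true relationship is a curve (quadratic), not a line.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100)
y = x**2 + rng.normal(0, 0.5, size=x.size)

# High-bias model: a straight line cannot follow the curve.
line = np.polyfit(x, y, deg=1)
mse_line = np.mean((np.polyval(line, x) - y) ** 2)

# A quadratic fit matches the true pattern, so its error is far smaller.
curve = np.polyfit(x, y, deg=2)
mse_curve = np.mean((np.polyval(curve, x) - y) ** 2)

print(f"line MSE = {mse_line:.2f}, curve MSE = {mse_curve:.2f}")
```

No matter how the line's slope and intercept are chosen, its error cannot fall below the gap between a line and the curve; that floor is the bias.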
2. Variance
- Variance refers to how much the model’s predictions change when the training data changes.
- It measures the sensitivity of the model to training data.
- If a small change in training data leads to a very different model, the model has high variance.
Characteristics of Variance
- High Variance
  - Model learns training data too well
  - Sensitive to small changes in data
  - Leads to overfitting
- Low Variance
  - Model predictions remain stable
  - Less sensitive to changes in training data
Example
Suppose we train a model using Dataset A and get one prediction model. Then we train the same model using Dataset B and get a very different model. This means the model has high variance.
Some complex models tend to have high variance:
- Decision Trees
- Deep Neural Networks
- Support Vector Machines (complex kernels)
These models can memorize training data, which leads to overfitting.
Variance and Overfitting
High variance leads to overfitting.
Overfitting occurs when:
- Model performs very well on training data
- Model performs poorly on test data
Example:
A model that perfectly memorizes training examples but fails to predict new cases.
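Memorization can be sketched as follows (assuming NumPy; the 1-nearest-neighbour rule here is an invented stand-in for any model that copies its training data): training error is exactly zero, yet error on fresh data stays high.

```python
import numpy as np

def one_nn_predict(x_train, y_train, x_new):
    """1-nearest-neighbour: return the label of the closest training point."""
    idx = np.abs(x_train[:, None] - x_new[None, :]).argmin(axis=0)
    return y_train[idx]

rng = np.random.default_rng(1)
# Two independent samples from the same noisy process y = 2x + noise.
x_train = rng.uniform(0, 10, 40); y_train = 2 * x_train + rng.normal(0, 3, 40)
x_test  = rng.uniform(0, 10, 40); y_test  = 2 * x_test  + rng.normal(0, 3, 40)

# Each training point is its own nearest neighbour, so training error is zero ...
train_mse = np.mean((one_nn_predict(x_train, y_train, x_train) - y_train) ** 2)
# ... but the memorized noise hurts on new data (high variance, overfitting).
test_mse = np.mean((one_nn_predict(x_train, y_train, x_test) - y_test) ** 2)

print(f"train MSE = {train_mse:.2f}, test MSE = {test_mse:.2f}")
```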
3. Bias–Variance Tradeoff
- The Bias–Variance Tradeoff refers to the balance between bias and variance when building a machine learning model.
- A model cannot simultaneously have very low bias and very low variance.
- We must find the optimal balance between them.
High Bias, Low Variance → Model is too simple (underfitting).
Low Bias, High Variance → Model is too complex (overfitting).
Optimal Balance → Model generalizes well.
Tradeoff
- Increasing model complexity
  - Reduces bias
  - Increases variance
- Reducing model complexity
  - Increases bias
  - Reduces variance
The goal is to find a model that gives the lowest total prediction error.
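The tradeoff above can be seen numerically (a sketch assuming NumPy; the data and the chosen polynomial degrees are invented): as model complexity grows, training error keeps falling, but test error improves only up to a point and then rises.

```python
import numpy as np

rng = np.random.default_rng(2)
x_train = np.sort(rng.uniform(-3, 3, 20))
y_train = x_train**2 + rng.normal(0, 1, 20)    # true pattern is quadratic
x_test = np.sort(rng.uniform(-3, 3, 200))
y_test = x_test**2 + rng.normal(0, 1, 200)

train_mse, test_mse = {}, {}
for deg in (1, 2, 12):  # too simple, about right, too complex
    coef = np.polyfit(x_train, y_train, deg)
    train_mse[deg] = np.mean((np.polyval(coef, x_train) - y_train) ** 2)
    test_mse[deg] = np.mean((np.polyval(coef, x_test) - y_test) ** 2)
    print(f"degree {deg:2d}: train MSE {train_mse[deg]:8.2f}, "
          f"test MSE {test_mse[deg]:8.2f}")
```

Training error always shrinks as degree grows (a bigger model can only fit the training set better), so training error alone cannot pick the right complexity; test error is what reveals the balance point.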
Total Error
Total prediction error can be expressed as:
Total Error = Bias² + Variance + Irreducible Error
Where:
- Bias² → error due to wrong assumptions
- Variance → error due to model sensitivity
- Irreducible error → random noise in data
Using the Bias–Variance Tradeoff, we choose the model complexity that minimizes this total error.
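This decomposition can be checked by simulation (a sketch assuming NumPy; the setup is invented). Fitting a straight line to many training sets drawn from a quadratic truth and collecting the predictions at one test point, the average squared error against the true value splits exactly into bias² plus variance; the irreducible noise would be added on top when comparing against noisy observations.

```python
import numpy as np

rng = np.random.default_rng(3)
f = lambda x: x**2          # true (quadratic) function
x0, n_sims = 2.5, 2000      # evaluation point, number of simulated training sets

preds = np.empty(n_sims)
for i in range(n_sims):
    x = rng.uniform(-3, 3, 30)
    y = f(x) + rng.normal(0, 0.5, 30)                # fresh noisy training set
    preds[i] = np.polyval(np.polyfit(x, y, 1), x0)   # straight-line prediction at x0

bias_sq = (preds.mean() - f(x0)) ** 2   # error from the wrong (linear) assumption
variance = preds.var()                  # sensitivity to the training sample
mean_sq_err = np.mean((preds - f(x0)) ** 2)

print(f"bias^2 = {bias_sq:.2f}, variance = {variance:.2f}, "
      f"sum = {bias_sq + variance:.2f}, MSE = {mean_sq_err:.2f}")
```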
Example
Suppose we build a model to predict student marks based on:
- Study hours
- Attendance
- Assignments
Model 1 (Very Simple)
Uses only study hours.
Result: High bias → Underfitting.
Model 2 (Very Complex)
Uses many irrelevant features.
Result: High variance → Overfitting.
Model 3 (Balanced Model)
Uses relevant features and appropriate complexity.
Result: Best predictions on new data.
The Bias-Variance Tradeoff helps us find the best model that balances complexity and prediction accuracy.
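The three models can be sketched with invented data (assuming NumPy; the feature names, coefficients, and the twenty "irrelevant" random features are all made up for illustration): Model 1 uses hours alone, Model 2 adds the irrelevant features, and Model 3 uses exactly the relevant three.

```python
import numpy as np

rng = np.random.default_rng(4)

def make_data(n):
    """Invented rule: marks depend on hours, attendance, and assignments."""
    hours = rng.uniform(0, 10, n)
    attendance = rng.uniform(50, 100, n)
    assignments = rng.uniform(0, 10, n)
    marks = 4 * hours + 0.3 * attendance + 2 * assignments + rng.normal(0, 2, n)
    irrelevant = rng.normal(0, 1, (n, 20))   # pure-noise features
    return np.column_stack([hours, attendance, assignments]), irrelevant, marks

def fit_mse(X_tr, y_tr, X_te, y_te):
    """Least-squares fit with intercept; return (train MSE, test MSE)."""
    A_tr = np.column_stack([np.ones(len(y_tr)), X_tr])
    A_te = np.column_stack([np.ones(len(y_te)), X_te])
    w, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return np.mean((A_tr @ w - y_tr) ** 2), np.mean((A_te @ w - y_te) ** 2)

X, junk, y = make_data(30)        # small training set
X2, junk2, y2 = make_data(300)    # independent test set

m1 = fit_mse(X[:, :1], y, X2[:, :1], y2)                           # hours only
m2 = fit_mse(np.column_stack([X, junk]), y,
             np.column_stack([X2, junk2]), y2)                     # too complex
m3 = fit_mse(X, y, X2, y2)                                         # balanced

for name, (tr, te) in [("Model 1", m1), ("Model 2", m2), ("Model 3", m3)]:
    print(f"{name}: train MSE {tr:7.2f}, test MSE {te:7.2f}")
```

Model 2 fits the training set at least as well as Model 3 (it has all the same features and more), yet its test error is worse, which is the tradeoff in miniature.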