1.11 Underfitting and Overfitting in Machine Learning


In machine learning, a model is considered good when it:

  1. Learns patterns from the training data.

  2. Performs well on new, unseen data.

  3. Does not simply memorize the training data.

  4. Does not ignore important patterns.

To check this, we compare performance on:

  • Training data

  • Validation or test data

Two major problems that affect performance are underfitting and overfitting. These are closely related to bias and variance.


Bias

Bias is the error caused when a model is too simple to understand the real pattern in the data.

  • High bias means the model makes strong assumptions.

  • It ignores important relationships.

  • It leads to underfitting.

Examples:
Using a straight line (linear regression) to model data that actually follows a curve.

Assuming all birds can fly: such a model ignores exceptions like ostriches and penguins.

Result:

  • Poor performance on training data.

  • Poor performance on test data.

High bias = Underfitting (usually accompanied by low variance)
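The straight-line example above can be sketched in a few lines of numpy. This is a minimal illustration, not from the original post: the quadratic dataset, noise level, and polynomial degrees are assumed choices. The point is that a line misses the curve even on the data it was trained on, while a model of the right shape does not.

```python
import numpy as np

# Toy dataset: y follows a quadratic curve plus a little noise
# (the data, noise level, and degrees here are illustrative choices)
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 40)
y = x**2 + rng.normal(0, 0.3, size=x.size)

# High-bias model: a straight line (degree-1 polynomial)
line = np.polyfit(x, y, deg=1)
mse_line = np.mean((y - np.polyval(line, x)) ** 2)

# A model of the right shape: a degree-2 polynomial
curve = np.polyfit(x, y, deg=2)
mse_curve = np.mean((y - np.polyval(curve, x)) ** 2)

# The line misses the curve even on the data it was trained on
print(f"training MSE, straight line: {mse_line:.2f}")
print(f"training MSE, quadratic:     {mse_curve:.2f}")
```

The straight line's training error stays large no matter how much data it sees, which is exactly the "poor performance on training data" symptom of high bias.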


Variance

Variance is the error caused when a model learns too much from the training data, including noise.

  • High variance means the model is too sensitive to training data.

  • It captures noise instead of the real pattern.

  • It leads to overfitting.

Example:
Fitting a very complex curve that passes through every training point.

Result:

  • Very high accuracy on training data.

  • Poor performance on test data.

High variance = Overfitting (usually accompanied by low bias)
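The "complex curve through every training point" example can also be sketched with numpy. The sine curve, noise level, and degree below are illustrative assumptions: a degree-9 polynomial interpolates 10 noisy points exactly, so the training error is essentially zero, but the error on held-out points between the training nodes is far larger.

```python
import numpy as np

# Small noisy training set plus held-out test points from the same curve
# (the sine curve, noise level, and degree are illustrative choices)
rng = np.random.default_rng(1)
x_train = np.linspace(-1, 1, 10)
y_train = np.sin(np.pi * x_train) + rng.normal(0, 0.2, 10)
x_test = np.linspace(-0.9, 0.9, 10)   # points between the training nodes
y_test = np.sin(np.pi * x_test)       # noise-free ground truth

# High-variance model: a degree-9 polynomial passes through all 10 points
coeffs = np.polyfit(x_train, y_train, deg=9)
train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)

print(f"train MSE: {train_mse:.2e}")  # essentially zero: the points are memorized
print(f"test MSE:  {test_mse:.2e}")   # far larger: memorization does not generalize
```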


Underfitting

  • Underfitting is when the model performs poorly on both the training data and new data.
  • It happens when the model is too simple to capture the actual pattern in the data.
  • Such a model cannot learn the underlying relationship in the data.
  • Imagine data points forming a curve, but the model draws only a straight line: the line does not follow the pattern.

Example: a student who didn't study enough and doesn't understand the basic formulas.

Characteristics

  • High bias

  • Low variance

  • Poor training accuracy

  • Poor testing accuracy

Reasons for Underfitting

  1. Model is too simple.

  2. Important features are missing.

  3. Very small training dataset.

  4. Too much regularization.

  5. Features are not properly scaled.

Note: An underfitting model has high bias and low variance.
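Reason 4 (too much regularization) can be demonstrated with closed-form ridge regression. This is a minimal numpy sketch with made-up numbers (slope 3, noise 0.2, the two alpha values): an extreme penalty shrinks the weight toward zero, so even perfectly linear data is underfit.

```python
import numpy as np

# Perfectly linear data: y = 3x + noise (slope and noise level are made up)
rng = np.random.default_rng(2)
x = np.linspace(-2, 2, 50)
y = 3 * x + rng.normal(0, 0.2, 50)
X = x.reshape(-1, 1)  # one centered feature, so no intercept term is needed

def ridge_fit(X, y, alpha):
    # Closed-form ridge regression: w = (X^T X + alpha * I)^-1 X^T y
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

w_mild = ridge_fit(X, y, alpha=0.01)  # mild penalty: slope stays near 3
w_huge = ridge_fit(X, y, alpha=1e5)   # extreme penalty: slope crushed toward 0

mse_mild = np.mean((X @ w_mild - y) ** 2)
mse_huge = np.mean((X @ w_huge - y) ** 2)
print(f"slope with mild penalty: {w_mild[0]:.3f}")
print(f"slope with huge penalty: {w_huge[0]:.5f}")
print(f"training MSE mild/huge:  {mse_mild:.3f} / {mse_huge:.3f}")
```

With the huge penalty the model is too constrained to follow even a simple linear pattern, so its error is high on training data as well as test data, which is the underfitting signature.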

Overfitting

  • Overfitting is when the model performs well on the training data but poorly on new data.
  • It happens when the model learns too much from the training data, including noise and outliers.
  • Such a model memorizes the training data instead of learning general patterns.



Example: a student who memorized the exact practice problems from the textbook but can't solve a slightly different problem on the actual test.

Characteristics

  • Low bias

  • High variance

  • Very high training accuracy

  • Low testing accuracy

Reasons for Overfitting

  1. Model is too complex.

  2. Too many features.

  3. Small training dataset.

  4. No regularization.

  5. Noise in training data.

Note: The overfitting model has low bias and high variance.


  • Underfitting: a straight line trying to fit a curved dataset; it cannot capture the data's patterns, leading to poor performance on both training and test sets.

  • Overfitting: a squiggly curve passing through every training point; it fails to generalize, performing well on training data but poorly on test data.

  • Appropriate fitting: a curve that follows the data trend without overcomplicating, capturing the true patterns in the data.
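The three fits above can be compared numerically. In this sketch the cubic trend, noise level, and degrees 1/3/15 are illustrative assumptions standing in for "too simple", "appropriate", and "too flexible"; the appropriately flexible model should get the lowest test error.

```python
import numpy as np

# Noisy samples of a cubic trend, plus noise-free test points from the same trend
rng = np.random.default_rng(3)
x_train = np.linspace(-1, 1, 20)
y_train = x_train**3 - x_train + rng.normal(0, 0.1, 20)
x_test = np.linspace(-0.95, 0.95, 20)
y_test = x_test**3 - x_test

results = {}
for deg in (1, 3, 15):  # too simple / appropriate / too flexible
    coeffs = np.polyfit(x_train, y_train, deg)
    results[deg] = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {deg:2d}: test MSE = {results[deg]:.4f}")
```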




Bias–Variance Tradeoff: a target board analogy

  • The center of the target = true value (correct prediction)

  • The dots = model predictions

  • Bias = how far predictions are from the center

  • Variance = how spread out the predictions are


1) Low Bias – Low Variance
  • Predictions are close to the true value and tightly grouped.
  • Model is accurate and consistent.

Performance:

  • Good training accuracy.

  • Good test accuracy.

  • Best case scenario.

This is the ideal model.

2) Low Bias – High Variance
  • Predictions are around the true value on average, but they are widely spread.
  • The model understands the pattern, but it changes a lot with small changes in the data.

Performance:

  • Very high training accuracy.

  • Poor test accuracy.

  • Model is unstable.


This is overfitting. The model learns patterns but also learns noise.


3) High Bias – Low Variance
  • Predictions are tightly grouped, but far from the true value.
  • The model is consistent, but consistently wrong.

Performance:

  • Low training accuracy.

  • Low test accuracy.

This is underfitting. The model is too simple to capture the pattern.


4) High Bias – High Variance
  • Predictions are far from the center and are also widely spread.
  • Model is inaccurate and inconsistent.

Performance:

  • Very poor training accuracy.

  • Very poor test accuracy.

This is the worst situation: the model is both wrong on average and unstable.
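The target-board picture can be turned into a small simulation: refit the same model class on many fresh training sets, record its prediction at one query point (the "dots"), and measure bias (distance of the average dot from the bullseye) and variance (spread of the dots). The quadratic truth, noise level, and degrees 1/2/9 below are assumed illustrative details, not from the post.

```python
import numpy as np

rng = np.random.default_rng(4)

def true_f(x):
    return x**2

x0 = 1.5  # single query point; the bullseye is true_f(x0) = 2.25

def preds_at_x0(deg, n_trials=300):
    # Refit the same model class on many fresh noisy training sets and
    # record its prediction at x0 each time -- the "dots" on the target
    out = np.empty(n_trials)
    for i in range(n_trials):
        x = rng.uniform(-2, 2, 15)
        y = true_f(x) + rng.normal(0, 0.3, 15)
        out[i] = np.polyval(np.polyfit(x, y, deg), x0)
    return out

stats = {}
for deg in (1, 2, 9):  # too simple / right capacity / too flexible
    p = preds_at_x0(deg)
    bias = abs(p.mean() - true_f(x0))  # distance from the bullseye
    var = p.var()                      # spread of the dots
    stats[deg] = (bias, var)
    print(f"degree {deg}: bias = {bias:.2f}, variance = {var:.3f}")
```

The too-simple model lands far from the bullseye but in a tight cluster (high bias, low variance); the too-flexible model scatters its dots widely (high variance).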

