2.17 Weight Regularization in Machine Learning
- Regularization is a technique used to control model complexity and prevent overfitting in machine learning models.
- It helps a model perform well on new data rather than memorizing the training data.
- Sometimes a model performs very well on training data but poorly on test data.
- This happens because the model memorizes the training data instead of learning general patterns. This problem is called overfitting.
- Regularization helps solve this problem by adding a penalty to the model so that it does not become too complex.
Why Regularization is Needed
In machine learning, models may become too complex when:
- There are many features
- The model tries to perfectly fit the training data
- Noise is present in the dataset
As a result:
- Training accuracy becomes very high
- Test accuracy becomes low
Regularization reduces this problem by controlling the size of model coefficients (weights).
Some forms of regularization (such as Ridge) keep all features but reduce the impact of less important ones.
Types of Regularization
There are mainly two regularization techniques:
- Ridge Regression (L2 Regularization)
- Lasso Regression (L1 Regularization)
| Technique | Effect |
|---|---|
| Ridge Regression | Shrinks coefficients but keeps all features |
| Lasso Regression | Shrinks coefficients and removes unnecessary features |
1. Ridge Regression (L2 Regularization)
- Ridge regression is a regularization technique that reduces model complexity by shrinking coefficient values.
- It introduces a penalty based on the square of coefficients.
- It is also called L2 regularization.
Cost Function : Cost = SSE + λ Σ (β²)
Where:
- λ (lambda) - regularization parameter
- β - coefficient weights
Lambda controls the strength of regularization:
| Lambda Value | Effect |
|---|---|
| Small λ | Slight regularization |
| Large λ | Strong regularization |
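The cost function above can be computed directly. The following sketch uses NumPy with small made-up data to show how the L2 penalty term λ Σ β² adds to the SSE (the data and coefficient values are illustrative assumptions, not from the text):

```python
import numpy as np

# Hypothetical data: 4 samples, 2 features (illustrative values only)
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([3.0, 3.0, 7.0, 7.0])
beta = np.array([1.0, 1.0])  # candidate coefficient vector

def ridge_cost(X, y, beta, lam):
    sse = np.sum((y - X @ beta) ** 2)   # sum of squared errors
    penalty = lam * np.sum(beta ** 2)   # L2 penalty: λ Σ β²
    return sse + penalty

print(ridge_cost(X, y, beta, lam=0.0))   # λ = 0: pure SSE, no regularization
print(ridge_cost(X, y, beta, lam=10.0))  # large λ adds a strong penalty
```

With λ = 0 the cost is just the SSE; increasing λ raises the cost of large coefficients, which is what pushes the optimizer toward smaller weights.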
Important Characteristics
- Reduces coefficient values
- Does not eliminate features
- Keeps all features in the model
- Helps reduce multicollinearity
Example
Suppose we are predicting house prices using features:
- Size
- Number of rooms
- Location
- Age of house
If some features are less important, Ridge regression reduces their coefficients, but does not remove them.
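A short scikit-learn sketch can illustrate this shrinkage. The synthetic data below stands in for the house-price scenario (4 features, only two of which truly matter); the feature values and the `alpha` setting (scikit-learn's name for λ) are assumptions for demonstration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)
# Synthetic "house price" data: 4 features, only the first two matter
X = rng.normal(size=(200, 4))
y = 5.0 * X[:, 0] + 3.0 * X[:, 1] + 0.1 * X[:, 2] + rng.normal(scale=0.5, size=200)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)  # alpha plays the role of λ

print("OLS coefficients:  ", np.round(ols.coef_, 3))
print("Ridge coefficients:", np.round(ridge.coef_, 3))
# Ridge coefficients are smaller in magnitude, but none are exactly zero
```

Comparing the two printouts shows the key Ridge property from the list above: every coefficient shrinks, but all four features stay in the model.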
2. Lasso Regression (L1 Regularization)
- Lasso regression is a regularization technique that reduces model complexity and performs feature selection.
- It adds a penalty based on the absolute value of coefficients.
- It is also called L1 regularization.
Cost Function : Cost = SSE + λ Σ |β|
Key Characteristics
- Shrinks coefficients toward zero
- Some coefficients become exactly zero
- Performs automatic feature selection
Example
Suppose we have 10 features for predicting salary.
Lasso regression may reduce some coefficients to zero, meaning those features are removed from the model. This helps simplify the model.
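The 10-feature salary scenario can be sketched with scikit-learn's `Lasso`. The synthetic data and the `alpha` value (scikit-learn's λ) are illustrative assumptions; only two of the ten features actually drive the target, so Lasso should zero out most of the rest:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
# Synthetic "salary" data: 10 features, but only features 0 and 1 drive y
X = rng.normal(size=(100, 10))
y = 4.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=100)

lasso = Lasso(alpha=0.5).fit(X, y)  # alpha plays the role of λ

print("Coefficients:", np.round(lasso.coef_, 3))
print("Features kept:", int(np.sum(lasso.coef_ != 0)))
# Irrelevant coefficients are driven exactly to zero: automatic feature selection
```

Unlike Ridge in the previous example, some coefficients land at exactly zero, which is the automatic feature selection described above.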
Example of Regularization in Real Life
Suppose we build a model to predict student performance.
Features include:
- Study hours
- Attendance
- Social media usage
- Sleep time
- Class participation
Some features may not strongly affect performance.
Regularization reduces the impact of unnecessary features, improving model performance.
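Putting the two techniques side by side on this scenario makes the contrast concrete. The sketch below uses synthetic stand-ins for the five student features (the data, the assumption that the last two features are noise, and both `alpha` values are hypothetical):

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(1)
# Hypothetical student-performance data: 5 features, the last two are noise
X = rng.normal(size=(150, 5))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(scale=0.5, size=150)

ridge = Ridge(alpha=5.0).fit(X, y)
lasso = Lasso(alpha=0.3).fit(X, y)

for name, coef in [("Ridge", ridge.coef_), ("Lasso", lasso.coef_)]:
    print(f"{name}: {np.round(coef, 3)}")
# Ridge shrinks every coefficient a little; Lasso zeros out the noise features
```

Both models reduce the influence of the unhelpful features, but only Lasso removes them outright.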
Advantages of Regularization
- Prevents overfitting
- Improves model generalization
- Handles multicollinearity
- Reduces model complexity
- Improves prediction accuracy on unseen data