2.12 Regression Model Evaluation Methods
- Regression is a machine learning technique used to predict continuous (numerical) values such as price, temperature, salary, marks, etc.
- In regression, we try to find the relationship between:
- Independent variables (X): the input variables
- Dependent variable (Y): the output variable
Example:
Predicting house price based on size, location, and number of rooms.
Since predicted values may not be exactly equal to actual values, we measure the error between them. These errors help us understand how well the regression model performs.
To measure these errors, we use evaluation metrics.
Common regression evaluation metrics include:
1. Mean Absolute Error (MAE)
2. Mean Squared Error (MSE)
3. Root Mean Squared Error (RMSE)
4. R-Squared (R²)
1. Mean Absolute Error (MAE)
- Mean Absolute Error is the average of the absolute differences between predicted values and actual values.
- It measures how far the predictions are from the real values on average.
Formula
MAE = (1/n) Σ |yi − ŷi|
Where:
- n = number of data points
- yi = actual value
- ŷi = predicted value
- | | = absolute value (ignores the negative sign)
Explanation
- Calculate the difference between actual and predicted values.
- Convert the difference into absolute value.
- Add all the absolute errors.
- Divide by the number of observations.
Example
Suppose the absolute errors for three predictions are 10, 10 and 5 (for instance, actual values 100, 50, 30 with predicted values 90, 60, 25):
MAE = (10 + 10 + 5) / 3
MAE ≈ 8.33
Interpretation
On average, the prediction error is 8.33 units.
Advantages
- Easy to understand and interpret
- Less sensitive to outliers than MSE
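The MAE calculation above can be sketched in a few lines of Python; the actual and predicted values here are assumed purely for illustration:

```python
# Compute MAE by hand; the data values are made up for illustration.
actual = [100, 50, 30]
predicted = [90, 60, 25]

# Step 1-2: differences converted to absolute values.
errors = [abs(y - y_hat) for y, y_hat in zip(actual, predicted)]  # [10, 10, 5]

# Step 3-4: sum the absolute errors and divide by the number of observations.
mae = sum(errors) / len(errors)
print(round(mae, 2))  # 8.33
```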
2. Mean Squared Error (MSE)
- Mean Squared Error is the average of the squared differences between actual values and predicted values.
Formula
MSE = (1/n) Σ (yi − ŷi)²
Where:
- n = number of data points
- yi = actual value
- ŷi = predicted value
Explanation
- Find the difference between actual and predicted values.
- Square the difference.
- Add all squared values.
- Divide by the number of observations.
Example
Using the same errors as before (10, 10, 5), the squared errors are 100, 100 and 25:
MSE = (100 + 100 + 25) / 3
MSE = 75
Interpretation
The average squared prediction error is 75.
Advantages
- Penalizes large errors more strongly
- Widely used in machine learning algorithms
Disadvantage
- Harder to interpret because the units are squared.
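The MSE steps can be sketched the same way; again, the data values are assumed for illustration:

```python
# Compute MSE by hand; the data values are made up for illustration.
actual = [100, 50, 30]
predicted = [90, 60, 25]

# Step 1-2: squared differences between actual and predicted values.
squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]  # [100, 100, 25]

# Step 3-4: sum the squared errors and divide by the number of observations.
mse = sum(squared_errors) / len(squared_errors)
print(mse)  # 75.0
```

Note how the single large errors (10) contribute 100 each after squaring, which is why MSE penalizes big mistakes more strongly than MAE.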
3. Root Mean Squared Error (RMSE)
- Root Mean Squared Error is the square root of the Mean Squared Error (MSE).
- It measures how far the predicted values are from the actual values in the same unit as the data.
Formula
RMSE = √( (1/n) Σ (yi − ŷi)² )
Where:
- n = number of data points
- yi = actual value
- ŷi = predicted value
Example
Taking the square root of the MSE computed above:
RMSE = √75 ≈ 8.66
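A minimal RMSE sketch, reusing the same illustrative values as before:

```python
import math

# Compute RMSE by hand; the data values are made up for illustration.
actual = [100, 50, 30]
predicted = [90, 60, 25]

mse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)

# The square root brings the error back to the same unit as the data.
rmse = math.sqrt(mse)
print(round(rmse, 2))  # 8.66
```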
4. R-Squared (Coefficient of Determination)
- R² measures how well the regression model explains the variation in the dependent variable.
Formula
R² = 1 − (SSres / SStotal)
Where:
- SSres = sum of squared residuals, Σ (yi − ŷi)²
- SStotal = total sum of squares, Σ (yi − ȳ)², where ȳ is the mean of the actual values
Range
R² normally ranges from 0 to 1, where higher values indicate a better fit. (It can be negative when the model fits worse than simply predicting the mean.)
Example
If R² = 0.85, the model explains 85% of the variation in the dependent variable.
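The R² formula can be sketched with the same illustrative values used for the earlier metrics:

```python
# Compute R² by hand; the data values are made up for illustration.
actual = [100, 50, 30]
predicted = [90, 60, 25]

mean_y = sum(actual) / len(actual)  # mean of the actual values

# SSres: squared residuals between actual and predicted values.
ss_res = sum((y, )[0] - y_hat for y, y_hat in [])  # placeholder removed below
ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))

# SStotal: squared deviations of actual values from their mean.
ss_tot = sum((y - mean_y) ** 2 for y in actual)

r2 = 1 - ss_res / ss_tot
print(round(r2, 3))
```

A value close to 1 means the predictions track the actual values much better than a constant prediction of the mean would.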
Regression evaluation metrics help us measure the performance of a regression model.
- MAE measures the average absolute error.
- MSE measures the average squared error.
- RMSE shows the error in the original unit of the data.
- R² shows how well the model explains the variation in the data.