Posts

2.17 Weight Regularization in Machine Learning

Regularization is a technique used to control model complexity and prevent overfitting in machine learning models. It helps a model perform well on new data rather than memorizing the training data. Sometimes a model performs very well on training data but poorly on test data; this happens because the model memorizes the training data instead of learning general patterns. This problem is called overfitting. Regularization helps solve it by adding a penalty to the model so that it does not become too complex.

Why Regularization is Needed

In machine learning, models may become too complex when:
- There are many features
- The model tries to perfectly fit the training data
- Noise is present in the dataset

As a result, training accuracy becomes very high while test accuracy becomes low. Regularization reduces this...
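As a minimal sketch of the idea (not taken from the post), the snippet below fits a one-feature, no-intercept linear model with an L2 (ridge) penalty, using made-up data and penalty values. The closed-form weight shows how a larger penalty shrinks the weight toward zero:

```python
# L2 (ridge) regularization for y ≈ w * x with no intercept.
# Minimizing sum((y - w*x)^2) + lam * w^2 gives the closed form:
#   w = sum(x*y) / (sum(x^2) + lam)

def ridge_weight(xs, ys, lam):
    """Penalized least-squares weight; lam is the L2 penalty strength."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]  # roughly y = 2x with noise (illustrative data)

for lam in [0.0, 1.0, 10.0]:
    print(lam, round(ridge_weight(xs, ys, lam), 3))
# Larger lam gives a smaller |w|: the penalty pulls the weight toward zero,
# which is exactly how regularization keeps the model from growing complex.
```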

2.16 Support Vector Machine (SVM)

Support Vector Machine (SVM) is a supervised machine learning algorithm used for both classification and regression problems. However, SVM is mostly used for classification tasks. The goal of SVM is to find the best boundary that separates data into different classes. This boundary is called a hyperplane; it divides the dataset so that new data points can be classified correctly.

Hyperplane
A hyperplane is a decision boundary that separates different classes. The dimension of the hyperplane depends on the number of features. Example: with two features (height and weight), the hyperplane is a straight line separating the two classes.

Support Vectors
Support vectors are the data points closest to the hyperplane. These points are important because they determine the position of the hyperplane; if the support vectors move, the hyperplane moves too.

Margin
The margin is the distance between: The hy...
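A toy sketch of these definitions (not a trained SVM; the hyperplane and points are made up): given a fixed 2-D hyperplane w·x + b = 0, we classify points by which side they fall on, measure each point's perpendicular distance to the hyperplane, and pick the closest points as the support vectors:

```python
import math

# Hypothetical fixed hyperplane x + y - 3 = 0 (not learned from data).
w = (1.0, 1.0)   # hyperplane normal vector
b = -3.0         # offset

points = [(1.0, 1.0), (0.0, 1.0), (3.0, 3.0), (2.0, 2.5)]

def distance(p):
    """Perpendicular distance from point p to the hyperplane."""
    return abs(w[0] * p[0] + w[1] * p[1] + b) / math.hypot(*w)

def side(p):
    """Class label: which side of the hyperplane p lies on."""
    return 1 if w[0] * p[0] + w[1] * p[1] + b > 0 else -1

dists = {p: distance(p) for p in points}
margin = min(dists.values())                 # distance to the nearest point
support_vectors = [p for p, d in dists.items() if abs(d - margin) < 1e-9]
print(margin, support_vectors)
# The support vectors are the nearest points; moving them would move the
# best separating hyperplane, while distant points would not.
```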

2.15 Bias, Variance, and Bias–Variance Tradeoff

In machine learning, when a model is trained using data, it may make errors while predicting new data. Two major sources of prediction error are bias and variance. Understanding these helps us design models that generalize well to unseen data.

1. Bias
Bias refers to the difference between the predicted values of a model and the actual values. It measures how far the model's predictions are from the true values. If a model makes strong assumptions about the data, it may not capture the real patterns properly.

Characteristics of Bias
High Bias:
- Model is too simple
- Fails to learn patterns in the data
- Leads to underfitting
Low Bias:
- Model fits the training data better
- Captures patterns more accurately

Example
Suppose we want to predict house prices. The actual relationship between the variables may be non-linear...
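The tradeoff can be seen numerically with a deliberately simple sketch (all numbers illustrative): we estimate a known true value from small noisy samples using two estimators, a flexible one (the plain sample mean) and an over-simple one (the mean shrunk halfway to zero), then measure each estimator's bias and variance over many repetitions:

```python
import random

random.seed(0)
THETA = 10.0  # the true value we are trying to estimate

def experiment(estimator, trials=2000, n=5):
    """Return (bias, variance) of an estimator over repeated noisy samples."""
    estimates = []
    for _ in range(trials):
        sample = [random.gauss(THETA, 4.0) for _ in range(n)]
        estimates.append(estimator(sample))
    mean_est = sum(estimates) / len(estimates)
    bias = mean_est - THETA
    var = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)
    return bias, var

# Flexible estimator: plain sample mean (low bias, higher variance).
bias_a, var_a = experiment(lambda s: sum(s) / len(s))
# Over-simple estimator: mean shrunk halfway to 0 (high bias, low variance),
# analogous to a model that is too simple for the data.
bias_b, var_b = experiment(lambda s: 0.5 * sum(s) / len(s))

print(f"flexible: bias={bias_a:.2f}, variance={var_a:.2f}")
print(f"simple:   bias={bias_b:.2f}, variance={var_b:.2f}")
# The simple estimator trades variance for bias: |bias| goes up,
# variance goes down. This is the bias-variance tradeoff in miniature.
```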

2.14 Hypothesis Testing (Statistics & Machine Learning)

A hypothesis is an assumption or statement about a population or data. In machine learning, a hypothesis represents the relationship between input variables (features) and output variables (target values) that a model learns from data. The goal of machine learning is to find the best hypothesis that can accurately predict results on new or unseen data.

Hypothesis testing is a statistical method used to evaluate assumptions about a population using sample data. It helps us decide whether a statement about the data is likely true or false. Hypothesis testing involves two main hypotheses:
- Null Hypothesis (H₀)
- Alternative Hypothesis (H₁ or Ha)

1. Null Hypothesis (H₀)
The null hypothesis is the initial assumption that there is no significant difference or relationship between variables. It represents the default or status-quo statement.

Example
A company claims that its average daily production is 50 units...
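The production example can be sketched as a two-sided z-test (all sample numbers and the population standard deviation are made up for illustration, and we assume σ is known so the usual 1.96 critical value applies at the 5% level):

```python
import math

# H0: mean daily production = 50 units (the company's claim)
# H1: mean daily production != 50 units
CLAIMED_MEAN = 50.0
SIGMA = 3.0     # assumed known population standard deviation (hypothetical)
Z_CRIT = 1.96   # two-sided critical value for alpha = 0.05

sample = [47.0, 51.0, 46.0, 48.0, 49.0, 45.0, 47.5, 46.5]  # hypothetical data

n = len(sample)
sample_mean = sum(sample) / n
# z-statistic: how many standard errors the sample mean is from the claim.
z = (sample_mean - CLAIMED_MEAN) / (SIGMA / math.sqrt(n))
reject_h0 = abs(z) > Z_CRIT

print(f"sample mean = {sample_mean:.2f}, z = {z:.2f}, reject H0: {reject_h0}")
# Here |z| exceeds 1.96, so with these made-up numbers we would reject H0
# and conclude the true average differs from the claimed 50 units.
```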

2.13 Cross Validation in Machine Learning

Cross validation is a statistical method used to evaluate the performance of a machine learning model. It helps us understand how well the model will perform on unseen or new data. When a model is trained using a dataset, it may perform very well on that dataset but fail when predicting new data; cross validation helps prevent this problem.

Cross validation helps to:
- Evaluate model performance more accurately
- Prevent overfitting
- Select the best machine learning model
- Improve model reliability and generalization

Example: a model shows 95% accuracy on training data, but when tested on new data it gives 65% accuracy. This means the model memorized the training data instead of learning patterns. Cross validation helps detect such problems.

Overfitting
Overfitting occurs when the model learns the training data too well, including noise and unnecessary details. As a result, the model performs:
- Very well on training data
- Poo...
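A minimal k-fold cross validation loop from scratch (illustrative only; the "model" is just a mean predictor scored with mean squared error, and the data values are made up): split the data into k folds, hold out one fold at a time, fit on the rest, score on the held-out fold, and average the fold scores:

```python
# k-fold cross validation from scratch with a trivial mean-predictor model.
def k_fold_mse(ys, k=5):
    n = len(ys)
    fold_size = n // k
    scores = []
    for i in range(k):
        test = ys[i * fold_size:(i + 1) * fold_size]       # held-out fold
        train = ys[:i * fold_size] + ys[(i + 1) * fold_size:]
        pred = sum(train) / len(train)                     # "fit" the model
        mse = sum((y - pred) ** 2 for y in test) / len(test)
        scores.append(mse)                                 # score this fold
    return sum(scores) / k                                 # average score

data = [2.0, 4.0, 3.0, 5.0, 4.5, 3.5, 4.0, 2.5, 5.5, 3.0]
print(round(k_fold_mse(data, k=5), 3))
# Every point is used for testing exactly once, so the averaged score is a
# fairer estimate of performance on unseen data than training accuracy alone.
```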