ML 1.2 Supervised Learning in Machine Learning
- Supervised learning is a machine learning approach where a model is trained using labeled data.
- Each training example includes input data and the correct output.
- The model learns the relationship between inputs and outputs and then uses that learning to make predictions on new, unseen data.
- In simple terms, the system learns under guidance, similar to how a student learns from a teacher who provides correct answers.
How Supervised Learning Works
- Collect labeled data (input + correct output)
- Split the data into training and testing sets
- Train the model using the training data
- Test the model on unseen data
- Use the model for real-time predictions
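The steps above can be sketched end to end in a few lines of Python. This is a minimal illustration with a hand-rolled least-squares fit; the hours-studied/marks data are invented for the example (a real workflow would use a library such as scikit-learn):

```python
# Minimal supervised-learning workflow: collect, split, train, test, predict.
# The tiny dataset below is invented for illustration.

# 1. Collect labeled data: hours studied (input) -> exam marks (output)
data = [(1, 20), (2, 30), (3, 40), (4, 50), (5, 60), (6, 70)]

# 2. Split into training and testing sets
train, test = data[:4], data[4:]

# 3. Train: fit y = m*x + b by ordinary least squares
n = len(train)
mean_x = sum(x for x, _ in train) / n
mean_y = sum(y for _, y in train) / n
m = sum((x - mean_x) * (y - mean_y) for x, y in train) / \
    sum((x - mean_x) ** 2 for x, _ in train)
b = mean_y - m * mean_x

# 4. Test on unseen data: compare true marks with predictions
for x, y_true in test:
    print(x, y_true, m * x + b)

# 5. Predict for a new input
print(m * 7 + b)  # predicted marks for 7 hours of study (80.0 with this data)
```

Because the toy data lie exactly on a line, the test predictions match the true values; with real data there would be some prediction error.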
Real-Time Examples of Supervised Learning
1. Email Spam Detection (Classification)
- Input: Email content, sender, keywords
- Output: Spam or Not Spam
- Real-time use: Gmail filters spam emails automatically based on past labeled emails.
2. Student Performance Prediction (Regression)
- Input: Attendance, internal marks, assignment scores
- Output: Final exam marks
- Real-time use: Colleges predict student results and identify students who need academic support.
3. Face Recognition (Classification)
- Input: Image pixels
- Output: Person name or Unknown
- Real-time use: Mobile phone face unlock and attendance systems.
4. House Price Prediction (Regression)
- Input: Location, area size, number of rooms
- Output: House price
- Real-time use: Real estate websites estimate property prices instantly.
5. Medical Diagnosis (Classification)
- Input: Symptoms, test results
- Output: Disease type or No disease
- Real-time use: Decision-support systems assist doctors in diagnosis.
6. Credit Approval Systems (Classification)
- Input: Income, credit score, employment history
- Output: Loan Approved or Rejected
- Real-time use: Banks evaluate loan applications instantly.
Common Algorithms Used in Supervised Learning
- Linear Regression
- Logistic Regression
- Decision Tree
- Random Forest
- Support Vector Machine (SVM)
- k-Nearest Neighbors (KNN)
- Naive Bayes
Types of Supervised Learning
1. Classification
Used when the output is a category or class.
Examples of output:
- Yes / No
- Spam / Not Spam
- Pass / Fail
Classification in Machine Learning
- Classification is a supervised learning technique used to assign input data to predefined categories or classes based on its input features.
- The model learns from labeled examples and then predicts the class label for new data.
- In short, classification answers the question: "Which category does this data belong to?"
- Classification teaches a machine to sort things into categories; the output labels are discrete values.
- Classification can be binary, where the output is one of two possible classes, or multiclass, where the output is one of several classes.
- Common classification algorithms include Logistic Regression, Naive Bayes, Decision Tree, Support Vector Machine (SVM), and k-Nearest Neighbors (KNN).
How Classification Works
- Collect labeled data (input + class label)
- Train the model to learn patterns
- Provide new input data
- The model predicts the most suitable class
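The loop above can be sketched with a hand-rolled one-nearest-neighbour classifier; the two-feature points and the "spam"/"not spam" clusters are invented for illustration (in practice a library classifier such as scikit-learn's KNeighborsClassifier would be used):

```python
# Tiny 1-nearest-neighbour classifier: predict the class of the closest
# training point. Features and labels are invented for illustration.

# Labeled data: (feature1, feature2) -> class label
train = [((1.0, 1.0), "spam"), ((1.2, 0.8), "spam"),
         ((4.0, 4.2), "not spam"), ((3.8, 4.5), "not spam")]

def predict(point):
    """Return the label of the nearest training example (squared distance)."""
    def sq_dist(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    nearest = min(train, key=lambda ex: sq_dist(ex[0], point))
    return nearest[1]

print(predict((1.1, 0.9)))  # close to the "spam" cluster
print(predict((4.1, 4.0)))  # close to the "not spam" cluster
```

New inputs near the first cluster are classified "spam" and inputs near the second "not spam", which is exactly the "model predicts the most suitable class" step.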
Types of Classification
1. Binary Classification
Only two possible classes.
Examples:
- Yes / No
- True / False
- Pass / Fail
Real-time examples:
- Email: Spam or Not Spam
- Loan: Approved or Rejected
- Medical test: Disease or No Disease
2. Multiclass Classification
More than two possible classes.
Examples:
- Grades: A, B, C, D
- Traffic signals: Red, Yellow, Green
Real-time examples:
- Handwritten digit recognition (0 to 9)
- Student grade prediction
- Language detection (English, Hindi, Telugu)
3. Multi-label Classification
One input can belong to multiple classes at the same time.
- It is used when there are two or more classes and the data we want to classify may belong to none of the classes, some of them, or all of them at the same time, e.g. identifying which traffic signs appear in an image.
- Multi-label classification allows data points to belong to multiple classes.
Real-time examples:
- A movie tagged as Action, Drama, and Thriller
- A news article labeled as Politics and Economy
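Multi-label output is usually represented as one independent yes/no decision per label (the one-vs-rest idea). A toy keyword-based movie tagger, with rules invented purely for illustration, shows how one input can receive several labels or none:

```python
# Multi-label classification: each input can receive zero or more labels.
# Each label has its own independent yes/no rule -- a toy stand-in for
# training one binary classifier per label (one-vs-rest).

LABEL_KEYWORDS = {
    "Action":   {"fight", "chase", "explosion"},
    "Drama":    {"family", "loss", "relationship"},
    "Thriller": {"suspense", "chase", "mystery"},
}

def tag(description):
    words = set(description.lower().split())
    # One independent decision per label; labels are not mutually exclusive.
    return {label for label, kws in LABEL_KEYWORDS.items() if words & kws}

print(tag("a chase full of suspense and one explosion"))  # Action and Thriller
print(tag("a quiet documentary"))  # matches no label: empty set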
Common Classification Algorithms
- Logistic Regression
- Decision Tree
- Random Forest
- Support Vector Machine (SVM)
- Naive Bayes
- k-Nearest Neighbors (KNN)
Real-Time Classification Examples Explained
1. Email Spam Detection
- Input: Email text, sender details
- Classes: Spam, Not Spam
- Type: Binary Classification
2. Student Result Prediction
- Input: Attendance, marks, assignments
- Classes: Pass, Fail
- Type: Binary Classification
3. Face Recognition
- Input: Image pixels
- Classes: Person A, Person B, Unknown
- Type: Multiclass Classification
4. Disease Diagnosis
- Input: Symptoms, test reports
- Classes: Diabetes, Heart Disease, Normal
- Type: Multiclass Classification
5. Customer Feedback Analysis
- Input: Review text
- Classes: Positive, Neutral, Negative
- Type: Multiclass Classification
2. Regression
Used when the output is a numerical value.
Regression is a supervised learning technique used to predict a continuous numerical value based on input features.
Unlike classification, which predicts categories, regression predicts quantities.
In simple terms, regression answers the question: "How much?" or "What value?"
Examples of output:
- Price
- Temperature
- Marks
The different regression algorithms in machine learning are: Linear Regression, Polynomial Regression, Ridge Regression, Decision Tree Regression, Random Forest Regression, Support Vector Regression, etc.
There are two types of variables in regression:
- Dependent variable (target): the variable we are trying to predict, e.g. house price.
- Independent variables (features): the input variables that influence the prediction, e.g. locality, number of rooms.
The linear regression model provides a sloped straight line representing the relationship between the variables.
How Regression Works
- Collect labeled data where the output is a number
- The model learns the relationship between the inputs and the numerical output
- For new input data, the model predicts a value
Simple Example to Understand Regression
House Price Prediction
- Input features: area (square feet), number of bedrooms, location
- Output: house price (in rupees)
If a model is trained on past house sales data, it can predict the price of a new house based on its features.
This is regression because the output is a number, not a category.
Real-Time Regression Examples
1. Student Marks Prediction
- Input: Attendance percentage, internal marks, assignment scores
- Output: Final exam marks
- Use case: Identifying students who may need academic support
2. Weather Forecasting
- Input: Temperature, humidity, wind speed
- Output: Predicted temperature or rainfall amount
3. Sales Forecasting
- Input: Past sales data, season, promotions
- Output: Expected sales value for next month
4. Salary Prediction
- Input: Experience, education level, skills
- Output: Estimated salary
5. Stock Price Estimation
- Input: Historical prices, volume, trends
- Output: Predicted stock price
Types of Regression
1. Linear Regression
Simple linear regression describes how one variable, the dependent variable, changes with respect to a single independent variable.
The relationship between the dependent and independent variables is represented by the simple linear equation:
y = mx + b
where m is the slope and b is the intercept.
- The output changes linearly with the input
- Example: Salary increases with experience
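For reference, the values of m and b that minimize the sum of squared errors have a standard closed form (a well-known result, not stated in the original notes), where x̄ and ȳ are the means of the inputs and outputs:

```latex
m = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
b = \bar{y} - m\,\bar{x}
```

These are the formulas a library's "fit" step evaluates (or approximates) when training a simple linear regression model.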
2. Multiple Linear Regression
- It extends simple linear regression by using multiple independent variables to predict the target variable.
- Multiple linear regression models the relationship between two or more features and a response by fitting a linear equation to the observed data.
- We can use it to find out which factor has the highest impact on the predicted output and how the different variables relate to each other.
- Example: predicting the price of a house from multiple features such as size, location, and number of rooms.
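A minimal sketch of this fit, using made-up house data with two features (area and rooms). The normal equations are solved with Cramer's rule purely so the example is self-contained; in practice a library routine (e.g. numpy.linalg.lstsq or scikit-learn's LinearRegression) does this step:

```python
# Multiple linear regression with two features, fit by solving the
# normal equations (X^T X) beta = X^T y with Cramer's rule.
# Data are invented and exactly planar: price = 20 + 50*area + 10*rooms.

rows = [  # (area in 100 m^2, number of rooms, price in lakhs)
    (1, 2, 90), (2, 3, 150), (3, 3, 200), (2, 4, 160), (4, 5, 270),
]

# Design matrix with an intercept column: price = b0 + b1*area + b2*rooms
X = [(1.0, a, r) for a, r, _ in rows]
y = [p for _, _, p in rows]
A = [[sum(xi[i] * xi[j] for xi in X) for j in range(3)] for i in range(3)]
v = [sum(xi[i] * yi for xi, yi in zip(X, y)) for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

d = det3(A)
beta = []
for col in range(3):
    Ai = [row[:] for row in A]      # replace one column with v (Cramer)
    for i in range(3):
        Ai[i][col] = v[i]
    beta.append(det3(Ai) / d)

print(beta)  # recovers [20.0, 50.0, 10.0]: intercept, area and rooms effects
```

Because the invented prices lie exactly on a plane, the fit recovers the generating coefficients; real data would give a best-fit approximation instead.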
3. Polynomial Regression
- Polynomial regression is a form of linear regression in which the relationship between the independent variable x and the dependent variable y is modelled as an nth-degree polynomial. It fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y | x).
- While simple linear regression models the relationship as a straight line, polynomial regression allows for more flexibility by fitting a polynomial curve to the data.
- The general form of a polynomial regression equation of degree n is:
  y = β0 + β1x + β2x^2 + … + βnx^n + e
  where:
  - y is the dependent variable
  - x is the independent variable
  - β0, β1, …, βn are the coefficients of the polynomial terms
  - n is the degree of the polynomial
  - e represents the error term
- Models non-linear relationships
- Example: Growth rate that accelerates over time
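A small sketch of the degree-2 case. The three points are invented to lie exactly on the parabola y = 1 + x + x^2, so fitting reduces to solving one equation per point; Gaussian elimination is hand-rolled here only to keep the example dependency-free:

```python
# Polynomial regression sketch: fit y = b0 + b1*x + b2*x^2 to data that
# lie exactly on a parabola (points invented: y = 1 + x + x^2).

points = [(0, 1), (1, 3), (2, 7)]

# One equation b0 + b1*x + b2*x^2 = y per point (augmented matrix).
aug = [[1.0, x, x * x, float(y)] for x, y in points]

# Solve the 3x3 system by Gauss-Jordan elimination.
for i in range(3):
    pivot = aug[i][i]
    aug[i] = [a / pivot for a in aug[i]]       # scale pivot row to 1
    for j in range(3):
        if j != i:
            factor = aug[j][i]                 # clear column i elsewhere
            aug[j] = [a - factor * b for a, b in zip(aug[j], aug[i])]

b0, b1, b2 = (row[3] for row in aug)
print(b0, b1, b2)            # coefficients of the fitted parabola: 1, 1, 1
print(b0 + b1 * 3 + b2 * 9)  # prediction at x = 3 -> 13.0
```

The curve it recovers is exactly the generating polynomial; with noisy real data and more points one would solve the least-squares version of this system instead.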
4. Ridge & Lasso Regression
- Ridge and lasso regression are regularized versions of linear regression that help avoid overfitting by penalizing large coefficients.
- These algorithms are used when there is a risk of overfitting due to too many features.
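The shrinking effect is easiest to see in one dimension. For a single feature with no intercept, ridge regression has the closed form m = Σxy / (Σx² + λ) (lasso has no such simple closed form, so only ridge is sketched); the data below are invented, roughly y = 3x with a little noise:

```python
# Ridge regression in one dimension (no intercept): closed form is
#   m = sum(x*y) / (sum(x^2) + lam)
# The penalty lam shrinks the coefficient toward zero; lam = 0 recovers
# ordinary least squares. Data invented: y ~ 3*x with small noise.

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 5.9, 9.2, 11.8]

def ridge_slope(lam):
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

print(ridge_slope(0.0))   # ordinary least-squares slope, close to 3
print(ridge_slope(10.0))  # shrunk toward zero by the penalty
```

Increasing the penalty always moves the coefficient toward zero, which is exactly how regularization trades a little bias for less overfitting.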
5. Support Vector Regression (SVR)
- SVR is a regression algorithm based on the Support Vector Machine (SVM) algorithm.
- SVM is mainly used for classification tasks, but it can also be adapted for regression.
- SVR fits a function that keeps predictions within a margin of tolerance (epsilon) around the actual values, penalizing only the points that fall outside that margin.
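The distinguishing piece of SVR is its epsilon-insensitive loss: errors inside the tolerance band cost nothing, and only points outside it are penalized. A tiny sketch with invented values:

```python
# The key idea of SVR: the epsilon-insensitive loss. Errors inside a
# tolerance band of width epsilon cost nothing; only points outside the
# band are penalized. Values below are invented for illustration.

def eps_insensitive_loss(y_true, y_pred, eps=0.5):
    return max(0.0, abs(y_true - y_pred) - eps)

print(eps_insensitive_loss(10.0, 10.3))  # inside the band -> 0.0
print(eps_insensitive_loss(10.0, 11.2))  # outside -> penalty of about 0.7
```

Contrast this with ordinary least squares, where every residual, however small, contributes to the loss.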
6. Decision Tree Regression
- A decision tree uses a tree-like structure to make decisions, where each branch of the tree represents a decision and the leaves represent outcomes.
- For example, predicting customer behaviour based on features like age and income.
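A decision tree reduced to a single split (a "stump") shows the core mechanic: pick the threshold that minimizes squared error, then predict the mean of each side. The age/income pairs are invented, with values that jump around age 30:

```python
# Decision tree regression, reduced to a single split (a "stump"):
# choose the threshold that minimizes squared error, then predict the
# mean of each side. Data invented: income jumps at around age 30.

data = [(22, 20), (25, 22), (28, 21), (35, 50), (40, 52), (45, 51)]

def fit_stump(pairs):
    best = None
    xs = sorted(x for x, _ in pairs)
    for t in xs[1:]:  # candidate thresholds between data points
        left = [y for x, y in pairs if x < t]
        right = [y for x, y in pairs if x >= t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - ml) ** 2 for y in left)
               + sum((y - mr) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return t, ml, mr

t, left_mean, right_mean = fit_stump(data)
print(t, left_mean, right_mean)  # split at 35; leaves predict 21.0 and 51.0
```

A full regression tree simply applies this split search recursively inside each leaf until a stopping rule is met.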
7. Random Forest Regression
- Random forest is an ensemble method that builds multiple decision trees, where each tree is trained on a different subset of the training data. The final prediction is made by averaging the predictions of all the trees.
- For example, forecasting customer sales from historical data.
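The aggregation step is just an average. The per-tree predictions below are invented stand-ins for real trees trained on different bootstrap samples:

```python
# The core of random forest regression: average the predictions of many
# trees. The per-tree values below are invented stand-ins for real trees
# trained on different bootstrap samples of the data.

tree_predictions = [248.0, 262.0, 255.0, 251.0]  # one prediction per tree

forest_prediction = sum(tree_predictions) / len(tree_predictions)
print(forest_prediction)  # 254.0
```

Averaging many trees that each overfit differently cancels much of their individual error, which is why the forest is usually more accurate than any single tree.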
Common Regression Algorithms
- Linear Regression
- Multiple Linear Regression
- Polynomial Regression
- Decision Tree Regression
- Random Forest Regression
- Support Vector Regression (SVR)
Regression vs Classification (Quick View)
| Aspect | Regression | Classification |
|---|---|---|
| Output | Numerical value | Category |
| Example | Price, marks | Spam/Not spam |
| Question answered | How much? | Which class? |