2.2 Perceptron Algorithm in Machine Learning
- The Perceptron is one of the earliest supervised learning algorithms. It is a binary classification algorithm used to separate data into two classes using a linear decision boundary.
- It works only when the data is linearly separable.
- A binary classifier is a model trained to assign data to one of two possible categories, represented by binary labels such as 0 or 1, true or false, or positive or negative.
Example: a binary classifier may be trained to distinguish spam from non-spam emails, or to predict whether a credit card transaction is fraudulent or legitimate.
- The Perceptron is one of the simplest artificial neural network architectures, introduced by Frank Rosenblatt in 1957.
- It was designed to take a number of binary inputs and produce one binary output (0 or 1).
- This algorithm enables the neuron to learn, processing the elements of the training set one at a time.
- Rosenblatt proposed a perceptron learning rule based on the original MCP (McCulloch-Pitts) neuron.
The basic components of a perceptron are:
1) Input Features: The perceptron takes multiple input features, each representing a characteristic of the input data.
2) Weights: Each input neuron or feature is associated with a weight, which represents the strength of the connection between the input neuron and the output neuron.
3) Bias: A bias term is added to the weighted sum to give the perceptron additional flexibility in modelling patterns in the input data.
4) Summation Function: The perceptron calculates the weighted sum of its inputs, combining them with their respective weights.
5) Activation Function: The activation function determines the output of the perceptron from the weighted sum of the inputs plus the bias term. The weighted sum is passed through a step function, which compares it to a threshold to produce a binary output (0 or 1). Common activation functions used in perceptrons include the step function, sigmoid function, and ReLU (Rectified Linear Unit) function.
6) Output: The output of the perceptron is a single binary value, either 0 or 1, which indicates the class or category to which the input data belongs.
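The components above can be sketched in code. This is a minimal illustration of the forward pass only, with weights and bias chosen by hand for the example (not learned):

```python
def step(z):
    # Step activation: output 1 when the weighted sum reaches the threshold 0.
    return 1 if z >= 0 else 0

def perceptron_output(inputs, weights, bias):
    # Summation function: weighted sum of the input features, plus the bias term.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function converts the sum into a binary output.
    return step(z)

# Illustrative two-feature example (values are assumptions for the sketch).
print(perceptron_output([1, 0], [0.6, 0.4], -0.5))  # 1  (0.6 - 0.5 >= 0)
print(perceptron_output([0, 0], [0.6, 0.4], -0.5))  # 0  (-0.5 < 0)
```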
Training Algorithm: The perceptron is trained using a supervised learning algorithm such as the perceptron learning algorithm or backpropagation. During training, the perceptron adjusts its weights and bias to minimize the error between the predicted output and the true output over a given set of training examples.
Types of Perceptron
1) Single-Layer Perceptron: limited to learning linearly separable patterns. It is effective for tasks where the data can be divided into distinct categories by a straight line, but it struggles with more complex problems where the relationship between inputs and outputs is non-linear.
2) Multi-Layer Perceptron: consists of two or more layers and has enhanced processing capabilities, making it adept at handling more complex patterns and relationships in the data. It is similar to a single-layer perceptron but adds one or more hidden layers.
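As a sketch of why extra layers help, a two-layer network with hand-picked weights (chosen for illustration, not learned) can compute XOR, which no single-layer perceptron can:

```python
def step(z):
    return 1 if z >= 0 else 0

def xor_mlp(x1, x2):
    # Hidden layer: two perceptrons computing OR and AND of the inputs.
    h_or = step(x1 + x2 - 0.5)    # fires when at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only when both inputs are 1
    # Output layer: "OR but not AND" is exactly XOR.
    return step(h_or - h_and - 0.5)

print([xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```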
Working process of the Perceptron
Step 1: Definition
A perceptron tries to learn a line (in 2D), plane (in 3D), or hyperplane (in higher dimensions) that separates two classes.
Mathematically, it computes:
y = f(w1·x1 + w2·x2 + … + wn·xn + b)
Where:
- x1, …, xn → input features
- w1, …, wn → weights
- b → bias
- f → activation function (step function)
Step 2: Steps inside the perceptron
- Multiply the inputs by their weights
- Add the bias
- Apply the step activation function
- Output 0 or 1
Step 3: Step Activation Function
It converts the weighted sum z into a binary output:
f(z) = 1 if z ≥ 0, otherwise 0
Step 4: Perceptron Learning Algorithm (Training Steps)
Given training data (x, y):
Initialize
- Set the weights and bias to small values (usually 0).
For each training example (x, y)
- Compute the output: ŷ = f(w1·x1 + … + wn·xn + b)
- Update the weights: wi ← wi + η·(y − ŷ)·xi
- Update the bias: b ← b + η·(y − ŷ)
Where:
- η = learning rate
- y = actual output
- ŷ = predicted output
Repeat until there are no errors or the maximum number of iterations is reached.
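The training steps above can be sketched as follows. Logical AND is used here as a hypothetical linearly separable dataset, and η = 1 is an assumed learning rate chosen so the arithmetic stays in whole numbers:

```python
def train_perceptron(samples, labels, eta=1, max_epochs=100):
    w = [0] * len(samples[0])   # initialize weights to 0
    b = 0                       # initialize bias to 0
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            y_hat = 1 if z >= 0 else 0                      # step activation
            if y_hat != y:
                # Perceptron learning rule: w <- w + eta*(y - y_hat)*x
                w = [wi + eta * (y - y_hat) * xi for wi, xi in zip(w, x)]
                b += eta * (y - y_hat)                      # bias update
                errors += 1
        if errors == 0:          # stop once a full pass makes no mistakes
            break
    return w, b

# Logical AND is linearly separable, so training converges.
X, Y = [[0, 0], [0, 1], [1, 0], [1, 1]], [0, 0, 0, 1]
w, b = train_perceptron(X, Y)
print([1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0 for x in X])
# [0, 0, 0, 1]
```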
Advantages
- Simple and fast
- Works well for linearly separable data
- Foundation for neural networks
Limitations
- Cannot solve non-linear problems (like XOR)
- Only binary classification
- Sensitive to the learning rate
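The XOR limitation can be demonstrated directly. Because no straight line separates XOR's 1s from its 0s, a full pass over the data always contains at least one mistake, no matter how long training runs (η = 1 is an assumed learning rate for this sketch):

```python
def epoch_errors(samples, labels, eta=1, epochs=50):
    # Run the perceptron learning rule for a fixed number of epochs and
    # return how many mistakes were made during the final epoch.
    w, b = [0, 0], 0
    errors = 0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            y_hat = 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else 0
            if y_hat != y:
                w = [w[0] + eta * (y - y_hat) * x[0],
                     w[1] + eta * (y - y_hat) * x[1]]
                b += eta * (y - y_hat)
                errors += 1
    return errors

# XOR labels: the error count never reaches zero.
print(epoch_errors([[0, 0], [0, 1], [1, 0], [1, 1]], [0, 1, 1, 0]) > 0)  # True
```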
Simple Example
Problem:
Classify whether a student passes (1) or fails (0) based on study hours.
| Study Hours (x) | Output (y) |
|---|---|
| 1 | 0 |
| 2 | 0 |
| 3 | 1 |
| 4 | 1 |
We want a line that separates fail and pass.
Assume:
- Initial weight w = 0
- Bias b = 0
- Learning rate η = 1 (chosen for simplicity)
Iteration 1 (x = 1, y = 0)
z = w·x + b = 0·1 + 0 = 0
Predicted ŷ = 1 (since z ≥ 0)
Error = y − ŷ = 0 − 1 = −1
Update:
w ← w + η·(y − ŷ)·x = 0 + 1·(−1)·1 = −1
b ← b + η·(y − ŷ) = 0 + 1·(−1) = −1
After several iterations, the weights adjust and the model learns a boundary like:
Example final model: ŷ = 1 if x ≥ 2.5, otherwise 0 (e.g. w = 1, b = −2.5)
Meaning:
- If study hours ≥ 2.5 → Pass
- If study hours < 2.5 → Fail
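The study-hours example can be run end to end with a small script. η = 1 is an assumed learning rate; the exact threshold the algorithm lands on may differ from 2.5 (any cut between 2 and 3 classifies this data the same way):

```python
def train(xs, ys, eta=1, max_epochs=100):
    w, b = 0, 0                              # small initial values (here 0)
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(xs, ys):
            y_hat = 1 if w * x + b >= 0 else 0   # step activation
            if y_hat != y:
                w += eta * (y - y_hat) * x       # weight update rule
                b += eta * (y - y_hat)           # bias update rule
                mistakes += 1
        if mistakes == 0:                        # converged
            break
    return w, b

# Study hours -> pass/fail data from the table above.
w, b = train([1, 2, 3, 4], [0, 0, 1, 1])
print([1 if w * x + b >= 0 else 0 for x in [1, 2, 3, 4]])  # [0, 0, 1, 1]
```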