2.2 Perceptron Algorithm in Machine Learning
- The Perceptron is one of the earliest supervised learning algorithms. It is a binary classification algorithm used to separate data into two classes using a linear decision boundary.
- It works only when the data is linearly separable.
- A binary classifier is a model trained to assign data to one of two possible categories, represented by binary labels such as 0 or 1, true or false, or positive or negative.
Example: a binary classifier may be trained to distinguish spam from non-spam emails, or to predict whether a credit card transaction is fraudulent or legitimate.
- The Perceptron is one of the simplest artificial neural network architectures, introduced by Frank Rosenblatt in 1957.
- It was designed to take a number of binary inputs and produce one binary output (0 or 1).
- This algorithm enables the neuron to learn, processing the elements of the training set one at a time.
- Rosenblatt proposed a perceptron learning rule based on the original MCP (McCulloch-Pitts) neuron.
The basic components of a perceptron are:
1) Input Features: The perceptron takes multiple input features, each representing a characteristic of the input data.
2) Weights: Each input neuron or feature is associated with a weight, which represents the strength of the connection between the input neuron and the output neuron.
3) Bias: A bias term is added to the weighted sum to give the perceptron additional flexibility in modelling patterns in the input data.
4) Summation Function: The perceptron calculates the weighted sum of its inputs, combining them with their respective weights.
5) Activation Function: The activation function determines the output of the perceptron from the weighted sum of the inputs plus the bias term. The weighted sum is passed through a step function, which compares it to a threshold to produce a binary output (0 or 1). Common activation functions used in perceptrons include the step function, sigmoid function, and ReLU (Rectified Linear Unit) function.
6) Output: The output of the perceptron is a single binary value, either 0 or 1, which indicates the class or category to which the input data belongs.
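The components above can be sketched in code. This is a minimal illustration of the forward pass only, with weights and bias chosen by hand for the example (not learned):

```python
def step(z):
    # Step activation: output 1 when the weighted sum reaches the threshold 0.
    return 1 if z >= 0 else 0

def perceptron_output(inputs, weights, bias):
    # Summation function: weighted sum of the input features, plus the bias term.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # Activation function converts the sum into a binary output.
    return step(z)

# Illustrative two-feature example (values are assumptions for the sketch).
print(perceptron_output([1, 0], [0.6, 0.4], -0.5))  # 1  (0.6 - 0.5 >= 0)
print(perceptron_output([0, 0], [0.6, 0.4], -0.5))  # 0  (-0.5 < 0)
```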
Training Algorithm: The perceptron is trained using a supervised learning algorithm such as the perceptron learning algorithm or backpropagation. During training, the perceptron adjusts its weights and bias to minimize the error between the predicted output and the true output over a given set of training examples.
Types of Perceptron
1) Single-Layer Perceptron: limited to learning linearly separable patterns. It is effective for tasks where the data can be divided into distinct categories by a straight line, but it struggles with more complex problems where the relationship between inputs and outputs is non-linear.
2) Multi-Layer Perceptron: consists of two or more layers and has enhanced processing capabilities, making it adept at handling more complex patterns and relationships in the data. It is similar to a single-layer perceptron but adds one or more hidden layers.
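As a sketch of why extra layers help, a two-layer network with hand-picked weights (chosen for illustration, not learned) can compute XOR, which no single-layer perceptron can:

```python
def step(z):
    return 1 if z >= 0 else 0

def xor_mlp(x1, x2):
    # Hidden layer: two perceptrons computing OR and AND of the inputs.
    h_or = step(x1 + x2 - 0.5)    # fires when at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only when both inputs are 1
    # Output layer: "OR but not AND" is exactly XOR.
    return step(h_or - h_and - 0.5)

print([xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```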
Working process of the Perceptron
Step 1: Definition
A perceptron tries to learn a line (in 2D), plane (in 3D), or hyperplane (in higher dimensions) that separates two classes.
Mathematically, it computes:
y = f(w1·x1 + w2·x2 + … + wn·xn + b)
Where:
- x1, …, xn → input features
- w1, …, wn → weights
- b → bias
- f → activation function (step function)
Step 2: Steps inside the perceptron
- Multiply the inputs by their weights
- Add the bias
- Apply the step activation function
- Output 0 or 1
Step 3: Step Activation Function
It converts the weighted sum z into a binary output:
f(z) = 1 if z ≥ 0, otherwise 0
Step 4: Perceptron Learning Algorithm (Training Steps)
Given training data (x, y):
Initialize
- Set the weights and bias to small values (usually 0).
For each training example (x, y)
- Compute the output: ŷ = f(w1·x1 + … + wn·xn + b)
- Update the weights: wi ← wi + η·(y − ŷ)·xi
- Update the bias: b ← b + η·(y − ŷ)
Where:
- η = learning rate
- y = actual output
- ŷ = predicted output
Repeat until there are no errors or the maximum number of iterations is reached.
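The training steps above can be sketched as follows. Logical AND is used here as a hypothetical linearly separable dataset, and η = 1 is an assumed learning rate chosen so the arithmetic stays in whole numbers:

```python
def train_perceptron(samples, labels, eta=1, max_epochs=100):
    w = [0] * len(samples[0])   # initialize weights to 0
    b = 0                       # initialize bias to 0
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(samples, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            y_hat = 1 if z >= 0 else 0                      # step activation
            if y_hat != y:
                # Perceptron learning rule: w <- w + eta*(y - y_hat)*x
                w = [wi + eta * (y - y_hat) * xi for wi, xi in zip(w, x)]
                b += eta * (y - y_hat)                      # bias update
                errors += 1
        if errors == 0:          # stop once a full pass makes no mistakes
            break
    return w, b

# Logical AND is linearly separable, so training converges.
X, Y = [[0, 0], [0, 1], [1, 0], [1, 1]], [0, 0, 0, 1]
w, b = train_perceptron(X, Y)
print([1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0 for x in X])
# [0, 0, 0, 1]
```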
Advantages
- Simple and fast
- Works well for linearly separable data
- Foundation for neural networks
Limitations
- Cannot solve non-linear problems (like XOR)
- Only binary classification
- Sensitive to the learning rate
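The XOR limitation can be demonstrated directly. Because no straight line separates XOR's 1s from its 0s, a full pass over the data always contains at least one mistake, no matter how long training runs (η = 1 is an assumed learning rate for this sketch):

```python
def epoch_errors(samples, labels, eta=1, epochs=50):
    # Run the perceptron learning rule for a fixed number of epochs and
    # return how many mistakes were made during the final epoch.
    w, b = [0, 0], 0
    errors = 0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            y_hat = 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else 0
            if y_hat != y:
                w = [w[0] + eta * (y - y_hat) * x[0],
                     w[1] + eta * (y - y_hat) * x[1]]
                b += eta * (y - y_hat)
                errors += 1
    return errors

# XOR labels: the error count never reaches zero.
print(epoch_errors([[0, 0], [0, 1], [1, 0], [1, 1]], [0, 1, 1, 0]) > 0)  # True
```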
Simple Example
Problem:
Classify whether a student passes (1) or fails (0) based on study hours.
| Study Hours (x) | Output (y) |
|---|---|
| 1 | 0 |
| 2 | 0 |
| 3 | 1 |
| 4 | 1 |
We want a line that separates fail and pass.
Assume:
- Initial weight w = 0
- Bias b = 0
- Learning rate η = 1 (chosen for simplicity)
Iteration 1 (x = 1, y = 0)
z = w·x + b = 0·1 + 0 = 0
Predicted ŷ = 1 (since z ≥ 0)
Error = y − ŷ = 0 − 1 = −1
Update:
w ← w + η·(y − ŷ)·x = 0 + 1·(−1)·1 = −1
b ← b + η·(y − ŷ) = 0 + 1·(−1) = −1
After several iterations, the weights adjust and the model learns a boundary like:
Example final model: ŷ = 1 if x ≥ 2.5, otherwise 0 (e.g. w = 1, b = −2.5)
Meaning:
- If study hours ≥ 2.5 → Pass
- If study hours < 2.5 → Fail
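The study-hours example can be run end to end with a small script. η = 1 is an assumed learning rate; the exact threshold the algorithm lands on may differ from 2.5 (any cut between 2 and 3 classifies this data the same way):

```python
def train(xs, ys, eta=1, max_epochs=100):
    w, b = 0, 0                              # small initial values (here 0)
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in zip(xs, ys):
            y_hat = 1 if w * x + b >= 0 else 0   # step activation
            if y_hat != y:
                w += eta * (y - y_hat) * x       # weight update rule
                b += eta * (y - y_hat)           # bias update rule
                mistakes += 1
        if mistakes == 0:                        # converged
            break
    return w, b

# Study hours -> pass/fail data from the table above.
w, b = train([1, 2, 3, 4], [0, 0, 1, 1])
print([1 if w * x + b >= 0 else 0 for x in [1, 2, 3, 4]])  # [0, 0, 1, 1]
```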