4.3 Agglomerative Algorithm or Agglomerative Clustering

- March 17, 2026

Agglomerative Clustering:-

Agglomerative clustering is a type of hierarchical clustering that follows a bottom-up approach.

Start with each data point as its own cluster
Gradually merge the closest clusters
Continue until all points form one cluster (or the desired number of clusters is reached)

Agglomerative clustering is a hierarchical unsupervised learning algorithm that builds clusters by iteratively merging the closest data points or clusters.

How the Agglomerative Algorithm Works (Step-by-Step):-

Step 1: Start

Each data point is considered as a separate cluster.

Example:
Points → A, B, C, D
Clusters → {A}, {B}, {C}, {D}

Step 2: Calculate Distance

Compute the distance between every pair of clusters. (Euclidean distance is the usual choice.)
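As a minimal sketch of this step, the snippet below computes the Euclidean distance between every pair of single-point clusters and picks the closest pair. The coordinates and the helper name pairwise_distances are assumptions made for illustration only.

```python
import math
from itertools import combinations

# Hypothetical 2-D coordinates for the four points (illustration only).
points = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (5.0, 0.0), "D": (5.0, 2.0)}

def pairwise_distances(pts):
    """Return {(name1, name2): Euclidean distance} for every unordered pair."""
    return {
        (p, q): math.dist(pts[p], pts[q])
        for p, q in combinations(sorted(pts), 2)
    }

distances = pairwise_distances(points)

# The pair with the smallest distance is the one merged in Step 3.
closest_pair = min(distances, key=distances.get)
print(closest_pair, distances[closest_pair])
```

With these assumed coordinates, A and B are the closest pair, so they would be merged first.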

Step 3: Merge Closest Clusters

Combine the two clusters that are closest.

Example: If A and B are closest → merge → {A, B}

Step 4: Update Distances

Recalculate the distance between the newly merged cluster and each remaining cluster.
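One way to picture the update, assuming single linkage: the distance from the merged cluster to any other cluster is the smaller of the two old distances. The function name update_after_merge and all distance values below are hypothetical.

```python
def update_after_merge(dist, a, b, merged):
    """Single-linkage update: replace clusters a and b with `merged`.

    `dist` maps frozenset({x, y}) -> distance between clusters x and y.
    """
    others = {c for pair in dist for c in pair} - {a, b}
    # Keep every distance that does not involve a or b.
    new_dist = {
        pair: d for pair, d in dist.items() if a not in pair and b not in pair
    }
    # Distance from the merged cluster = the smaller of the two old distances.
    for c in others:
        new_dist[frozenset({merged, c})] = min(
            dist[frozenset({a, c})], dist[frozenset({b, c})]
        )
    return new_dist

# Hypothetical distances between four single-point clusters.
dist = {
    frozenset({"A", "B"}): 1.0,
    frozenset({"A", "C"}): 5.0,
    frozenset({"A", "D"}): 7.0,
    frozenset({"B", "C"}): 4.0,
    frozenset({"B", "D"}): 6.0,
    frozenset({"C", "D"}): 2.0,
}
dist = update_after_merge(dist, "A", "B", "AB")
print(dist[frozenset({"AB", "C"})], dist[frozenset({"AB", "D"})])
```

Other linkage methods would replace min with a different rule, for example max for complete linkage.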

Step 5: Repeat

Keep merging until:

Only one cluster remains, or
Required number of clusters is reached
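The five steps above can be sketched as a short loop. This is a simplified illustration, not an optimized implementation: it assumes single linkage and Euclidean distance, recomputes all distances in every round, and uses made-up coordinates. The names agglomerative and single_linkage are hypothetical.

```python
import math
from itertools import combinations

def single_linkage(c1, c2, pts):
    """Distance between two clusters = distance of their closest pair of points."""
    return min(math.dist(pts[p], pts[q]) for p in c1 for q in c2)

def agglomerative(pts, n_clusters=1):
    """Merge the closest clusters until `n_clusters` remain.

    Returns (final clusters, merge history as (cluster1, cluster2, distance)).
    """
    clusters = [frozenset([name]) for name in sorted(pts)]   # Step 1
    history = []
    while len(clusters) > n_clusters:                         # Step 5
        # Steps 2 and 4: distances are recomputed from scratch each round.
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: single_linkage(pair[0], pair[1], pts),
        )
        d = single_linkage(c1, c2, pts)
        # Step 3: merge the closest pair into one cluster.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        history.append((c1, c2, d))
    return clusters, history

# Hypothetical coordinates (illustration only).
points = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (5.0, 0.0), "D": (5.0, 2.0)}
clusters, history = agglomerative(points, n_clusters=2)
print(sorted(sorted(c) for c in clusters))
```

Calling agglomerative(points) with the default n_clusters=1 keeps merging until a single cluster remains, and the returned history records every merge along the way.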

Example

Let’s take 4 data points: A, B, C and D.

Step 1:
Clusters → {A}, {B}, {C}, {D}

Step 2: Find closest points

A and B → distance = 1 (smallest)

Merge → {A, B}

Step 3:
Clusters → {A, B}, {C}, {D}

C and D → distance = 2 (smallest remaining)

Merge → {C, D}

Step 4:
Clusters → {A, B}, {C, D}

Now only 2 clusters remain. If 2 clusters are required, we stop here.

Final Clusters:

Cluster 1 → {A, B}
Cluster 2 → {C, D}
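The worked example can be replayed from a distance table. Only d(A, B) = 1 and d(C, D) = 2 come from the example; the other four distances are assumed values chosen to be larger, and single linkage is also an assumption, since the example does not name a linkage method.

```python
from itertools import combinations

# Only d(A,B)=1 and d(C,D)=2 come from the example; the rest are assumed.
d = {
    ("A", "B"): 1, ("C", "D"): 2,
    ("A", "C"): 6, ("A", "D"): 7, ("B", "C"): 5, ("B", "D"): 6,
}

def point_distance(p, q):
    """Look up a distance regardless of the order of the two points."""
    return d[(p, q)] if (p, q) in d else d[(q, p)]

def cluster_distance(c1, c2):
    # Single linkage (an assumption): distance of the closest pair of points.
    return min(point_distance(p, q) for p in c1 for q in c2)

clusters = [("A",), ("B",), ("C",), ("D",)]
trace = [list(clusters)]
while len(clusters) > 2:
    # Find and merge the two closest clusters.
    c1, c2 = min(combinations(clusters, 2), key=lambda pr: cluster_distance(*pr))
    clusters = [c for c in clusters if c not in (c1, c2)] + [tuple(sorted(c1 + c2))]
    clusters.sort()
    trace.append(list(clusters))

for state in trace:
    print("Clusters →", state)
```

The printed states go from four single-point clusters to {A, B}, {C}, {D} and then to {A, B}, {C, D}, matching the example.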

Linkage Methods

When merging clusters, we need to define how distance is calculated between clusters.

1. Single Linkage

Distance = minimum distance between a point in one cluster and a point in the other (the closest pair)

2. Complete Linkage

Distance = maximum distance between a point in one cluster and a point in the other (the farthest pair)

3. Average Linkage

Distance = average of the distances over all pairs of points, one from each cluster
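The three linkage methods can be compared on the same pair of clusters. This is a small sketch assuming Euclidean distance between points; the two clusters and the function names are made up for illustration.

```python
import math

def all_pair_distances(c1, c2):
    """Distances between every point of c1 and every point of c2."""
    return [math.dist(p, q) for p in c1 for q in c2]

def single_linkage(c1, c2):
    return min(all_pair_distances(c1, c2))      # closest pair

def complete_linkage(c1, c2):
    return max(all_pair_distances(c1, c2))      # farthest pair

def average_linkage(c1, c2):
    dists = all_pair_distances(c1, c2)
    return sum(dists) / len(dists)              # mean over all pairs

# Hypothetical clusters of points on a line.
left = [(0.0,), (1.0,)]
right = [(4.0,), (6.0,)]

print(single_linkage(left, right))    # 3.0
print(complete_linkage(left, right))  # 6.0
print(average_linkage(left, right))   # 4.5
```

The same two clusters are at distance 3, 6 or 4.5 depending on the linkage, which is why the choice of linkage can change which clusters get merged next.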

Dendrogram (Tree Diagram)

Agglomerative clustering is often shown using a dendrogram.

It is a tree-like diagram
Shows how clusters are merged step by step
Cutting the tree at a chosen height gives the clusters formed up to that merge distance
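Cutting a dendrogram can be sketched without drawing it: replay the merges in order and stop at the first merge whose distance exceeds the cut height. The merge history below is hypothetical, and the early stop assumes merge distances never decrease from one merge to the next, which holds for single, complete and average linkage.

```python
def cut_dendrogram(labels, merges, height):
    """Return the clusters obtained by cutting the tree at `height`.

    `merges` is a list of (cluster1, cluster2, merge_distance) tuples in the
    order the merges happened; only merges at or below `height` are applied.
    """
    clusters = [frozenset([label]) for label in labels]
    for c1, c2, dist in merges:
        if dist > height:
            break   # valid when merge distances are non-decreasing
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return sorted(sorted(c) for c in clusters)

# Hypothetical merge history for four points.
merges = [
    (frozenset("A"), frozenset("B"), 1.0),
    (frozenset("C"), frozenset("D"), 2.0),
    (frozenset("AB"), frozenset("CD"), 4.0),
]

print(cut_dendrogram("ABCD", merges, height=1.5))  # [['A', 'B'], ['C'], ['D']]
print(cut_dendrogram("ABCD", merges, height=3.0))  # [['A', 'B'], ['C', 'D']]
```

A lower cut gives more, smaller clusters; a higher cut gives fewer, larger ones.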

Advantages

No need to choose the number of clusters in advance
Easy to understand
Works well for small datasets

Disadvantages

Slow for large datasets (the distance between every pair of points must be computed)
Cannot undo merging (once merged, always merged)
Sensitive to noise
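A rough way to see why large datasets are slow is to count the pairwise distances needed at the start, which is n(n - 1)/2 for n points. The helper name n_pairs is hypothetical.

```python
import math

def n_pairs(n):
    """Number of distinct pairs among n points: n * (n - 1) / 2."""
    return math.comb(n, 2)

print(n_pairs(4))       # 6
print(n_pairs(10_000))  # 49995000
```

Four points need only 6 distances, but 10,000 points already need almost 50 million, before any merging has started.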

Applications

Document clustering
Image segmentation
Gene analysis
Customer grouping
