4.4 Divisive Clustering in Unsupervised Learning
Divisive clustering is a type of hierarchical clustering that follows a top-down approach.
- Start with all data points in one single cluster.
- Gradually split the cluster into smaller clusters.
- Continue splitting until each point is in its own cluster or the required number of clusters is formed.
In other words, it is an unsupervised learning technique that starts with one large cluster and recursively splits it, step by step, into smaller clusters based on dissimilarity.
How Divisive Clustering Works (Step-by-Step)
Step 1: Start
- Put all data points into one cluster.
Example:
Points → A, B, C, D
Cluster → {A, B, C, D}
Step 2: Find Differences
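Identify the points that are most dissimilar from the rest of the cluster, for example by comparing the distances between points.
Step 3: Split the Cluster
Divide the cluster into two groups so that dissimilar points end up in different groups.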
Step 4: Repeat
Keep splitting clusters until:
- Desired number of clusters is reached, or
- Each data point becomes its own cluster
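The four steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the text does not say how a split is chosen, so the code assumes a DIANA-style "splinter" heuristic. The function names (split_cluster, divisive_clustering), the absolute-difference distance and the marks are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def avg_dist(p, group, dist):
    """Average dissimilarity between p and the other members of group."""
    others = [q for q in group if q != p]
    if not others:
        return 0.0
    return sum(dist(p, q) for q in others) / len(others)


def diameter(cluster, dist):
    """Largest pairwise dissimilarity inside a cluster (0 for a single point)."""
    return max((dist(p, q) for p, q in combinations(cluster, 2)), default=0.0)


def split_cluster(cluster, dist):
    """Split one cluster into two groups (assumed splinter heuristic)."""
    # Step 2: the point most different from the others starts a new group.
    seed = max(cluster, key=lambda p: avg_dist(p, cluster, dist))
    splinter = [seed]
    rest = [p for p in cluster if p != seed]
    # Step 3: move every point that is closer to the new group than to the rest.
    moved = True
    while moved and len(rest) > 1:
        moved = False
        for p in list(rest):
            if len(rest) == 1:
                break  # never empty the remaining group
            d_rest = avg_dist(p, rest, dist)
            d_splinter = sum(dist(p, q) for q in splinter) / len(splinter)
            if d_splinter < d_rest:
                rest.remove(p)
                splinter.append(p)
                moved = True
    return rest, splinter


def divisive_clustering(names, dist, k):
    """Split top-down until k clusters exist or every point is alone."""
    clusters = [list(names)]  # Step 1: one cluster holding everything
    while len(clusters) < k:  # Step 4: repeat
        candidates = [c for c in clusters if len(c) > 1]
        if not candidates:
            break  # every point is already its own cluster
        # Assumption: split the cluster with the largest diameter next.
        target = max(candidates, key=lambda c: diameter(c, dist))
        clusters.remove(target)
        clusters.extend(split_cluster(target, dist))
    return clusters


# Hypothetical data: one numeric value per point.
marks = {"A": 90, "B": 85, "C": 40, "D": 35}


def mark_distance(p, q):
    return abs(marks[p] - marks[q])


two_clusters = divisive_clustering(list(marks), mark_distance, k=2)
print(two_clusters)
```

Passing the distance as a function keeps the splitting logic independent of the data type; any dissimilarity measure defined on pairs of points can be used in place of mark_distance.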
Example
Let’s take the marks of four students, A, B, C and D:
Step 1:
All in one cluster → {A, B, C, D}
Step 2: Split based on similarity
- High marks → A, B
- Low marks → C, D
Clusters → {A, B} and {C, D}
Step 3: Further Split
Cluster {A, B} → {A}, {B}
Cluster {C, D} → {C}, {D}
Dividing each of these groups once more leaves every data point in its own cluster.
Final Clusters:
- {A}, {B}, {C}, {D}
(or stop after Step 2 if only 2 clusters are needed)
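The same walk-through can be reproduced in a few lines of Python. The scores below are hypothetical, since the example names the students but gives no marks. Splitting at the largest gap between sorted marks is one simple rule assumed here for one-dimensional data, not the only way to choose a split; split_at_largest_gap and divide are illustrative names.

```python
# Hypothetical marks: the example above does not give actual scores.
marks = {"A": 92, "B": 88, "C": 45, "D": 41}


def split_at_largest_gap(cluster):
    """Split a cluster where two neighbouring sorted marks differ the most."""
    ordered = sorted(cluster, key=marks.get, reverse=True)
    gaps = [marks[ordered[i]] - marks[ordered[i + 1]]
            for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


def divide(cluster, depth=0, log=None):
    """Recursively split until each student is alone, recording every level."""
    if log is None:
        log = []
    log.append((depth, sorted(cluster)))
    if len(cluster) > 1:
        high, low = split_at_largest_gap(cluster)
        divide(high, depth + 1, log)
        divide(low, depth + 1, log)
    return log


hierarchy = divide(list(marks))
for depth, members in hierarchy:
    # Indentation shows how deep in the hierarchy each cluster sits.
    print("  " * depth + "{" + ", ".join(members) + "}")
```

Reading the printed tree from the top gives the one-cluster, two-cluster and four-cluster stages in turn. Stopping at depth 1 corresponds to the case where only 2 clusters are needed.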
Advantages
- Gives a clear top-down structure
- Useful when a large cluster needs to be divided
Disadvantages
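- More complex to implement than agglomerative clustering
- Computationally expensive, because the number of possible splits grows rapidly with cluster size
- Less commonly used in practice than agglomerative clustering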
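To see why cost is a concern, consider checking every possible split of a single cluster. A cluster of n distinct points can be divided into two non-empty groups in 2^(n-1) - 1 ways, which is why practical divisive methods rely on heuristics rather than exhaustive search. The sketch below enumerates the splits for a small cluster and compares the count with that formula; the helper names are hypothetical.

```python
from itertools import combinations


def enumerate_splits(points):
    """List every way to divide points into two non-empty groups."""
    first, rest = points[0], points[1:]
    splits = []
    # Fix the first point in group one to avoid counting mirrored splits twice.
    # r is how many other points join it; r < len(rest) keeps group two non-empty.
    for r in range(len(rest)):
        for chosen in combinations(rest, r):
            group_one = [first, *chosen]
            group_two = [p for p in rest if p not in chosen]
            splits.append((group_one, group_two))
    return splits


def count_two_way_splits(n):
    """Closed-form count of two-way splits of n distinct points."""
    return 2 ** (n - 1) - 1


four_point_splits = enumerate_splits(["A", "B", "C", "D"])
print(len(four_point_splits), count_two_way_splits(4))

# The count roughly doubles with every extra point.
for n in (10, 20, 50):
    print(n, count_two_way_splits(n))
```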
Applications
- Document classification
- Image segmentation
- Biological data analysis