Hierarchical Clustering
Learn all about hierarchical clustering and how to cluster data with it using scikit-learn.
Hierarchical clustering is another popular unsupervised clustering algorithm that groups data points into clusters based on similarity. It works by building a hierarchy of clusters, starting with individual data points and gradually merging them into larger clusters.
There are two types of hierarchical clustering: agglomerative, which works bottom-up by merging clusters, and divisive, which works top-down by splitting them.
Agglomerative clustering
Agglomerative clustering is a hierarchical clustering algorithm that groups data points based on their pairwise distances or similarities. Unlike k-means, agglomerative clustering doesn't require specifying the number of clusters in advance. Instead, it builds a hierarchy of clusters by iteratively merging the most similar or nearby data points or clusters.
The algorithm starts by considering each data point as a separate cluster. It then repeatedly merges the two closest clusters based on a chosen linkage criterion, which determines the distance or similarity between clusters. The most commonly used linkage criteria are as follows:
Ward: This minimizes the increase in total within-cluster variance caused by merging two clusters.
Complete: This uses the maximum distance between points in the two clusters being merged, so the pair of clusters whose farthest points are closest gets merged.
Average: This uses the average distance between all pairs of points in the two clusters being merged.
The choice of linkage criterion can have a significant impact on the clustering results, as it affects the shape and structure of the clusters.
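To see this in practice, here is a minimal sketch that fits scikit-learn's `AgglomerativeClustering` with each of the three linkage criteria above; the synthetic dataset, cluster count, and parameter values are illustrative assumptions, not part of the lesson:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Illustrative dataset: three well-separated blobs (values are assumptions).
X, _ = make_blobs(n_samples=150, centers=3, cluster_std=0.8, random_state=42)

# Fit the same data with each linkage criterion and compare the results.
for linkage in ["ward", "complete", "average"]:
    model = AgglomerativeClustering(n_clusters=3, linkage=linkage)
    labels = model.fit_predict(X)
    # Count how many points landed in each cluster.
    sizes = np.bincount(labels)
    print(f"{linkage:>8} linkage -> cluster sizes: {sizes}")
```

On well-separated blobs like these, all three criteria tend to agree; on elongated, noisy, or unevenly sized clusters, their outputs can diverge noticeably.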
Agglomerative clustering continues merging clusters until all data points are grouped into a single cluster or until a stopping criterion is met, typically a specified number of clusters or a distance threshold beyond which clusters are no longer merged. The resulting hierarchy of clusters can be represented as a dendrogram, which illustrates the merging process and allows clusters to be read off at different levels of granularity.
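The sketch below illustrates both ideas, assuming scikit-learn and SciPy are available: a distance threshold as the stopping criterion (instead of a preset cluster count), and a SciPy dendrogram of the full merge hierarchy. The threshold value and dataset are illustrative assumptions:

```python
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Small illustrative dataset (sizes and parameters are assumptions).
X, _ = make_blobs(n_samples=30, centers=3, cluster_std=0.8, random_state=42)

# Stop merging by distance rather than by a preset cluster count:
# n_clusters must be None when distance_threshold is given.
model = AgglomerativeClustering(n_clusters=None, distance_threshold=10.0)
labels = model.fit_predict(X)
print("Clusters found:", model.n_clusters_)

# Visualize the full merge hierarchy as a dendrogram with SciPy.
# Each row of Z records one merge: the two clusters joined, the
# merge distance, and the size of the newly formed cluster.
Z = linkage(X, method="ward")
dendrogram(Z)
plt.xlabel("Data point index")
plt.ylabel("Merge distance")
plt.show()
```

Cutting the dendrogram at a given height is equivalent to choosing that height as the distance threshold, which is how a single hierarchy yields clusterings at many levels of granularity.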
Agglomerative clustering offers several advantages. It can handle clusters of different sizes, shapes, and densities, and it captures the nested, hierarchical structure of the data. It can be applied to many types of data and distance or similarity metrics. Additionally, it provides a natural way to explore the data at different levels of granularity, allowing for the identification of both fine-grained and coarse-grained clusters.
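As a sketch of that metric flexibility, agglomerative clustering can run directly on a precomputed distance matrix. The matrix values below are made up for illustration, and the `metric` parameter name assumes scikit-learn 1.2 or later (older releases call it `affinity`). Ward linkage requires Euclidean feature vectors, so a non-Ward linkage is used here:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# A hypothetical precomputed pairwise distance matrix (values are made up).
D = np.array([
    [0.0, 0.2, 0.9, 1.0],
    [0.2, 0.0, 0.8, 0.9],
    [0.9, 0.8, 0.0, 0.3],
    [1.0, 0.9, 0.3, 0.0],
])

# metric="precomputed" lets the algorithm work with any distance you can
# compute yourself; ward is Euclidean-only, so average linkage is used.
model = AgglomerativeClustering(
    n_clusters=2, metric="precomputed", linkage="average"
)
print(model.fit_predict(D))  # e.g., [0 0 1 1]: the two nearby pairs merge
```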