DBSCAN vs. HDBSCAN. Which is Better?

HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise) is an extension of
the DBSCAN algorithm. Three main parameters (min_cluster_size, min_samples, and cluster_selection_epsilon)
control its clustering process.
This article describes how min_cluster_size, min_samples, and cluster_selection_epsilon differ
and how each one affects the clusters HDBSCAN produces.
min_cluster_size
The min_cluster_size parameter sets the minimum number of points required to form a cluster. It filters out
small clusters and noise.
Choose min_cluster_size based on the input dataset and the granularity of clustering you want. A higher
value reduces the number of clusters: groups smaller than the threshold are either merged into a neighboring cluster or labeled as noise.
To capture fine-grained clusters, keep this value low, since clusters with few points can also be important.
min_samples
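The min_samples parameter comes from DBSCAN, where it is often written minPts. It defines the number of data points required to form core points (dense regions).
In DBSCAN, a core point has at least min_samples data points within an eps radius. HDBSCAN has no fixed eps; it uses min_samples as the k in the distance to a point's k-th nearest neighbor, so larger values give a more conservative density estimate.
By default, the value of min_samples is the same as min_cluster_size in HDBSCAN. Changing this value has a significant effect on clustering: higher values restrict clusters to denser regions and label more points as noise.
Because of this default, raising min_cluster_size also makes the clustering more conservative. Setting a lower min_samples explicitly can help recover clusters that would otherwise be lost when min_cluster_size is high.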
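To make the idea of a core point concrete, here is a minimal pure-Python sketch. The function names (neighbors_within, is_core_point, core_distance) are hypothetical helpers written for this illustration, not part of any library. It assumes the point itself counts toward the neighbor total, which is one common convention; implementations differ on this detail.

```python
import math

def neighbors_within(points, index, eps):
    """Count points within distance eps of points[index], including the point itself."""
    center = points[index]
    return sum(1 for p in points if math.dist(center, p) <= eps)

def is_core_point(points, index, eps, min_samples):
    """DBSCAN-style test: enough neighbors inside the eps radius."""
    return neighbors_within(points, index, eps) >= min_samples

def core_distance(points, index, k):
    """Distance to the k-th nearest neighbor, counting the point itself as the first."""
    distances = sorted(math.dist(points[index], p) for p in points)
    return distances[k - 1]

# Four points on a unit square plus one far-away outlier (illustrative data).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]

print(is_core_point(points, 0, eps=1.5, min_samples=4))  # corner of the square
print(is_core_point(points, 4, eps=1.5, min_samples=4))  # the outlier
print(core_distance(points, 0, k=3), core_distance(points, 4, k=3))
```

The corner point has a small core distance and the outlier a large one. Raising k (that is, min_samples) increases every core distance, which is why larger values push sparse points toward the noise label.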
cluster_selection_epsilon
cluster_selection_epsilon is a distance threshold: clusters that would only separate at a distance below this value are kept together rather than split further. In this respect it plays a role similar to
the eps value in DBSCAN.
Setting cluster_selection_epsilon keeps a cluster intact up to that threshold, which prevents dense regions from being split into many small clusters. The default of 0.0 leaves the standard HDBSCAN selection unchanged.