Advanced Mathematical Statistics

I worked with two clustering algorithms: K-means and DBSCAN. The goal of this analysis was to group similar data points together based on their geographical coordinates (latitude and longitude). K-means needs the number of clusters to be chosen in advance; 3 clusters were used here, a choice that should be guided by the nature of the data and the specific goals of the analysis. The visualization of the results helps in understanding how well each algorithm grouped the data points into distinct clusters.

K-means Clustering:

K-means grouped the data into 3 clusters, assigning each data point to exactly one of them.
K-means works best when clusters are roughly spherical and similar in size, which might not be the case for geographical data. Its effectiveness in this context therefore depends on how the data points are actually distributed.
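
To make the assign-and-update loop behind K-means concrete, here is a minimal from-scratch sketch. It is not the code used in the analysis: the coordinates are made up, the function name `kmeans` is illustrative, distances are plain Euclidean distances on raw degrees, and the starting centroids are picked with a deterministic farthest-first rule so the result is reproducible (library implementations such as scikit-learn's `KMeans` use a randomized initialization by default).

```python
import math

def kmeans(points, k, n_iter=100):
    """Minimal K-means (Lloyd's algorithm) on (lat, lon) tuples."""
    # Assumption: deterministic farthest-first initialisation, chosen here
    # only so the example gives the same result on every run.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )

    labels = None
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the loop has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical (latitude, longitude) pairs forming three tight groups.
points = [
    (40.71, -74.00), (40.73, -73.99), (40.70, -74.02),
    (34.05, -118.24), (34.07, -118.26), (34.03, -118.22),
    (41.88, -87.63), (41.90, -87.65), (41.86, -87.61),
]
labels, centroids = kmeans(points, k=3)
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Every point receives a label, even one far from any group, which is the main practical difference from DBSCAN's noise label.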

DBSCAN Clustering:

DBSCAN identified clusters as dense regions of data points. Outliers and sparse points that don’t belong to any dense region were labeled -1 (noise).
DBSCAN doesn’t require specifying the number of clusters beforehand; it is controlled instead by a neighborhood radius and a minimum number of points per neighborhood. This makes it more flexible when clusters have irregular shapes, although a single radius can struggle when cluster densities differ widely.
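
The density-based idea can be sketched in a few lines as well. As before, this is an illustrative from-scratch version rather than the code used in the analysis: the function name `dbscan`, the coordinates, and the values `eps=0.1` and `min_samples=3` are assumptions picked for the example, and distances are plain Euclidean distances on raw degrees. A point counts itself among its neighbors, which matches the convention scikit-learn's `DBSCAN` uses for `min_samples`.

```python
import math

NOISE = -1

def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: returns one cluster id per point, or -1 for noise."""
    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        cluster += 1           # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)   # j is also a core point, keep expanding
    return labels

# Hypothetical (latitude, longitude) pairs: three tight groups and one stray point.
points = [
    (40.71, -74.00), (40.73, -73.99), (40.70, -74.02),
    (34.05, -118.24), (34.07, -118.26), (34.03, -118.22),
    (41.88, -87.63), (41.90, -87.65), (41.86, -87.61),
    (25.00, -150.00),
]
labels = dbscan(points, eps=0.1, min_samples=3)
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2, -1]
```

The number of clusters (3) is never passed in; it falls out of the chosen radius and minimum neighborhood size, and the isolated point is left as noise instead of being forced into a cluster.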
