Publication Title : Mean Shift Algorithm for Machine Learning: A Modification
Published By : Chinmay Bepery
Publication Date : 2022-02-11
Publication Online Link :
Publication Description :
Clustering is a machine learning approach used to group similar data points. Mean shift is a fixed-window clustering algorithm that can determine the number of clusters automatically but does not guarantee convergence. Its primary drawback is that a stopping criterion (threshold) must be set; otherwise, all clusters drift toward a single cluster. It also uses a constant bandwidth and does not define an upper bound on the number of iterations, so the iteration count has to be set in advance. Here we propose a new Mean Shift algorithm, called Modified Mean Shift (MMS), which overcomes all of these drawbacks. MMS uses a KD-tree data structure to sort the dataset and sets every data point as an initial cluster centroid, without random selection. In each iteration, it shifts the variable-bandwidth sliding windows to the actual data point nearest to the calculated mean, measured by Euclidean distance, and in this way finds the number of clusters automatically. This paper handles missing values using the k-Nearest Neighbor Imputation (kNNI) method. The MMS algorithm produces more accurate results than Mean shift in tests on both synthetic and real datasets.
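The core shift step described above (start from every data point, compute the window mean, then move to the actual data point nearest that mean) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, the bandwidth is held fixed because the abstract does not state the rule for varying it, and a brute-force neighbor search stands in for the KD-tree. kNNI imputation is not shown.

```python
import math


def shift_to_nearest_point(points, center, bandwidth):
    """One shift: window mean around `center`, snapped to a real data point."""
    # Window = all points within `bandwidth` of the current center
    # (brute force here; the abstract uses a KD-tree for this lookup).
    window = [p for p in points if math.dist(p, center) <= bandwidth]
    dim = len(center)
    mean = tuple(sum(p[d] for p in window) / len(window) for d in range(dim))
    # Snap to the actual data point closest to the mean (Euclidean distance).
    return min(points, key=lambda p: math.dist(p, mean))


def snapped_mean_shift(points, bandwidth):
    """Return, for each input point, the data point its window settles on."""
    points = [tuple(p) for p in points]
    modes = []
    for start in points:  # every data point is an initial centroid
        center = start
        seen = {center}
        # Centers are always data points, so in this sketch a trajectory
        # can visit at most len(points) distinct centers before repeating.
        for _ in range(len(points)):
            nxt = shift_to_nearest_point(points, center, bandwidth)
            if nxt in seen:  # fixed point reached (or a cycle detected)
                break
            seen.add(nxt)
            center = nxt
        modes.append(center)
    return modes


# Two small, well-separated groups; bandwidth 1.5 is an arbitrary choice.
data = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1),
        (10, 10), (11, 10), (9, 10), (10, 11), (10, 9)]
modes = snapped_mean_shift(data, bandwidth=1.5)
print(len(set(modes)))
```

The number of distinct entries in `modes` is the number of clusters found. Because each center is snapped to a data point, the loop needs no convergence threshold in this sketch; whether the paper bounds its iterations the same way is not stated in the abstract.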