Improved Mean Shift Algorithm for Maximizing Clustering Accuracy

Publication Date - 2021-02-01 00:00:00

Published By : Chinmay Bepery

Clustering is a machine learning method that groups similar data points. Mean Shift (MS) is a fixed-window clustering algorithm that determines the number of clusters automatically but cannot guarantee convergence. Its main drawbacks are that it uses a fixed bandwidth and requires a stopping criterion (a threshold point); without one, all clusters drift towards a single cluster. It also cannot define an upper bound on the number of iterations, so the number of iterations must be set explicitly. This paper proposes a new Mean Shift algorithm, called the Improved Mean Shift (IMS) algorithm, which overcomes all of these pitfalls. IMS uses a KD-tree data structure to sort the dataset and takes all data points as initial cluster centroids, with no random selection of initial centroids. In each iteration, it shifts a variable-bandwidth sliding window to the actual data point nearest to the mean using the k-nearest neighbours (kNN) algorithm, and it finds the number of clusters automatically. The paper also handles missing values using Mean Imputation (MI). The IMS algorithm produces better results than the Mean Shift algorithm on both synthetic and real datasets.
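The core steps named in the abstract (mean imputation, every data point as an initial centroid, and a kNN-defined window that moves to the actual data point nearest the window mean) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mean_impute`, `shift_to_mode`, and `cluster`, the choice of `k`, and the stopping rule are all assumptions made for illustration, and a brute-force neighbour search stands in for the KD-tree to keep the example dependency-free.

```python
def mean_impute(rows):
    """Replace None entries with the mean of the observed values in that column.
    Assumes every column has at least one observed value."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_indices(points, query, k):
    """Indices of the k points nearest to `query`.
    Brute force here; a KD-tree query would replace this (assumption)."""
    order = sorted(range(len(points)), key=lambda i: sq_dist(points[i], query))
    return order[:k]


def shift_to_mode(points, start, k, max_iter=100):
    """Start at points[start]; repeatedly move to the data point nearest the
    mean of the current point's k nearest neighbours. Because the window
    only ever sits on actual data points, the walk stops as soon as it
    would revisit a point (hypothetical stopping rule)."""
    current = start
    visited = {start}
    dim = len(points[0])
    for _ in range(max_iter):
        window = knn_indices(points, points[current], k)
        mean = [sum(points[i][d] for i in window) / len(window)
                for d in range(dim)]
        nearest = knn_indices(points, mean, 1)[0]
        if nearest in visited:
            break
        visited.add(nearest)
        current = nearest
    return current


def cluster(points, k):
    """Use every point as an initial centroid; points whose walks end at the
    same data point share a label, so the cluster count is not preset."""
    label_of_mode = {}
    labels = []
    for i in range(len(points)):
        mode = shift_to_mode(points, i, k)
        labels.append(label_of_mode.setdefault(mode, len(label_of_mode)))
    return labels


# Two well-separated synthetic groups of five points each.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.5, 0.5],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0], [10.5, 10.5]]
print(cluster(points, k=5))  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

In this sketch the window size adapts to local density because it always covers exactly `k` neighbours, which is one plausible reading of "variable bandwidth"; the paper may define the bandwidth differently.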

