The International Arab Journal of Information Technology (IAJIT)

..............................
..............................
..............................


A Dynamic Linkage Clustering using KD-Tree

 Some  clustering  algorithms  calculate  connectivity  o f  each  data  point  to  its  cluster  by  depending  on  de nsity  reachability.  These  algorithms  can  find  arbitrarily   shaped  clusters,  but  they  require  parameters  that  are  mostly  sensitive  to  clustering  performance.  We  develop  a  new  dynamic  li nkage  clustering  algorithm  using  kd-tree.  The  proposed  algorithm  does  not  require  any  parameters  and  does  not  have  a  wors t-case  bound  on  running  time  that  exists  in  many  similar  algorithms  in  the  literature.  Experimental  results  are  shown  in  t his  paper  to  demonstrate  the  effectiveness  of  the  p roposed  algorithm.  We  compare  the  proposed  algorithm  with  other  famous  si milar  algorithm  that  is  shown  in  literature.  We  present  the  proposed  algorithm and its performance in detail along with  promising avenues of future research.   


[1] Abu-Abbas O., Comparisons Between Data Clustering Algorithms, The International Arab Journal of Information Technology , vol. 5, no. 3, pp. 320-325, 2008.

[2] Agarwal P., Alam M., and Biswas R., Analysing the Agglomerative Hierarchical Clustering Algorithm for Categorical Attributes, International Journal of Innovation, Management and Technology , vol. 1, no. 2, pp. 186-190, 2010.

[3] Asuncion A. and Newman D., UCI Repository of Machine Learning Database, available at: http://www.ics.uci.edu/~mlearn/MLrepository. html, last visited 2007.

[4] Ester M., Kriegel H., Sander J., and Xu X., A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, in Proceedings of the 2 nd International Conference on Knowledge Discovery and Data Mining , Portland, pp. 226-231,1996.

[5] Gan G., Ma C., and Wu J., Data Clustering: Theory, Algorithms, and Applications , ASA- SIAM Series on Statistics and Applied Probability, SIAM, Philadelphia, Alexandria, 2007.

[6] Halgamuge S. and Wang L., Classification and Clustering for Knowledge Discover , Springer- Verlag Berlin Heidelberg, New York, 2005.

[7] Hammerly G. and Elkan C., Alternatives to the k-Means Algorithm That Find Better Clusterings, in Proceedings of the 11 th International Conference on Information and Knowledge Management , USA, pp. 600-607, 2002.

[8] Kogan J., Introduction to Clustering Large and High-Dimensional Data , Cambridge University Press, New York, 2007.

[9] Kogan J., Teboulle M., and Nicholas C., Grouping Multidimensional Data , Springer- Verlag, New York, 2006.

[10] Kriegel H., Kr ger P., Sander J., and Zimek A., Density-Based Clustering, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , vol. 1, no. 3, pp. 231-240, 2011.

[11] Lin N., Chang C., Jan N., Chen H., and Hao W., A Deflected Grid-Based Algorithm for Clustering Analysis, International Journal of Mathematical Models and Methods in Applied Sciences , vol. 1, no. 1, pp. 33-39, 2007.

[12] Mount D. and Arya S., ANN: A Library for Approximate Nearest Neighbor Searching, available at: http://www.cs.umd.edu/~mount/ ANN, last visited 2005.

[13] Panigrahy R., An Improved Algorithm Finding Nearest Neighbor Using KD-Trees, in Proceedings of the 8 th Latin American Conference on Theoretical Informatics , Berlin, pp. 387-398, 2008.

[14] Sato-Ilic M. and Jain L., Innovations in Fuzzy Clustering , Springer-Verlag, New York, 2006.

[15] Theodoridis S. and Koutroumbas K., Pattern Recognition , Elsevier Academic Press, Amsterdam, 2003.

[16] Valente J. and Pedrycz W., Advances in Fuzzy Clustering and its Applications , John Wiley & Sons Ltd, England, 2007.

[17] Williams W. and Lambert J., Multivariate Methods in Plant Ecology: V. Similarity Analyses and Information-Analysis, Journal of Ecology , vol. 54, no. 2, pp. 427-445, 1966. Shadi Abudalfa received his BSc and MSc degrees both in computer engineering from the Islamic University of Gaza, Palestine in 2003 and 2010 respectively. He is a lecturer at the University Collage of Applied Sciences, Palestine. From July 2003 to August 2004, he worked as a research assistant at projects and research Lab in IUG. From February 2004 to August 2004, he worked as a teaching assistant at Faculty of Engineering in I UG. Mohammad Mikki is a professor of computer engineering at Islamic University of Gaza, Palestine. His general research interests are in high performance parallel and distributed computing, high speed computer networks and communications protocols and computer networks management, modeling and design of digital computer systems, internet technology and programming, internet performance measurement tools, and web-based learning.