A Measurement of Similarity to Identify Identical

Author Department of Information Technology, Bharathiar University, India

Keywords #

Abstract Code clones are parts of a program that are completely or partially similar to other portions of it. In earlier research, code clones were detected using a fingerprinting technique. The major challenge in our work was to group the code clones based on a similarity measure. The proposed system measures similarity based on similarity distance. The defined expression considers two parameters when calculating the similarity measure, namely the similarity distance and the population of the clone. The code clones are then clustered and ranked on the basis of their similarity measures. Indexing is used to interactively identify clones caused by inconsistent changes. As a result of this work, all the identical clusters for the most similar and more similar categories are identified.
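The abstract does not state the similarity expression itself, so the following is only a minimal sketch of the general idea of clustering clones by a similarity threshold and ranking the clusters. It assumes a stand-in similarity (the token-level ratio from Python's `difflib`) and a hypothetical ranking key of population followed by mean pairwise similarity. The names `similarity`, `cluster_clones`, `rank_clusters`, and the sample fragments are illustrative and are not taken from the paper.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] over whitespace-separated tokens (assumption)."""
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def cluster_clones(fragments: dict, threshold: float) -> list:
    """Group fragments linked by pairwise similarity >= threshold (single linkage)."""
    parent = {name: name for name in fragments}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(fragments, 2):
        if similarity(fragments[a], fragments[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in fragments:
        groups.setdefault(find(name), set()).add(name)
    # A cluster needs at least two members to count as a clone cluster.
    return [g for g in groups.values() if len(g) > 1]


def rank_clusters(clusters: list, fragments: dict) -> list:
    """Rank by population, then mean pairwise similarity (hypothetical ranking key)."""
    def score(cluster):
        pairs = list(combinations(sorted(cluster), 2))
        mean_sim = sum(similarity(fragments[a], fragments[b]) for a, b in pairs) / len(pairs)
        return (len(cluster), mean_sim)

    return sorted(clusters, key=score, reverse=True)


# Illustrative, pre-tokenized fragments (not data from the paper).
fragments = {
    "f1": "int total = 0 ; for ( i = 0 ; i < n ; i ++ ) total += a [ i ] ;",
    "f2": "int total = 0 ; for ( i = 0 ; i < n ; i ++ ) total += a [ i ] ;",
    "f3": "int sum = 0 ; for ( j = 0 ; j < n ; j ++ ) sum += b [ j ] ;",
    "f4": 'printf ( "hello" ) ;',
}

strict = cluster_clones(fragments, threshold=0.9)
loose = cluster_clones(fragments, threshold=0.6)
print(rank_clusters(strict, fragments))
print(rank_clusters(loose, fragments))
```

With a strict threshold only the two identical fragments are grouped; lowering the threshold also admits the renamed variant, which is the kind of banding by similarity range that the abstract describes. The paper's own distance and population weighting may differ from this sketch.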

| Threshold value (ES) | Threshold value (MS) | Methods: clone sets (ES) | Methods: clone sets (MS) | Methods: identical clusters (ES) | Methods: identical clusters (MS) | Files: clone sets (ES) | Files: clone sets (MS) | Files: identical clusters (ES) | Files: identical clusters (MS) | Directory: clone sets (ES) | Directory: clone sets (MS) | Directory: identical clusters (ES) | Directory: identical clusters (MS) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.8-0.85 | 0.6-0.7 | 57 | 57 | 25 | 23 | 110 | 0 | 26 | 0 | 2 | 0 | 1 | 0 |
| 0.85-0.9 | 0.7-0.75 | 54 | 42 | 22 | 19 | 99 | 3 | 18 | 2 | 2 | 0 | 1 | 0 |
| 0.9-1.00 | 0.75-0.8 | 53 | 43 | 14 | 16 | 90 | 3 | 12 | 1 | 2 | 0 | 1 | 0 |

The International Arab Journal of Information Technology, Vol. 12, No. 6A, 2015

Mythili ShanmughaSundaram is a PhD Research Scholar at Bharathiar University, India. She graduated with MCA and MPhil degrees in computer science. She has published and presented papers in various journals and conferences. Her areas of interest include software engineering and software testing.

Sarala Subramani is an Assistant Professor in the Department of Information Technology at Bharathiar University. She completed her PhD in object oriented software testing at Anna University, Chennai. She joined the Department of Computer Science and Engineering, Anna University, as a Junior Research Fellow in December 2001. She completed her B.Sc in Physics at Quiad-E-Millath Women's College, affiliated to Madras University, Chennai, and her M.C.A in Computer Applications at Madras University, Chennai. She has 9 years of teaching and research experience and has presented papers in various national and international conferences. Her areas of interest include software testing, software engineering, object oriented programming concepts, data structures, and compiler design.