The International Arab Journal of Information Technology (IAJIT)


Named Entity Recognition for Automated Test Case Generation

Testing is the process of evaluating software or hardware against its requirement specification. It helps to verify and grade a given system. Recent emphasis on Test Driven Development (TDD) has increased the need for testing from the early stages of software development. System test cases can be obtained from a number of user specifications, such as functional requirements, UML diagrams, and use case specifications. This paper focuses on automating the test process from the early stages of requirement elicitation in software development. It describes a semi-supervised technique that generates test cases by identifying named entities in a given set of use cases. The named entities, along with the flow listing of the use cases, serve as the source for a scenario matrix from which a number of test cases can be obtained for a given scenario. The Named Entity Recognizer (NER) is trained on a set of features extracted from the use cases. The automated generation of the entity list was found to increase the efficiency of the overall system.
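To make the idea of a scenario matrix concrete, the following is a minimal illustrative sketch in Python and not the authors' implementation. It assumes that an entity list has already been extracted from a use case and that each alternate flow is triggered by an invalid value for some of those entities. The names `AlternateFlow`, `build_scenario_matrix`, and the login example data are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class AlternateFlow:
    """One alternate flow of a use case (hypothetical structure)."""
    name: str
    # Entities whose invalid value triggers this flow (an assumption).
    invalid_entities: frozenset


def build_scenario_matrix(entities, alternate_flows):
    """Return one row per scenario: the basic flow, then each alternate flow.

    Each row maps every entity to "valid" or "invalid" for that scenario.
    """
    # In the basic flow, every entity is assumed to carry a valid value.
    rows = [{"scenario": "Basic flow", **{e: "valid" for e in entities}}]
    for flow in alternate_flows:
        row = {"scenario": flow.name}
        for entity in entities:
            row[entity] = "invalid" if entity in flow.invalid_entities else "valid"
        rows.append(row)
    return rows


# Hypothetical entity list, such as an NER step might yield for a login use case.
entities = ["username", "password"]
flows = [
    AlternateFlow("Unknown user", frozenset({"username"})),
    AlternateFlow("Wrong password", frozenset({"password"})),
]

matrix = build_scenario_matrix(entities, flows)
for row in matrix:
    print(row)
```

Each row of the resulting matrix can then be turned into one or more test cases by choosing concrete values that satisfy the valid or invalid condition for each entity.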


Guruvayur Mahalakshmi is an Assistant Professor (Senior Grade) in the Department of Computer Science and Engineering, College of Engineering, Anna University, Chennai. She completed her B.E. (Computer Science and Engineering) at R.V.S. College of Engineering and Technology, Dindigul, and her M.E. (Computer Science and Engineering) and Ph.D. at College of Engineering, Anna University, Chennai. She has numerous international journal and conference publications to her credit. She is also the author of the Tamil editions of the Anna University B.E. course textbooks Fundamentals of Computing and Computer Practice. She has authored many book chapters and has more than 100 citations to her credit. Her research interests include Reasoning, Knowledge Sharing and Representation, Text Mining, Social Network Analysis, Bibliometrics, and Natural Language Computing.

Vani Vijayan is a Senior Assistant Professor in the Department of Information Technology at Easwari Engineering College, Anna University, Chennai, Tamilnadu. She completed her Masters in Computer Science and Engineering at Anna University in 2009 and her Bachelors in Information Technology at Bharathiar University, Coimbatore, Tamilnadu, in 2002. She is currently pursuing a Ph.D. degree at Anna University, Chennai, in the Faculty of Information and Communication Engineering, for which she registered in July 2010. She has 10 years of teaching experience. She has guided around 20 UG/PG projects and has published a few papers in national and international conferences. Her primary research interests are Natural Language Processing, Text Mining, and Software Engineering.

Betina Antony is a Research Scholar in the Department of Computer Science and Engineering at Anna University, Chennai, Tamilnadu, India. She completed her Bachelors (Computer Science and Engineering) at Sri Sivasubramania Nadar College of Engineering and her postgraduate degree (Software Engineering) at College of Engineering, Guindy, Anna University, where she secured the gold medal as the first rank holder. She has presented many papers in national and international conferences. She is currently working on Named Entity Recognition for Tamil biomedical texts. Her research interests are Natural Language Processing and Text and Data Mining.