A Bi-Level Text Classification Approach for SMS Spam Filtering and Identifying Priority Messages

Short Message Service (SMS) traffic is increasing day by day, and trillions of SMS messages are sent and received by billions of users every day. Spam messages are increasing in the same proportion. A number of recent advances have been made in the field of SMS spam detection and filtering. The objective of this work is twofold: first, to identify and classify spam messages in a collection of SMS messages, and second, to identify the priority (important) messages among the filtered non-spam messages, so that SMS messages can be managed and handled effectively. The work is organized as two levels of binary classification. At the first level, SMS messages are categorized into two classes, spam and non-spam, using popular binary classifiers; at the second level, the non-spam messages are further categorized into priority and normal messages. Four popular state-of-the-art text classification techniques, namely Naïve Bayes (NB), Support Vector Machine (SVM), Latent Dirichlet Allocation (LDA), and Non-negative Matrix Factorization (NMF), are used to categorize the SMS text messages at each level of classification. The proposed bi-level classification model is evaluated using the performance measures accuracy and F-measure. Combinations of classifiers at both levels are compared, and the experiments show that the SVM algorithm performs better both for filtering spam messages and for categorizing priority messages.
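The bi-level scheme described above can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: it assumes a small multinomial Naïve Bayes classifier with add-one smoothing at both levels, whitespace tokenization, and a hypothetical six-message training set. The names `NaiveBayes`, `BiLevelClassifier`, `f_measure`, and the example messages are all illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens (add-one smoothing)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in text.lower().split():
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class BiLevelClassifier:
    """Level 1: spam vs. non-spam. Level 2: priority vs. normal (non-spam only)."""

    def __init__(self):
        self.level1 = NaiveBayes()
        self.level2 = NaiveBayes()

    def fit(self, texts, labels):
        # labels are assumed to be one of "spam", "priority", "normal"
        level1_labels = ["spam" if y == "spam" else "non-spam" for y in labels]
        self.level1.fit(texts, level1_labels)
        non_spam = [(t, y) for t, y in zip(texts, labels) if y != "spam"]
        self.level2.fit([t for t, _ in non_spam], [y for _, y in non_spam])
        return self

    def predict(self, text):
        if self.level1.predict(text) == "spam":
            return "spam"
        return self.level2.predict(text)


def f_measure(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the class `positive`."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, for illustration only.
train_texts = [
    "win free prize now",
    "free cash offer click now",
    "urgent meeting with boss at noon",
    "urgent call the hospital today",
    "see you at lunch",
    "movie tonight with friends",
]
train_labels = ["spam", "spam", "priority", "priority", "normal", "normal"]

model = BiLevelClassifier().fit(train_texts, train_labels)
for message in ["free prize offer", "urgent meeting today", "lunch with friends tonight"]:
    print(message, "->", model.predict(message))
```

Either level could use a different classifier (for example, an SVM), provided it exposes the same `fit` and `predict` methods; the sketch fixes Naïve Bayes at both levels only to stay self-contained.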


[44] Yoon J., Kim H., and Huh J., "Hybrid Spam Filtering for Mobile Communication," Computers and Security, vol. 29, no. 4, pp. 446-459, 2010.

Naresh Kumar Nagwani completed his undergraduate degree in Computer Science and Engineering in 2001 at G. G. Central University, Bilaspur. He completed his Master of Technology in Information Technology at ABV-Indian Institute of Information Technology, Gwalior in 2005 and his Ph.D. in Computer Science and Engineering at the National Institute of Technology Raipur, India in 2013. His areas of interest are data mining, text mining, mining software repositories, and information retrieval. He has worked as a Software Developer and Team Lead at Persistent Systems Limited and is presently an Assistant Professor at NIT Raipur. He has published more than 20 research papers in various journals and conferences.