The International Arab Journal of Information Technology (IAJIT)


An Anti-Spam Filter Based on One-Class IB Method in Small Training Sets

We present an approach to email filtering based on the one-class Information Bottleneck (IB) method for small training sets. When the themes of emails change continually, the available training set that is highly relevant to the current theme will be small. Hence, we further show how to estimate the learning algorithm and how to filter spam with small training sets. First, to preserve classification accuracy and avoid over-fitting while substantially reducing the training set size, we formulate the learning framework as the solution of a one-class centroid averaged only over highly positive emails. Second, we design a simple binary classification model that filters spam by comparing the similarity between emails and centroids. Experimental results show that, on small training sets, our method significantly improves classification accuracy compared with currently popular methods such as Naive Bayes, AdaBoost, and SVM.
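The centroid-and-similarity idea summarized above can be illustrated with a minimal sketch. It is not the paper's one-class IB procedure: it assumes whitespace tokenization, normalized term-frequency vectors, and cosine similarity as a stand-in for the similarity measure, and all function names and sample emails are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Map an email to a normalized term-frequency distribution (assumed representation)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: c / total for word, c in counts.items()}


def centroid(vectors):
    """Average a small set of sparse vectors into one class centroid."""
    acc = Counter()
    for vec in vectors:
        for word, weight in vec.items():
            acc[word] += weight
    return {word: weight / len(vectors) for word, weight in acc.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors (stand-in similarity measure)."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, spam_centroid, ham_centroid):
    """Label an email by whichever centroid it is more similar to."""
    vec = vectorize(text)
    if cosine(vec, spam_centroid) > cosine(vec, ham_centroid):
        return "spam"
    return "ham"


# Tiny hypothetical training sets, each reduced to a single centroid.
spam_train = ["win free money now", "free prize claim money"]
ham_train = ["meeting schedule for project", "project report attached for review"]
spam_c = centroid([vectorize(t) for t in spam_train])
ham_c = centroid([vectorize(t) for t in ham_train])

print(classify("claim your free money", spam_c, ham_c))
print(classify("project meeting tomorrow", spam_c, ham_c))
```

Each class is reduced to a single averaged vector, so the model has very few parameters to fit; this is the property the abstract relies on to limit over-fitting when few relevant emails are available.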

Chen Yang received his BE and ME degrees from the School of Information Engineering, Zhengzhou University. Currently, he is a PhD candidate in the School of Information, Renmin University of China, China, and is also an assistant in the School of Software Engineering at Zhengzhou University of Light Industry, China. His research interests include machine learning and Big Data system.

Shaofeng Zhao received his BE and ME degrees from the School of Information Engineering, Zhengzhou University. Currently, he is an Assistant in the library at Henan University of Economics and Law, China. His research interests include cloud computing and cloud storage.

Dan Zhang received her BE degree from the School of Computer, Henan University of Economics and Law, and her ME degree from the School of Information Engineering, Zhengzhou University. Currently, she is an Engineer in the Geophysical Exploration Center of China Earthquake Administration. Her research interests include complex system and machine learning.

Junxia Ma received her ME degree from Zhengzhou University. Currently, she is a lecturer in the School of Software Engineering at Zhengzhou University of Light Industry, China. Her research interests include artificial intelligence, data mining