The International Arab Journal of Information Technology (IAJIT)



Towards Achieving Optimal Performance using Stacked Generalization Algorithm: A Case Study of

Data mining has benefited many fields of endeavour, and numerous data mining algorithms are available today. A major problem in mining data is selecting the appropriate algorithm or model for the job at hand, which has led researchers to run many comparison experiments. Stacked Generalization is a method of combining multiple models to achieve better accuracy, and many researchers have found it effective over the years. This study investigates how optimal performance can be achieved using the Stacked Generalization algorithm. Six data mining algorithms (PART, REP Tree, J48, Random Tree, RIDOR and JRIP), arranged in two different orders, were used as base learners for two different meta-learners (Random Forest and NNGE) independently, and the results were compared in terms of classification accuracy. The study shows that the order of arrangement of the base learners and the choice of meta-learner can affect the accuracy of the Stacked Generalization method. NNGE outperforms Random Forest as a meta-learner, and its performance, unlike that of Random Forest, is independent of the order of arrangement of the base learners. Malaria fever datasets collected from reputable hospitals in Ado-Ekiti, Ekiti State, Nigeria were purposely used for this study because malaria is a major disease that kills almost a million people yearly in the tropical region of Africa; a more accurate malaria fever diagnosis model is therefore also proposed as a result of this study.
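The general shape of Stacked Generalization can be illustrated with a minimal Python sketch. It is not the study's WEKA setup: the base learners here are simple one-feature threshold rules standing in for PART, J48 and the others, the meta-learner is a lookup table of majority votes standing in for Random Forest or NNGE, and the data are made up for illustration. All function names (`stump_trainer`, `train_meta`, `fit_stack`, `accuracy`) are hypothetical.

```python
from collections import Counter

# Toy, made-up data (not the malaria dataset): two numeric features,
# label 1 only when both features are high.
X = [(1, 1), (2, 8), (8, 2), (9, 9), (3, 3), (1, 7),
     (7, 1), (8, 8), (2, 2), (3, 9), (9, 3), (7, 7)]
y = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]


def stump_trainer(feature):
    """Return a trainer for a one-feature threshold rule (stand-in base learner)."""
    def train(X, y):
        best_acc, best_t = -1, None
        for t in sorted({row[feature] for row in X}):
            # Accuracy of the rule "predict 1 when the feature is >= t".
            acc = sum((row[feature] >= t) == bool(c) for row, c in zip(X, y))
            if acc > best_acc:
                best_acc, best_t = acc, t
        return lambda row: int(row[feature] >= best_t)
    return train


def train_meta(Z, y):
    """Stand-in meta-learner: majority class for each tuple of base predictions."""
    votes = {}
    for z, c in zip(Z, y):
        votes.setdefault(tuple(z), Counter())[c] += 1
    default = Counter(y).most_common(1)[0][0]

    def predict(z):
        seen = votes.get(tuple(z))
        return seen.most_common(1)[0][0] if seen else default
    return predict


def fit_stack(X, y, base_trainers, k=3):
    """Base models give out-of-fold predictions; the meta-learner learns from them."""
    n = len(X)
    Z = [None] * n
    for fold in range(k):
        train_idx = [i for i in range(n) if i % k != fold]
        models = [t([X[i] for i in train_idx], [y[i] for i in train_idx])
                  for t in base_trainers]
        for i in range(fold, n, k):           # rows held out in this fold
            Z[i] = [m(X[i]) for m in models]
    meta = train_meta(Z, y)
    bases = [t(X, y) for t in base_trainers]  # refit base learners on all data
    return lambda row: meta([b(row) for b in bases])


def accuracy(model, X, y):
    return sum(model(row) == c for row, c in zip(X, y)) / len(y)


stack = fit_stack(X, y, [stump_trainer(0), stump_trainer(1)])
single = stump_trainer(0)(X, y)
# Accuracy on the training data itself, so only a sanity check.
print(accuracy(single, X, y), accuracy(stack, X, y))
```

The list passed to `fit_stack` fixes the order of the base learners, which determines the column order of the meta-learner's inputs. The lookup-table meta-learner in this sketch is insensitive to that order; the study reports that this does not hold for every meta-learner, Random Forest being the counterexample.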


Abiodun Oguntimilehin is a Senior Lecturer in the Department of Computer Science, Afe Babalola University, Nigeria. He obtained a Ph.D. in Computer Science from the Federal University of Technology, Akure, Nigeria. He is a chartered member of Computer Professionals (Registration Council of Nigeria) and the Nigeria Computer Society. His research interests are Medical Informatics, Artificial Intelligence and Machine Learning. He has a number of publications in reputable local and international journals.

Olusola Adetunmbi is a Professor in the Department of Computer Science, Federal University of Technology, Akure, Nigeria. He obtained a Ph.D. in Computer Science from the Federal University of Technology, Akure, Nigeria. His research interests are Information Security, Machine Learning and Natural Language Processing. He is a member of the IEEE Computer Society, International Studies on Advanced Intelligence, Computer Professionals (Registration Council of Nigeria) and the Nigeria Computer Society. He has a number of publications in reputable local and international journals.

Innocent Osho is a Professor of Veterinary Parasitology and ethno-veterinary medicine in the Department of Animal Production and Health, Federal University of Technology, Akure, Nigeria. He obtained a Ph.D. in Animal Parasitology from the Federal University of Technology, Akure, Nigeria. He is a member of the Nigeria Veterinary Medicine Association and the European Phyto-chemical Association, among others. He has over 63 publications in reputable local and international journals.