# Kernel Logistic Regression Algorithm for Large Scale Data Classification

Kernel Logistic Regression (KLR) is a powerful classification technique that has been applied successfully to many classification problems. However, it is rarely used for large-scale data classification, mainly because it is computationally expensive. In this paper, we present a new KLR algorithm, based on the Truncated Regularized Iteratively Re-weighted Least Squares (TR-IRLS) algorithm, that achieves sparse large-scale data classification in short execution time. This new algorithm is called Nyström Truncated Kernel Logistic Regression (NTR-KLR). The performance achieved using the NTR-KLR algorithm is comparable to that of Support Vector Machine (SVM) methods. Its advantages are that NTR-KLR yields probabilistic outputs and that its extension to the multi-class case is well defined. In addition, its computational complexity is lower than that of SVM methods and it is easy to implement.
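The abstract combines two ingredients: a Nyström low-rank approximation of the kernel matrix and an IRLS (Newton) solver for regularized logistic regression. The paper's full NTR-KLR algorithm is not reproduced here; the following is a minimal illustrative sketch in Python/NumPy, assuming an RBF kernel and randomly chosen landmark points, and using a plain (untruncated) Newton solve in place of TR-IRLS's truncated inner iterations.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def nystrom_features(X, landmarks, gamma=1.0):
    """Nystrom feature map Phi such that K is approximated by Phi @ Phi.T."""
    K_nm = rbf_kernel(X, landmarks, gamma)        # n x m cross block
    W = rbf_kernel(landmarks, landmarks, gamma)   # small m x m block
    vals, vecs = np.linalg.eigh(W)                # eigendecompose the small block
    vals = np.clip(vals, 1e-12, None)             # guard tiny/negative eigenvalues
    return K_nm @ vecs / np.sqrt(vals)            # Phi = K_nm @ W^{-1/2}

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30.0, 30.0)))

def fit_klr_irls(Phi, y, lam=1e-3, n_iter=25):
    """Ridge-regularized IRLS (Newton's method) for logistic regression on Phi."""
    n, m = Phi.shape
    w = np.zeros(m)
    for _ in range(n_iter):
        p = sigmoid(Phi @ w)
        s = p * (1.0 - p)                         # IRLS weights
        grad = Phi.T @ (y - p) - lam * w
        hess = Phi.T @ (Phi * s[:, None]) + lam * np.eye(m)
        w = w + np.linalg.solve(hess, grad)       # Newton step
    return w

# Toy usage: a circular boundary that a linear logistic model cannot fit.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0).astype(float)
landmarks = X[rng.choice(len(X), size=20, replace=False)]
Phi = nystrom_features(X, landmarks, gamma=1.0)
w = fit_klr_irls(Phi, y)
prob = sigmoid(Phi @ w)                           # probabilistic outputs
acc = float(((prob > 0.5) == (y > 0.5)).mean())
```

Because the m landmark points (m = 20 here) are far fewer than the n training points, each Newton step costs O(nm²) rather than the O(n³) of exact KLR, which is the source of the claimed scalability.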

Murtada Elbashir received the BSc degree in computer/statistics from the University of Gezira, Sudan, in 2000, the MSc degree in computer information systems from the Free State University, Bloemfontein, South Africa, in 2003, and the PhD degree in computer science and technology from Central South University, China, in 2013. His current research interests include machine learning and bioinformatics.

Jianxin Wang received the BEng and MEng degrees in computer engineering from Central South University, China, in 1992 and 1996, respectively, and the PhD degree in computer science from Central South University, China, in 2001. He is the chair of and a professor in the Department of Computer Science, Central South University, China. His current research interests include algorithm analysis and optimization, parameterized algorithms, bioinformatics, and computer networks. He is a senior member of the IEEE.