Semi-Supervised Kernel Discriminative Low-Rank Ridge Regression for Data Classification
Regression is currently a popular research topic in machine learning. However, most existing methods directly perform linear classification on the data after simple preprocessing, or classify after feature selection. They usually ignore the characteristics of the samples themselves, especially for data that are linearly inseparable in the original space, and therefore often produce unsatisfactory classification performance. Moreover, simply mapping the data into a kernel space via the kernel trick before classification makes the classification problem more complex, which likewise yields unsatisfactory results. In this paper, a simple yet effective semi-supervised Kernel discriminative Low-Rank Ridge Regression (KLRRR) model is proposed for data classification, which unifies the kernel trick and discriminant subspace projection. Specifically, the data are first mapped into a kernel space to address linear inseparability in the original space, and the projection matrix of the least squares regression is then decomposed into the product of two factor matrices, so that discriminant subspace projection and regression are performed jointly. Experiments on 12 benchmark data sets show that the proposed KLRRR model substantially improves classification performance compared with several state-of-the-art methods.
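The abstract does not give the model's exact objective, so the sketch below is only one plausible reading of it: kernel ridge regression whose coefficient matrix is factored into a subspace projection P and a regression Q, fitted by alternating closed-form updates. It assumes an RBF kernel, a Frobenius-norm ridge penalty on the product P Q, one-hot labels, and fully labeled data (the semi-supervised handling of unlabeled samples is omitted); the names `klrrr_fit`, `rank`, `lam`, and `gamma` are illustrative, not the authors' API.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """RBF Gram matrix: K[i, j] = exp(-gamma * ||x_i - z_j||^2)."""
    sq = (X**2).sum(1)[:, None] + (Z**2).sum(1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * sq)

def klrrr_fit(X, Y, rank, lam=1.0, gamma=1.0, n_iter=30, seed=0):
    """Alternately minimize  ||Y - K P Q||_F^2 + lam * ||P Q||_F^2
    over P (n x r) and Q (r x c), where K is the kernel Gram matrix,
    P maps kernel features into an r-dimensional discriminant subspace
    and Q regresses that subspace onto the one-hot labels Y (n x c)."""
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    rng = np.random.default_rng(seed)
    P = rng.standard_normal((n, rank))
    # (K'K + lam I)^{-1} K'Y is shared by every P-step, so precompute it.
    A = np.linalg.solve(K.T @ K + lam * np.eye(n), K.T @ Y)
    for _ in range(n_iter):
        # Q-step: zero gradient in Q gives (M'M + lam P'P) Q = M'Y.
        M = K @ P
        Q = np.linalg.solve(M.T @ M + lam * (P.T @ P), M.T @ Y)
        # P-step: zero gradient in P gives (K'K + lam I) P (Q Q') = K'Y Q'.
        P = A @ Q.T @ np.linalg.pinv(Q @ Q.T)
    return P, Q, gamma, X

def klrrr_predict(model, X_new):
    """Soft label scores via the learned kernel regression, argmax as class."""
    P, Q, gamma, X_train = model
    scores = rbf_kernel(X_new, X_train, gamma) @ P @ Q
    return scores.argmax(axis=1)

if __name__ == "__main__":
    # Toy check on two Gaussian blobs.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.standard_normal((50, 2)) + [2.0, 0.0],
                   rng.standard_normal((50, 2)) - [2.0, 0.0]])
    y = np.repeat([0, 1], 50)
    model = klrrr_fit(X, np.eye(2)[y], rank=1, lam=0.1, gamma=0.5)
    print("train accuracy:", (klrrr_predict(model, X) == y).mean())
```

The inner dimension `rank` caps the rank of the overall regression matrix P Q, which is what lets the subspace projection and the regression be learned jointly rather than as separate preprocessing and classification steps.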