Classifying Sentiment of Dialectal Arabic Reviews: A Semi-Supervised Approach
        
Arab Internet users tend to use dialectal words to express how they feel about products, services, and places.
Although Arabic dialects are derived from the formal Arabic language, they differ from it in several aspects.
Arabic sentiment analysis has recently attracted considerable attention from researchers. A large amount of research
has been conducted on Modern Standard Arabic (MSA), but little work has focused on dialectal Arabic. The presence of
dialect in Arabic texts makes sentiment analysis challenging, because dialect usually does not follow specific rules
in writing or speaking. In this paper, we implement a semi-supervised approach for sentiment polarity classification
of dialectal reviews in the presence of MSA. We combine a dialectal sentiment lexicon with four classification
algorithms to perform the polarity classification, namely Support Vector Machines (SVM), Naïve Bayes (NB), Random
Forest, and K-Nearest Neighbor (K-NN). To select the features with which the classifiers perform best, we use three
feature evaluation methods, namely Correlation-based Feature Selection, Principal Components Analysis, and SVM
Feature Evaluation. In the experiment, we applied the approach to a manually collected data set. The experimental
results show that the approach yielded its highest classification accuracy, 92.3%, with the SVM algorithm.
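
The abstract does not spell out how the sentiment lexicon is combined with the classifiers. As a rough illustration only, the sketch below shows one common pattern: counting lexicon hits in a review and using the counts either as features or as a weak label for otherwise unlabeled text. The lexicon entries, the function names `lexicon_features` and `weak_label`, and the whitespace tokenization are all assumptions made for this example, not details taken from the paper.

```python
from collections import Counter

# Hypothetical toy lexicon: word -> polarity (+1 positive, -1 negative).
# These entries are illustrative and are NOT taken from the paper's lexicon.
LEXICON = {
    "ممتاز": 1,   # MSA: excellent
    "حلو": 1,     # dialectal: nice
    "سيء": -1,    # MSA: bad
    "زفت": -1,    # dialectal: awful
}


def lexicon_features(review, lexicon=LEXICON):
    """Count positive and negative lexicon hits in a whitespace-tokenized review."""
    counts = Counter()
    for token in review.split():
        polarity = lexicon.get(token)
        if polarity is not None:
            counts["pos" if polarity > 0 else "neg"] += 1
    return counts["pos"], counts["neg"]


def weak_label(review, lexicon=LEXICON):
    """Return 'positive', 'negative', or None when the lexicon gives no clear signal."""
    pos, neg = lexicon_features(review, lexicon)
    if pos == neg:
        return None  # no hits, or a tie: leave the review unlabeled
    return "positive" if pos > neg else "negative"


# Three made-up reviews: clearly positive, clearly negative, and neutral.
reviews = ["الفندق حلو والخدمة ممتاز", "الاكل زفت", "وصلنا الساعة خمسة"]
labels = [weak_label(r) for r in reviews]
```

In a setup like this, reviews that receive a label could seed a supervised classifier such as SVM or Naïve Bayes, while the `(pos, neg)` counts could be added to the feature vector. Whether the paper follows this exact scheme is not stated in the abstract.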

The International Arab Journal of Information Technology, Vol. 16, No. 6, November 2019

Omar Al-harbi is an assistant professor at the Department of Computer & Information at Jazan Community College, Jazan University, Kingdom of Saudi Arabia. He obtained his PhD in Computer Science, with a specialization in Artificial Intelligence, from the Islamic Science University of Malaysia (USIM) in 2013. He previously obtained his Master's degree in Information Technology from the Northern University of Malaysia (University Utara Malaysia, UUM) in 2009 and his Bachelor's degree in Computer Science from Jerash University, Jordan, in 2007. He has over 8 years of teaching experience. His research interests include natural language processing (NLP), word sense disambiguation (WSD), sentiment analysis, and question answering (QA) systems.
