The International Arab Journal of Information Technology (IAJIT), Vol. 17, No. 5, September 2020


A Concept-based Sentiment Analysis Approach for Arabic

Concept-Based Sentiment Analysis (CBSA) methods are considered more advanced and more accurate than ordinary sentiment analysis methods because they can detect the emotions conveyed by multi-word expressions (concepts) in language. This paper presents a CBSA system for Arabic that uses both machine learning approaches and a concept-based sentiment lexicon. To extract concepts from Arabic text, a rule-based concept extraction algorithm, called a semantic parser, is proposed. Different feature extraction and representation techniques are tested while building the sentiment analysis model for the presented Arabic CBSA system. Comprehensive comparative experiments, using different classification methods and classifier fusion models together with different combinations of the proposed feature sets, are used to evaluate and test the system. The experimental results show that the best performance is achieved by the combined Support Vector Machine-Logistic Regression (SVM-LR) model, which obtains an F-score of 93.23% using the Concept-Based-Features + Lexicon-Based-Features + Word2vec-Features (CBF+LEX+W2V) feature combination.
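As a rough illustration of the evaluation idea summarized above, the sketch below fuses the outputs of two classifiers and scores the result with the F-score, F = 2PR / (P + R). The abstract does not state how the SVM and LR outputs are combined, so the averaging rule used here is an assumption, and the function names (`fuse_scores`, `predict`, `f_score`) and all numeric values are hypothetical toy inputs, not data or results from the paper.

```python
def fuse_scores(svm_probs, lr_probs):
    """Average the positive-class scores of two classifiers.

    Assumption: simple score averaging; the paper's actual fusion
    rule is not given in the abstract.
    """
    if len(svm_probs) != len(lr_probs):
        raise ValueError("score lists must have equal length")
    return [(s + l) / 2.0 for s, l in zip(svm_probs, lr_probs)]


def predict(probs, threshold=0.5):
    """Label a sample positive (1) when its score reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def f_score(y_true, y_pred):
    """F-score of the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = positive sentiment, 0 = negative sentiment.
y_true = [1, 0, 1, 1, 0, 0]
svm_probs = [0.9, 0.4, 0.3, 0.8, 0.6, 0.1]  # made-up SVM scores
lr_probs = [0.7, 0.2, 0.8, 0.4, 0.3, 0.2]   # made-up LR scores

fused = fuse_scores(svm_probs, lr_probs)
y_pred = predict(fused)
print("fused predictions:", y_pred)
print("F-score:", f_score(y_true, y_pred))
```

Averaging is only one of several possible fusion rules; majority voting or a learned meta-classifier would fit the same interface by replacing `fuse_scores`.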

Ahmed Nasser received his BSc degree from the Control and Systems Engineering Faculty of the University of Technology, Baghdad, Iraq, in 2006, his MSc degree in Computer Engineering from Istanbul University, Istanbul, Turkey, in 2012, and his PhD degree in Computer Engineering from Hacettepe University, Ankara, Turkey, in 2018. His current research interests are data mining, natural language processing, and machine learning.

Hayri Sever is currently working in the Software Engineering Department at Cankaya University. He received his BSc degree in computer science and engineering from Hacettepe University in Ankara, TR, his MSc degree from Maine University in Orono, ME, in 1991, and his PhD degree from Louisiana University in Lafayette, LA, in 1995. His research areas are knowledge discovery in databases, multimedia retrieval models and systems, multimedia systems, uncertainty reasoning, business process management, machine learning, and speech analysis.