The International Arab Journal of Information Technology (IAJIT)


Sentiment Analysis System using Hybrid Word

Sentiment analysis has seen a wide range of innovations in the recent past, and the most effective approaches rely on word embedding methods, of which GloVe and Word2Vec are among the most frequently used. A common problem with simple pre-trained embeddings is that they ignore the sentiment information in the input text. They also depend on a large text corpus for training and for generating relevant vectors, which hinders research on smaller corpora. This study proposes a novel methodology for sentiment analysis that uses hybrid embeddings to enhance the features of available pre-trained embeddings. The proposed hybrid embeddings attach Part of Speech (POS) tagging and word2position vectors, in varied combinations, to pre-trained fastText embedding vectors. The resulting hybrid embeddings are fed to our ensemble network, a Convolutional Recurrent Neural Network (CRNN). The methodology was tested with different ensemble deep learning models on a standard sentiment dataset and reached an accuracy of 90.21 on the Movie Review (MVR) Dataset V2. The results show that the proposed methodology is effective for sentiment analysis and can incorporate further linguistic knowledge-based techniques to improve the results.
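The idea of attaching POS and position information to a pre-trained word vector can be illustrated with a minimal sketch. It is not the authors' implementation: the small lookup table `PRETRAINED` is a hypothetical stand-in for fastText vectors, the tag set `POS_TAGS` and the hand-supplied tags are assumptions, and the position feature is simplified to a single relative-position value because the abstract does not specify how the word2position vector is built.

```python
from typing import Dict, List

# Hypothetical stand-in for a pre-trained fastText table (toy 4-dimensional values).
PRETRAINED: Dict[str, List[float]] = {
    "the": [0.1, 0.0, 0.2, 0.1],
    "movie": [0.4, 0.3, 0.0, 0.2],
    "was": [0.0, 0.1, 0.1, 0.0],
    "great": [0.9, 0.7, 0.5, 0.8],
}
EMB_DIM = 4

# Assumed coarse tag set; a real system would take tags from a POS tagger.
POS_TAGS = ["DET", "NOUN", "VERB", "ADJ", "OTHER"]


def pos_one_hot(tag: str) -> List[float]:
    """One-hot encode a POS tag, mapping unknown tags to OTHER."""
    index = POS_TAGS.index(tag) if tag in POS_TAGS else POS_TAGS.index("OTHER")
    return [1.0 if i == index else 0.0 for i in range(len(POS_TAGS))]


def position_feature(index: int, length: int) -> List[float]:
    """Relative position of the token in the sentence, scaled to [0, 1]."""
    return [index / max(length - 1, 1)]


def hybrid_embeddings(tokens: List[str], tags: List[str]) -> List[List[float]]:
    """Concatenate [pre-trained vector | POS one-hot | position] per token."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    rows = []
    for i, (token, tag) in enumerate(zip(tokens, tags)):
        # Words missing from the toy table fall back to a zero vector.
        base = PRETRAINED.get(token.lower(), [0.0] * EMB_DIM)
        rows.append(base + pos_one_hot(tag) + position_feature(i, len(tokens)))
    return rows


tokens = ["The", "movie", "was", "great"]
tags = ["DET", "NOUN", "VERB", "ADJ"]  # hand-supplied for illustration
matrix = hybrid_embeddings(tokens, tags)
print(len(matrix), len(matrix[0]))  # 4 10
```

Each row has 4 + 5 + 1 = 10 values, so the sentence becomes a 4 x 10 matrix that a convolutional or recurrent layer could consume in place of the plain 4 x 4 pre-trained matrix. Concatenation leaves the pre-trained values untouched, which is one simple way to add linguistic features without retraining the embeddings on a large corpus.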

Fahd Alotaibi is an Associate Professor in the Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, Saudi Arabia. His research interests are Artificial Intelligence, Natural Language Processing, Machine Learning, and Information Retrieval. He has written many research papers in reputed international and national journals.

Vishal Gupta is serving as Associate Professor in CSE at UIET, P.U. Chandigarh. He has published more than 90 research papers in top international journals and reputed conferences. His research area is natural language processing. He was selected for the 2019 ranking list of the world's top 2% scientists in Computer Science, released by Stanford University.