The International Arab Journal of Information Technology (IAJIT)



A Novel Approach for Sentiment Analysis of Punjabi Text using SVM

Opinion mining, or sentiment analysis, identifies and classifies the sentiments, opinions, and emotions expressed in text. Over the last decade, research interest in this field has extended beyond English to many Indian languages. In this paper, we compare the approaches developed so far and review previous research on Indian languages such as Telugu, Hindi, and Bengali. We develop a hybrid system for sentiment analysis of Punjabi text that integrates a subjective lexicon, N-gram modelling, and a support vector machine. Our work covers generation of the corpus data, a stemming algorithm, generation of a Punjabi subjective lexicon, development of the feature set, and training and testing of the support vector machine. The technique achieves good accuracy on the test data. We also compare against the results reported for previous approaches to validate the accuracy of our system.
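As a rough illustration of how a subjective lexicon, N-gram features, and a support vector machine could be combined, the following minimal sketch may help. It is not the system described in this paper. The toy English lexicon, the training sentences, the feature names `__lex_pos__` and `__lex_neg__`, and the helper functions are all hypothetical stand-ins, and the classifier is a plain linear SVM trained by subgradient descent on the hinge loss rather than any particular SVM toolkit. Stemming and corpus generation are omitted.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption): stands in for a Punjabi subjective lexicon.
POSITIVE = {"good", "happy"}
NEGATIVE = {"bad", "sad"}


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(text):
    """Unigram and bigram counts plus two lexicon-count features."""
    tokens = text.lower().split()
    feats = Counter(ngrams(tokens, 1) + ngrams(tokens, 2))
    feats["__lex_pos__"] = sum(t in POSITIVE for t in tokens)
    feats["__lex_neg__"] = sum(t in NEGATIVE for t in tokens)
    return feats


def train_linear_svm(samples, labels, epochs=50, lr=0.1, reg=0.01):
    """Subgradient descent on the L2-regularised hinge loss; labels are +1 or -1."""
    weights, bias = Counter(), 0.0
    for _ in range(epochs):
        for feats, y in zip(samples, labels):
            score = sum(weights[f] * v for f, v in feats.items()) + bias
            # The regulariser shrinks every weight a little on each step.
            for f in list(weights):
                weights[f] *= 1 - lr * reg
            # The hinge loss contributes only when the margin is violated.
            if y * score < 1:
                for f, v in feats.items():
                    weights[f] += lr * y * v
                bias += lr * y
    return weights, bias


def predict(weights, bias, text):
    """Return +1 (positive) or -1 (negative) for a piece of text."""
    score = sum(weights[f] * v for f, v in features(text).items()) + bias
    return 1 if score >= 0 else -1


# Hypothetical toy training data (assumption), labelled +1 or -1.
train = [
    ("the film was good", 1),
    ("a happy ending", 1),
    ("the film was bad", -1),
    ("a sad ending", -1),
]
weights, bias = train_linear_svm(
    [features(text) for text, _ in train],
    [label for _, label in train],
)
print(predict(weights, bias, "good film"), predict(weights, bias, "bad film"))
```

In this sketch the lexicon counts are ordinary features alongside the N-gram counts, so the classifier can fall back on lexicon evidence when a test sentence contains N-grams it never saw in training.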


The International Arab Journal of Information Technology, Volume 14, No. 5, September 2017

Amandeep Kaur completed her Master's degree in Computer Science & Engineering at Panjab University, Chandigarh, in 2014. Her research interests are natural language processing, image processing, machine learning, and computer vision.

Vishal Gupta is a Sr. Assistant Professor in Computer Science and Engineering at the University Institute of Engineering and Technology, Panjab University, Chandigarh. He has written around 70 research papers in international and national journals and conferences. He has developed a number of research projects in the field of NLP and text mining, such as a keyword extraction system, automatic question answering, and a text summarization system.