Prediction of Part of Speech Tags for Punjabi using Support Vector Machines
        
Part-of-Speech (POS) tagging is the task of assigning the appropriate POS, or lexical category, to each word in a
natural language sentence. In this paper, we work on the automated annotation of POS tags for Punjabi. We
collected a corpus of around 27,000 words, which includes text from various stories, essays, day-to-day conversations,
poems, etc., and divided these words into files of different sizes for training and testing. In our approach, we use a
Support Vector Machine (SVM) to tag Punjabi sentences. To the best of our knowledge, SVMs have never been used for
tagging Punjabi text. The results show that the SVM-based tagger outperforms the existing taggers. In the existing POS
taggers for Punjabi, the accuracy of POS tagging for unknown words is lower than that for known words. In our proposed
tagger, however, high accuracy is achieved for unknown and ambiguous words. The average accuracy of our tagger is 89.86%,
which is better than the existing approaches.
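
The distinction between known and unknown words drawn above can be made concrete with a small evaluation sketch. The function name `accuracy_by_word_type`, the romanized toy tokens, and the tag labels below are illustrative assumptions, not the paper's corpus, tagset, or evaluation code. A word is treated as "known" if it occurs in the training vocabulary.

```python
from collections import Counter


def accuracy_by_word_type(train_vocab, gold, predicted):
    """Return tagging accuracy (in percent) for known, unknown, and all words.

    gold and predicted are equal-length lists of (word, tag) pairs.
    A word counts as "known" if it appears in train_vocab.
    """
    correct = Counter()
    total = Counter()
    for (word, gold_tag), (_, pred_tag) in zip(gold, predicted):
        kind = "known" if word in train_vocab else "unknown"
        # Update both the per-kind bucket and the overall bucket.
        for key in (kind, "overall"):
            total[key] += 1
            correct[key] += int(gold_tag == pred_tag)
    return {key: 100.0 * correct[key] / total[key] for key in total}


# Hypothetical toy data: vocabulary seen in training, gold tags, tagger output.
train_vocab = {"munda", "kitaab", "hai"}
gold = [("munda", "NN"), ("kitaab", "NN"), ("parhda", "VB"), ("hai", "AUX")]
predicted = [("munda", "NN"), ("kitaab", "NN"), ("parhda", "NN"), ("hai", "AUX")]

scores = accuracy_by_word_type(train_vocab, gold, predicted)
print(scores)
```

Reporting the two buckets separately, rather than a single average, is what allows a claim such as "high accuracy for unknown words" to be checked independently of the overall figure.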
The International Arab Journal of Information Technology, Vol. 13, No. 6, November 2016

Dinesh Kumar is an Associate Professor in the Department of Information Technology at DAV Institute of Engineering and Technology, Jalandhar, Punjab, India. He holds a BTech degree in Computer Science and Engineering and an MTech degree in Information Technology, and is currently pursuing a PhD degree in Computer Engineering at Punjabi University, Patiala. He is a member of IEEE, ISTE, and CSI (Computer Society of India). He has more than 12 years of teaching and research experience. He has supervised more than 10 MTech students in natural language processing, machine learning, computer networks, and image processing.

Gurpreet Josan is an Assistant Professor in the Department of Computer Science at Punjabi University, Patiala, India. He holds a PhD degree in Computer Science from Punjabi University in addition to an MTech degree in Computer Engineering. He has more than 12 years of teaching and research experience. He has supervised many MTech students and is supervising five PhD students in natural language processing, machine learning, and computer networks. He also leads and teaches modules at both BTech and MTech levels in computer science.
