The International Arab Journal of Information Technology (IAJIT), Vol. 12, No. 6A, 2015



Utilizing Corpus Statistics for Hindi Word Sense Disambiguation



Satyendr Singh received a BE degree in computer science and engineering from Ch. Charan Singh University, Meerut, India, in 2000. He obtained an ME in computer science and engineering from Panjab University, Chandigarh, India, in 2008. Currently, he is pursuing a PhD at the University of Allahabad, Allahabad, India. His research interests include natural language processing, information extraction/retrieval, human-computer interaction, and machine learning.

Tanveer Siddiqui is currently an Assistant Professor at the University of Allahabad, Allahabad, India. She obtained her M.Sc. and Ph.D. degrees in computer science from the University of Allahabad. She has more than 14 years of teaching and research experience in computer science and information technology, with special interest in natural language processing, human-computer interaction, and information extraction and retrieval.

Appendix

Table A1. Translation, transliteration, and details of the sense-annotated Hindi corpus.
Word (Transliteration)
    Sense Number: Translation of Senses in English (Number of Instances)

(Ang)
    Sense 1: Any Part or Organ of Human Body (88)
    Sense 2: Component (30)
    Sense 3: Part of a Community, Organization or Unit (105)
(Ansh)
    Sense 1: Numerator in Maths in Hindi (42)
    Sense 2: Component (36)
    Sense 3: Degree, Measurement of Angle (53)
(Achal)
    Sense 1: Immovable (12)
    Sense 2: Person's Name (34)
    Sense 3: Immovable Property (27)
(Ashok)
    Sense 1: Name of a Tree in India (33)
    Sense 2: Name of an Indian King (21)
(Uttar)
    Sense 1: Answer (30)
    Sense 2: North Direction (79)
    Sense 3: A Person's Name (36)
(Kadam)
    Sense 1: Initiative (16)
    Sense 2: Foot (13)
    Sense 3: Step (11)
(Kamaan)
    Sense 1: Bow, Curved Piece of Resilient Wood with Taut Cord to Propel Arrows (28)
    Sense 2: Command (35)
    Sense 3: A Special Army (e.g., Navy) (33)
(Kalam)
    Sense 1: Pen, Quill (67)
    Sense 2: Cutting of a Tree (69)
    Sense 3: Style of Painting of a Particular Place (66)
    Sense 4: Place Near Ear and Cheeks, Where There Are Hairs (26)
(Kaand)
    Sense 1: Part of Religious Literature (43)
    Sense 2: Negative Event or Happening (29)
(Kumbh)
    Sense 1: Waterpot Made of Mud (65)
    Sense 2: A Sun Sign (Aquarius) in Hindi (58)
    Sense 3: A Holy Event Happening Every 12 Years in India (64)
(Kotaa)
    Sense 1: Reservation, Quota (70)
    Sense 2: Name of A District In Rajasthan in India (64)
(Kriyaa)
    Sense 1: Verb In Hindi Grammar (116)
    Sense 2: Activity, Action (71)
(Quarter)
    Sense 1: A Place Allotted To Live for Temporary Period (26)
    Sense 2: A Quantity of Wine (14)
    Sense 3: A Match, in Which After Winning, A Player or Team Reaches Semi Final (12)
(Khan)
    Sense 1: Mine (60)
    Sense 2: Vast Storage of Subject Knowledge or Quality (13)
    Sense 3: Surname of A Muslim Community in India (65)
(Galla)
    Sense 1: Foodgrains (Wheat, Corn, Cereal) (41)
    Sense 2: Penny Bank, Piggy Bank (29)
(Guna)
    Sense 1: Times (22)
    Sense 2: Name of a District in Madhya Pradesh in India (21)
(Guru)
    Sense 1: Teacher (89)
    Sense 2: Jupiter (Name of a Planet) (60)
(Gram)
    Sense 1: Village (169)
    Sense 2: A Unit Of Measurement, Gram (77)
(Ghatnaa)
    Sense 1: Event (65)
    Sense 2: Lowering Of Water Level, Subside (14)
(Chanda)
    Sense 1: Moon (82)
    Sense 2: Financial Contribution, Subscription (75)
(Charan)
    Sense 1: Stage, Phase (72)
    Sense 2: Foot (49)
    Sense 3: Quarter Part of Anthology (78)
(Chaaraa)
    Sense 1: Domestic Animal's Food, Provender, Forage (100)
    Sense 2: Option (21)
(Chaal)
    Sense 1: Speed (13)
    Sense 2: Move to be Taken In Chess or Similar Games (97)
    Sense 3: A Place Where People Stay, Tenement House (11)
    Sense 4: Behavior (37)
    Sense 5: Strategy in Game, Trick (26)
(Jeena)
    Sense 1: To Live, Survive (39)
    Sense 2: Staircase (33)
(Jeth)
    Sense 1: Name of a Month in Hindi (10)
    Sense 2: Husband's Elder Brother, Brother in Law (20)
(Tika)
    Sense 1: A Sign on Forehead Using Sandalwood (15)
    Sense 2: Vaccination (22)
    Sense 3: To Write About Something in Detail (24)
    Sense 4: A Ceremony to Confirm Marriage in India, Engagement Ceremony (10)
    Sense 5: A Jewelry Which is Worn by Indian Bride on Forehead (24)
(Dabba)
    Sense 1: Box, Made Up of Plastic, Wood or Metal, Bin (21)
    Sense 2: Coach of Train Which Carries Passengers (24)
(Daak)
    Sense 1: Bid, Bidding (60)
    Sense 2: Post, Postal System (59)
(Dhaal)
    Sense 1: Sloping or Sliding Land (31)
    Sense 2: A Protective Covering Used for Saving Attack of Sword, Armour (28)
(Taan)
    Sense 1: Process of Stretching (14)
    Sense 2: Music Tone (19)
(Tav)
    Sense 1: Torrid (18)
    Sense 2: Ream of Paper (8)
(Til)
    Sense 1: Sesame, a Plant From Which Oil Is Extracted From its Seeds (41)
    Sense 2: Mole (263)
(Teer)
    Sense 1: Arrow (103)
    Sense 2: Shore of River or Sea (39)
(Tulsi)
    Sense 1: Basil, a Plant Which is Considered Holy and Medicinal (193)
    Sense 2: A Saint Who was Follower of God Ram and Who Wrote Ramayana (81)
(Tel)
    Sense 1: Oil (128)
    Sense 2: Crude Oil Obtained From Mines (53)
    Sense 3: A Ceremony Performed In Indian Marriages (14)
(Thaan)
    Sense 1: Roll Of Cloth, Bolt (21)
    Sense 2: A Place Where Domestic Animals Are Tied (9)
    Sense 3: Place Of Indian God Or Goddess (8)
(Daksh)
    Sense 1: A King in Indian Mythology Who was Father of Sati and Father in Law of Lord Shiva (64)
    Sense 2: Qualified, Efficient, Skilled (15)
(Dar)
    Sense 1: Standard Cost, Rate (147)
    Sense 2: Door (67)
(Daad)
    Sense 1: To Praise Someone, Accolade (27)
    Sense 2: Skin Disease, Ringworm (51)
(Daam)
    Sense 1: Cost, Price (61)
    Sense 2: Type Of Strategy or Policy (20)
(Dhan)
    Sense 1: Money, Wealth (126)
    Sense 2: Sign of Addition In Mathematics in Hindi, + (16)
(Dhaaraa)
    Sense 1: Law Charges for Crime In Indian Constitution, Section (44)
    Sense 2: River's Flow, Stream (67)
    Sense 3: Flow of Speech, Thought or Events (50)
    Sense 4: Electric Current (67)
(Dhun)
    Sense 1: Music Tune (84)
    Sense 2: Cult, Flakiness, Mania (10)
(Phal)
    Sense 1: Fruit (90)
    Sense 2: Result (79)
    Sense 3: Front Sharp Part of Arrow or Spear (11)
(Baal)
    Sense 1: Hair (111)
    Sense 2: Child (47)
(Mat)
    Sense 1: Religious Community (41)
    Sense 2: Opinion, Thought, Idea (31)
    Sense 3: Vote (92)
(Maang)
    Sense 1: Requirement, Need (13)
    Sense 2: Parting of Hairs On Head Where Married Hindu Woman Put Vermilion As A Sign of Marriage (33)
(Maatra)
    Sense 1: Quantity, Amount, Volume (41)
    Sense 2: Some Time Period in Music (8)
    Sense 3: Vowel Sound in Hindi Speech (39)
(Mool)
    Sense 1: Root Of Plant (6)
    Sense 2: Basic Reason, Fundamental (49)
    Sense 3: Time for a Type of Star (97)
    Sense 4: Capital/Principal Money (40)
(Laal)
    Sense 1: Red Color (129)
    Sense 2: Son, Child (26)
(Vachan)
    Sense 1: Whatever One Speaks or Says, Saying (23)
    Sense 2: Promise, Commitment (27)
    Sense 3: Agent in Hindi Grammar to Denote Singular or Plural (23)
(Varg)
    Sense 1: Community, Category, Class (90)
    Sense 2: Square Object (15)
    Sense 3: Square of Number, Unit of Measurement of Area (e.g., Square Feet) (129)
(Vidhi)
    Sense 1: Way or Process of Doing Something (72)
    Sense 2: Law (69)
(Sher)
    Sense 1: Tiger, Lion (166)
    Sense 2: Type of Urdu Poetry (41)
(Sankraman)
    Sense 1: Process of Sun's Transition From One Star-Sign to Another (28)
    Sense 2: Process of Disease Infection (60)
    Sense 3: Process of Transition From One Place or State to Another Place or State (22)
(Sambandh)
    Sense 1: Relation (23)
    Sense 2: Agent In Hindi Grammar That Shows Relation Between Two Words (33)
    Sense 3: Marriage (8)
(Seema)
    Sense 1: Limit, Threshold (28)
    Sense 2: Boundary, Border (23)
(Sona)
    Sense 1: Gold (65)
    Sense 2: Sleep (24)
(Hal)
    Sense 1: Solution (26)
    Sense 2: Ploughing Instrument, Plough (76)
(Haar)
    Sense 1: Defeat (33)
    Sense 2: Necklace, Garland (63)