The International Arab Journal of Information Technology (IAJIT)



Enriching Domain Concepts with Qualitative Attributes: A Text Mining based Approach

Attributes, whether qualitative or non-qualitative, formally describe real-world entities and are crucial in modern knowledge representation models such as ontologies. The literature offers ample research on mining non-qualitative attributes (such as the part-of relation) from text and the Web, but comparatively little on mining qualitative attributes (e.g., size, color, taste). This article proposes an analytical framework for retrieving qualitative attribute values from unstructured domain text. The research objective covers two aspects of information retrieval: (1) acquiring quality values from unstructured text and (2) assigning an attribute to each value by comparing the Google-derived meaning, or context, of the attributes with that of the quality values (adjectives). The framework integrates Vector Space Modelling (VSM) with a probabilistic Multinomial Naive Bayes (MNB) classifier. Performance was evaluated on two data sets: (1) the HeiPLAS development data set (106 exemplary adjective-noun phrases) and (2) a text data set in the Medicinal Plant Domain (MPD). The system with the probabilistic approach is found to perform better than the existing state-of-the-art pattern-based framework.
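As a rough illustration of the second step (assigning an attribute to an adjective), the sketch below represents context snippets as bag-of-words term counts and scores them with a Multinomial Naive Bayes classifier using Laplace smoothing. It is not the authors' implementation. The training snippets, the adjective contexts, and the names `TRAIN`, `ADJECTIVE_CONTEXT`, and `tokenize` are hypothetical stand-ins for the Google-derived contexts mentioned in the abstract.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer; a stand-in for real preprocessing."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes over bag-of-words counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (assumed value)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)       # documents per attribute
        self.term_counts = defaultdict(Counter)   # term frequencies per attribute
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(tokenize(doc))
        self.vocab = {t for c in self.term_counts.values() for t in c}
        return self

    def log_scores(self, doc):
        """Return log P(class) + sum of log P(term | class) for each class."""
        total_docs = sum(self.class_counts.values())
        # Terms never seen in training are ignored.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for t in tokens:
                score += math.log((counts[t] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Hypothetical context snippets for three qualitative attributes.
TRAIN = [
    ("hue shade red green pigment bright", "color"),
    ("tint shade yellow dark pale hue", "color"),
    ("flavour sweet bitter sour tongue palate", "taste"),
    ("flavour spicy bitter tangy palate food", "taste"),
    ("large small tall height length dimension", "size"),
    ("tiny huge length width dimension big", "size"),
]

model = MultinomialNB().fit([d for d, _ in TRAIN], [l for _, l in TRAIN])

# Hypothetical context for each adjective (quality value) to be classified.
ADJECTIVE_CONTEXT = {
    "bitter": "sharp flavour on the tongue not sweet",
    "crimson": "deep red hue",
    "tall": "of great height",
}

predictions = {adj: model.predict(ctx) for adj, ctx in ADJECTIVE_CONTEXT.items()}
print(predictions)
```

In this toy setting each adjective is assigned the attribute whose context vocabulary best explains the adjective's own context. A realistic pipeline would need far larger context collections and proper preprocessing.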


The International Arab Journal of Information Technology, Vol. 17, No. 6, November 2020

Niyati Kumari Behera completed her M.Tech. in CSE at N.I.T. Rourkela, Odisha. She is currently pursuing her Ph.D. at Anna University, Chennai. She has published extensively in peer-reviewed journals and international conferences. Her research interests include Text Mining, NLP and Machine Learning.

Guruvayur Suryanarayanan Mahalakshmi received her Master's degree in CSE from Anna University, Chennai. She received her Ph.D. in 2009 in the field of Artificial Intelligence. She has authored numerous research articles in reputed journals and international conferences. She is presently an Associate Professor in the Dept. of Computer Science & Engineering, Anna University, Chennai. Her research interests include Machine Learning, Social Networks, Text Mining and Big Data Analytics.