The International Arab Journal of Information Technology (IAJIT)


Towards Ontology Extraction from Data-Intensive Web Sites: An HTML Forms-Based Reverse

    The advance of the Web has significantly and rapidly changed the way information is organized, shared, and distributed. However, most of the available information has to be interpreted by humans; machine support is rather limited. The next generation of the Web, the semantic web, seeks to make information more usable by machines by introducing a more rigorous structure based on ontologies. In this context, we propose a novel and integrated approach for migrating data-intensive web sites into an ontology-based semantic web, thus making the web content machine-understandable. Our approach is based on the idea that semantics can be extracted from the structures and the instances of HTML forms, which are the most convenient interface for communicating with relational databases on the current Web. These semantics are then exploited to help build the ontology.
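To make the central idea concrete, the following is a minimal sketch, not the authors' method, of how the structure of an HTML form might be read and turned into a candidate ontology class. The names `FormStructureParser`, `form_to_candidate_class`, `extract_candidates`, and `SAMPLE_PAGE` are hypothetical, and the mapping rule (form name becomes a class, each named field becomes a property) is an assumption made only for illustration.

```python
from html.parser import HTMLParser


class FormStructureParser(HTMLParser):
    """Collect the name and the named fields of each <form> in a page."""

    def __init__(self):
        super().__init__()
        self.forms = []       # one dict per form encountered
        self._current = None  # form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"name": attrs.get("name", "UnnamedForm"), "fields": []}
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            name = attrs.get("name")
            if name:
                # For <input>, record its declared type; otherwise record the tag itself.
                kind = attrs.get("type", "text") if tag == "input" else tag
                self._current["fields"].append((name, kind))

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def form_to_candidate_class(form):
    """Assumed mapping: form name -> class, data-carrying fields -> properties."""
    return {
        "class": form["name"],
        "properties": [
            name for name, kind in form["fields"] if kind not in ("submit", "hidden")
        ],
    }


def extract_candidates(html):
    """Parse a page and return one candidate class per form found."""
    parser = FormStructureParser()
    parser.feed(html)
    return [form_to_candidate_class(f) for f in parser.forms]


# Hypothetical sample page, invented for this illustration.
SAMPLE_PAGE = """
<form name="Book" action="/search">
  <input type="text" name="title">
  <input type="text" name="isbn">
  <select name="publisher"><option>Springer</option></select>
  <input type="submit" name="go">
</form>
"""

candidates = extract_candidates(SAMPLE_PAGE)
print(candidates)
```

The sketch covers only form structure; the form instances (the values returned by the underlying database) that the abstract also mentions are left out.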

The International Arab Journal of Information Technology, Vol. 5, No. 1, January 2008

Sidi Benslimane is a lecturer in the Department of Computer Science, Sidi Bel Abbes University, Algeria. He received the MSc degree in computer science from Sidi Bel Abbes University, Algeria, in 2001. He has been a PhD candidate in the Computer Science Department at Sidi Bel Abbes University since December 2002. His research interests include semantic web, web engineering, ontology engineering, and information systems.

Mimoun Malki is an assistant professor at the Department of Computer Science at Sidi Bel Abbes University. He received the PhD degree in computer science from Sidi Bel Abbes University, Algeria, in 2003. He heads the Evolutionary Engineering and Distributed Information Systems Laboratory. His research interests include knowledge management, information retrieval, ontology engineering, semantic web, web services, and soft computing systems.

Mustapha Rahmouni is a professor at the Computer Science Department of the University of Oran Es-Sénia, Algeria. He received the PhD degree in operational research from Southampton University, UK, in 1987. He heads the Information Systems Laboratory and the local Doctoral School on STIC. His research interests include formal specifications, information management and integration, process modelling, and knowledge management.

Abdellatif Rahmoun is an associate professor at King Faisal University, KSA. He received the PhD degree in computer science from Sidi Bel Abbes University, Algeria, in 1998. He has been involved in several research projects and teaching in Algeria. His research interests include logic, genetic algorithms and genetic programming, neural networks and applications, e-learning, e-commerce, and e-business.