The International Arab Journal of Information Technology (IAJIT)


Adaptive Semantic Indexing of Documents for Locating Relevant Information in P2P Networks

Locating relevant information in Peer-to-Peer (P2P) systems is a challenging problem. Conventional approaches flood the network to locate content, which is no longer practical given the massive amount of information available in P2P systems. For unpopular content, a search may fail to return even a small percentage of the relevant items. In this paper, we present an adaptive semantic content-indexed P2P system. Content indices are generated from the topical semantics of documents, derived using the WordNet ontology. Similarities between document hierarchies are computed using an information-theoretic approach. This enables content to be located and retrieved with minimal document movement, search space, and number of nodes searched. Results illustrate that our work achieves better results than the Content Addressable Network (CAN) semantic P2P Information Retrieval (IR) system. In contrast to the CAN semantic P2P IR system, we use content-aware and node-aware bootstrapping of the search process instead of random bootstrapping.
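The abstract does not state which information-theoretic measure is used to compare concepts in a hierarchy. As an illustration only, the sketch below assumes Lin's measure, sim(a, b) = 2 IC(lcs(a, b)) / (IC(a) + IC(b)), where IC(c) = -log p(c) and lcs is the lowest common subsumer. The hypernym tree, the concept counts, and all function names are hypothetical and are not taken from the paper or from WordNet.

```python
import math

# Hypothetical toy hypernym tree (child -> parent); not taken from WordNet.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}

# Hypothetical corpus counts of direct mentions of each concept.
COUNTS = {"entity": 0, "animal": 2, "vehicle": 1, "dog": 5, "cat": 4, "car": 8}


def ancestors(concept):
    """Return the path from a concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def information_content(concept):
    """IC(c) = -log p(c), where p(c) counts c and all of its descendants."""
    total = sum(COUNTS.values())
    subtree = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return -math.log(subtree / total)


def lowest_common_subsumer(a, b):
    """Most specific concept that is an ancestor of both a and b."""
    seen = set(ancestors(a))
    return next(c for c in ancestors(b) if c in seen)


def lin_similarity(a, b):
    """Lin similarity: 2 * IC(lcs) / (IC(a) + IC(b)), a value in [0, 1]."""
    denom = information_content(a) + information_content(b)
    if denom == 0:
        # Both concepts carry no information (e.g. both are the root).
        return 1.0 if a == b else 0.0
    lcs = lowest_common_subsumer(a, b)
    return 2 * information_content(lcs) / denom
```

In this toy tree, "dog" and "cat" share the subsumer "animal" and therefore score higher than "dog" and "car", whose only common subsumer is the root and carries no information. A document-level similarity would have to aggregate such concept-level scores over whole hierarchies; how the paper does so is not described in the abstract.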

The International Arab Journal of Information Technology, Vol. 12, No. 5, September 2015

Anupriya Elumalai received her bachelor's degree in engineering from the Faculty of Computer Science and Engineering, Madras University, in 1997 and her master's degree in technology in computer science and engineering from VIT University in 2004. Currently, she is working for the Information Technology Department, Ibri College of Technology, Sultanate of Oman. She is pursuing her research in the School of Computing Science and Engineering, VIT University, India. Her research interests include peer-to-peer data management, data mining, knowledge discovery from text data, and information retrieval. She has 15 publications in journals and conferences to her credit. She is a member of ISTE, AIENG and IEEE.

Sriman Narayana received his MS degree in applied mathematics, ME degree in computer science and engineering, and PhD degree in applied mathematics. Currently, he is Director of the Periyar EVR Central Library and also Senior Professor at the School of Computing Science and Engineering at VIT University, India. His research interests include distributed computing, information security, electronic and mobile commerce applications, intelligent computing, and fluid dynamics (porous media). He has 26 years of teaching and research experience, with nearly 145 publications in reputed international journals and conferences. He has authored or co-authored several textbooks and learning materials for the student community. He has chaired many international conferences, delivered keynote, invited, guest, and technical lectures, and served as a PC member and reviewer. He is Editor in Chief of the International Journal of Software Engineering and Applications (IJSEA) of AIRCC, and an Editorial Board member for international journals such as IJConvC (Inderscience-China) and IJCA (USA).