The International Arab Journal of Information Technology (IAJIT)



Scalable Self-Organizing Structured P2P Information Retrieval Model Based

This paper proposes a new autonomous, self-organizing, content-based node clustering peer-to-peer Information Retrieval (P2PIR) model. The model uses an incremental transitive document-to-document similarity technique to build Local Equivalence Classes (LECes) of documents on a source node. A Locality Sensitive Hashing (LSH) scheme is applied to map a representative of each LEC into a set of keys, which are published to hosting node(s). Similar LECes on different nodes form Universal Equivalence Classes (UECes), which indicate the connectivity between these nodes. The same LSH scheme is used to submit queries to the subset of nodes most likely to hold relevant information. The proposed model has been implemented. The obtained results indicate that it builds connectivity between similar nodes efficiently, and that it correctly allocates and retrieves relevant answers for a high percentage of queries. The system was tested with different network sizes and proved to be scalable, as efficiency degraded gracefully while the network size grew exponentially.
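The two local steps described above, grouping documents into LECes by incremental transitive similarity and mapping a LEC representative to a set of keys, can be illustrated with a minimal sketch. It is not the implementation evaluated in this paper: the Jaccard measure, the 0.5 threshold, the union-of-terms representative, and the min-hash construction of keys are all assumptions made for illustration, and every function name and document below is hypothetical.

```python
import hashlib


def jaccard(a, b):
    """Jaccard similarity of two term sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def add_document(classes, docs, name, terms, threshold=0.5):
    """Insert one document and return the updated list of classes.

    The new document joins every class that holds at least one document
    similar to it; all such classes merge, which keeps the grouping
    transitive without recomputing it from scratch.
    """
    merged = {name}
    untouched = []
    for cls in classes:
        if any(jaccard(terms, docs[member]) >= threshold for member in cls):
            merged |= cls
        else:
            untouched.append(cls)
    docs[name] = terms
    return untouched + [merged]


def representative(cls, docs):
    """Assumed representative: the union of the terms of all class members."""
    return set().union(*(docs[name] for name in cls))


def _term_hash(term, seed):
    """Deterministic 64-bit hash of a term under a given seed."""
    digest = hashlib.sha1(f"{seed}:{term}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def lsh_keys(terms, num_keys=4, hashes_per_key=2):
    """Map a term set to num_keys keys, each built from min-hash values."""
    if not terms:
        return []
    keys = []
    for k in range(num_keys):
        mins = [
            min(_term_hash(t, k * hashes_per_key + j) for t in terms)
            for j in range(hashes_per_key)
        ]
        keys.append(hashlib.sha1(repr(mins).encode("utf-8")).hexdigest()[:8])
    return keys


# Hypothetical documents on one source node, as sets of index terms.
corpus = {
    "d1": {"peer", "network", "overlay", "routing"},
    "d3": {"network", "overlay", "hash", "lookup"},
    "d4": {"cluster", "document", "similarity"},
    "d2": {"peer", "network", "overlay", "hash"},
}

docs, classes = {}, []
for name, terms in corpus.items():
    classes = add_document(classes, docs, name, terms)

for cls in classes:
    print(sorted(cls), lsh_keys(representative(cls, docs)))
```

In this example d1 and d3 are not directly similar (Jaccard 2/6), yet they end up in the same class once d2 arrives, because d2 is similar to both (Jaccard 3/5 each). A query could be passed through the same `lsh_keys` function, so that a query sharing many terms with a representative has a chance of producing at least one matching key; how keys are assigned to hosting nodes is outside this sketch.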


The International Arab Journal of Information Technology, Vol. 11, No. 1, January 2014

Yaser Al-Lahham received his BS degree from the University of Jordan in 1985, his MS degree from Arab Academy Jordan in 2004, and his PhD degree in computer science from Bradford University, UK, in 2009. He is an assistant professor in the Department of Computer Science at Zarqa University in Jordan. His research interests include P2P information retrieval systems, text clustering, and data mining.

Mohammad Hassan received his BS degree from Yarmouk University in Jordan in 1987, his MS degree from the University of Jordan in 1996, and his PhD degree in computer information systems from Bradford University, UK, in 2003. He is an assistant professor in the Department of Computer Science at Zarqa University in Jordan. His research interests include information retrieval systems and database systems.