Scalable Self-Organizing Structured P2P Information Retrieval Model Based
This paper proposes a new autonomous, self-organizing, content-based node-clustering Peer-to-Peer Information
Retrieval (P2PIR) model. The model uses an incremental transitive document-to-document similarity technique to build Local
Equivalence Classes (LECes) of documents on a source node. A Locality Sensitive Hashing (LSH) scheme is applied to map a
representative of each LEC into a set of keys, which are published to the hosting node(s). Similar LECes on different nodes
form Universal Equivalence Classes (UECes), which indicate the connectivity between these nodes. The same LSH scheme is
used to submit queries to the subset of nodes most likely to hold relevant information. The proposed model has been
implemented. The obtained results indicate that the model is efficient in building connectivity between similar nodes and that it
correctly allocates and retrieves relevant answers for a high percentage of queries. The system was tested on different network
sizes and proved to be scalable: efficiency degraded gracefully as the network size grew exponentially.
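
The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper, and it rests on several assumptions: documents are modelled as sets of terms, Jaccard similarity with a fixed threshold stands in for the document-to-document similarity measure, the class representative is taken to be the union of the member term sets, and min-wise hashing is used as one possible LSH family. The names `jaccard`, `local_equivalence_classes` and `minhash_keys` are hypothetical.

```python
import hashlib


def jaccard(a, b):
    """Jaccard similarity of two term sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def local_equivalence_classes(docs, threshold):
    """Group documents transitively, adding one document at a time.

    If sim(a, b) >= threshold and sim(b, c) >= threshold, then a, b and c
    end up in the same class even when sim(a, c) is below the threshold.
    Returns a sorted list of classes, each a list of document indices.
    """
    parent = list(range(len(docs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for j in range(len(docs)):      # each newly arriving document
        for i in range(j):          # compared with the earlier ones
            if jaccard(docs[i], docs[j]) >= threshold:
                parent[find(i)] = find(j)

    classes = {}
    for i in range(len(docs)):
        classes.setdefault(find(i), []).append(i)
    return sorted(classes.values())


def minhash_keys(terms, num_hashes=4):
    """Map a non-empty term set to num_hashes min-hash keys.

    Each key is the smallest 64-bit hash of any term under one seeded
    hash function, so similar term sets tend to share keys.
    """
    keys = []
    for seed in range(num_hashes):
        keys.append(min(
            int.from_bytes(
                hashlib.sha1(f"{seed}:{t}".encode()).digest()[:8], "big")
            for t in terms))
    return keys


docs = [
    {"p2p", "search", "index"},
    {"p2p", "search", "overlay"},
    {"search", "overlay", "routing"},
    {"cooking", "recipes"},
]
classes = local_equivalence_classes(docs, threshold=0.5)
print(classes)  # [[0, 1, 2], [3]]

# Assumed representative: the union of the term sets in each class.
for members in classes:
    representative = set().union(*(docs[i] for i in members))
    print(members, minhash_keys(representative))
```

In this sketch documents 0 and 2 are joined only through document 1, which is the transitive effect the abstract refers to. In a structured overlay, each resulting key would be used as the lookup key for publishing the class representative, and a query would be hashed with the same function to find candidate nodes.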
The International Arab Journal of Information Technology, Vol. 11, No. 1, January 2014

Yaser Al-Lahham received his BS degree from the University of Jordan in 1985, his MS degree from the Arab Academy, Jordan, in 2004, and his PhD degree in computer science from Bradford University, UK, in 2009. He is an assistant professor in the Department of Computer Science at Zarqa University in Jordan. His research interests include P2P information retrieval systems, text clustering, and data mining.

Mohammad Hassan received his BS degree from Yarmouk University in Jordan in 1987, his MS degree from the University of Jordan in 1996, and his PhD degree in computer information systems from Bradford University, UK, in 2003. He is an assistant professor in the Department of Computer Science at Zarqa University in Jordan. His research interests include information retrieval systems and database systems.