..............................
..............................
..............................
Wavelet Tree based Dual Indexing Technique for Geographical Search
Today’s information retrieval systems are facing new challenges in indexing the massive geographical information
available on internet. Though in past, solutions for it, based on R-tree family and B-tree has been given, but due to increased
size of index, they are found to be less efficient and time consuming. This paper presents a dual indexing technique for
Geographical Information Retrieval. It uses wavelet tree data structure for both, textual and spatial indexing. It also allows
dynamic insertion of Minimum Bounding Rectangle (MBR) in the wavelet tree during index construction. The proposed
technique has been evaluated in terms of efficiency and time complexity. For pure spatial indexing, using this technique, the
search time complexity is reduced and takes even less than one third time of that of spatial indexing performed using R-tree or
R*-tree. Even in case of dual indexing (textual and spatial) using wavelet tree, the search time is reduced by half in
comparison to other techniques such as B/R, B/R* when the search query length is larger than 2 keywords. In case the query is
of 1 or 2 keywords, the search time remains approximately similar to that of other techniques.
[1] Andrade L. and Silva M., “Indexing Structures for Geographic Web Retrieval,” in Proceedings of the Conference on Mobile and Ubiquitous Systems, Guimarães, pp. 33-39, 2006.
[2] Bliujute R., Jensen C., Saltenis S., and Slivinskas G., “R-Tree Based Indexing of Now-Relative Bitemporal Data,” in Proceedings of the 24rd International Conference on Very Large Data Bases, San Francisco, pp. 345-356, 1998.
[3] Brisaboa N., Luaces M., Navarro G., and Seco D., “A New Point Access Method Based on Wavelet Trees,” Advances in Conceptual Time(ms) Time(ms) Query Length Query Length 0 150 300 450 600 750 900 1050 1200 1350 23456789101112131415 R-treeR*-TreeWavelet tree 632 The International Arab Journal of Information Technology, Vol. 16, No. 4, July 2019 Modeling-Challenging Perspectives, Berlin, pp. 297-308, 2009.
[4] Brisaboa N., Luaces M., Navarro G., and Seco D., “A Fun Application of Compact Data Structures to Indexing Geographic Data,” in Proceedings of the 5th International Conference on Fun with Algorithms, Iscia, pp. 77-88, 2010.
[5] Claude F., Navarro G., and Ordónez A., “The Wavelet Matrix: An Efficient Wavelet Tree for Large Alphabets,” Information Systems, vol. 47, pp. 15-32, 2015.
[6] Elabd E., Alshari E., and Abdulkader H., “Semantic Boolean Arabic Information Retrieval,” The International Arab Journal of Information Technology, vol. 12, no. 3, pp. 311- 316, 2015.
[7] Gagie T., Navarro G., and Puglisi S., “New Algorithms on Wavelet Trees and Applications to Information Retrieval,” Theoretical Computer Science, vol. 426-427, pp. 25-41, 2012.
[8] Gaede V. and Gjnther O., “Multidimensional Access Methods,” ACM Computing Surveys, vol. 30, no. 2, pp. 170-231, 1998.
[9] Hariharan R., Hore B., Li C., and Mehrotra S., “Processing Spatial-Keyword (SK) Queries in Geographic Information Retrieval (GIR) Systems,” in Proceedings of 19th International Conference on Scientific and Statistical Database Management, Alta, pp. 16, 2007.
[10] Jones C., Abdelmoty A., Finch D., Fu G., and Vaid S., “The SPIRIT Spatial Search Engine: Architecture, Ontologies and Spatial Indexing,” in Proceedings of Geographic Information Science, Berlin, pp. 125-139, 2004.
[11] Lieberman M., Samet H., Sankaranarayanan J., and Sperling J., “STEWARD: Architecture of A Spatio-Textual Search Engine,” in Proceedings of the 15th Annual ACM International Symposium on Advances in Geographic Information Systems, Seattle, pp. 186-193, 2007.
[12] Lin X., Yu B., and Ban Y., “On Indexing Mechanism in Geographical Information Retrieval System,” in Proceedings of 10th AGILE International Conference on Geographic Information Science, Denmark, pp. 1-3, 2007.
[13] Navarro G., “Wavelet Trees for All,” Journal of Discrete Algorithms, vol. 25, pp. 2-20, 2014.
[14] Papadias D., Sellis T., Theodoridis Y., and Egenhofer M., “Topological Relations in The World of Minimum Bounding Rectangles: A Study With R-Trees,” in Proceedings of the ACM SIGMOD International Conference on Management of Data, San Jose, pp. 92-103, 1995.
[15] Su Q. and Widom J., “Indexing Relational Database Content Offline for Efficient Keyword- Based Search,” in Proceedings of 9th International Database Engineering and Application Symposium, Montreal, pp. 297-305, 2005.
[16] Yadav A. and Yadav D., “Wavelet Tree based Hybrid Geo-Textual Indexing Technique for Geographical Search,” Indian Journal of Science and Technology, vol. 8, no. 33, pp. 1-7, 2015.
[17] Yadav A., Yadav D., and Prasad R., “Efficient Textual Web Retrieval using Wavelet Tree,” International Journal of Information Retrieval Research, vol. 6, no. 4, pp. 16-29, 2016.
[18] Zhou Y., Xie X., Wang C., Gong Y., and Ma W., “Hybrid Index Structures for Location-Based Web Search,” in Proceedings of the 14th ACM International Conference on Information and Knowledge Management ACM, Bremen, pp. 155- 166, 2005. Arun Kumar Yadav is Associate Professor in the department of Computer Science and Engineering at Ajay Kumar Garg Engineering College, Ghaziabad (UP) India. He received PhD. degree in Computer Science and Engineering 2016. He guided 1 M. Tech. dissertation and published more than 10 research papers in reputed international/national journals and presented at conferences. His areas of interest are Information retrieval and Database Management System. Divakar Yadav is Associate Professor in the department of Computer Science and Engineering at Madan Mohan Malaviya University of Technology, Gorakhpur (UP) India. He received PhD. degree in Computer Science and Engineering 2010. He spent one year, from Oct 2011 to Oct 2012, in Carlos III University, Leganes, Madrid, Spain as a postdoctoral fellow. He guided four PhD students, 14 M.Tech. dissertation and published more than 65 research papers in reputed international/national journals and presented at conferences. His areas of interest are Information retrieval and soft computing.