2008 37th International Conference on Parallel Processing (2008)
Sept. 9, 2008 to Sept. 11, 2008
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICPP.2008.25
Similarity search has been widely studied in peer-to-peer environments. In this paper, we propose the Bounded Locality Sensitive Hashing (Bounded LSH) method for similarity search in P2P file systems. Compared to basic Locality Sensitive Hashing (LSH), Bounded LSH improves space efficiency and query response time in similarity search, especially for high-dimensional data objects that exhibit a non-uniform distribution. We present a simple and space-efficient Bounded LSH that maps a non-uniform data space into load-balanced hash buckets, each containing roughly the same number of objects. Load-balanced hash buckets in Bounded LSH, in turn, require fewer hash tables while maintaining a high probability of returning the objects closest to each request. Our experiments on synthetic and real-world datasets demonstrate the feasibility of the proposed method and its query and space efficiency.
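For orientation, the basic LSH baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of the standard p-stable (Gaussian projection) LSH scheme for Euclidean distance, not the Bounded LSH method itself, whose bounding step the abstract does not describe. The class name `BasicLSH`, its parameters (`num_tables`, `hashes_per_table`, `width`), and the `bucket_sizes` helper are all assumptions introduced here for illustration.

```python
import math
import random
from collections import defaultdict


class BasicLSH:
    """Illustrative basic p-stable LSH index (Euclidean distance).

    Hypothetical sketch of the baseline only; this is NOT Bounded LSH.
    """

    def __init__(self, dim, num_tables=4, hashes_per_table=3, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Each table holds its own (projection vector a, offset b) pairs
        # together with a dict that maps a hash key to a bucket of objects.
        self.tables = []
        for _ in range(num_tables):
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
                for _ in range(hashes_per_table)
            ]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, point):
        # h(v) = floor((a . v + b) / w), concatenated over the table's functions.
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, point)) + b) / self.width)
            for a, b in funcs
        )

    def insert(self, label, point):
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, point)].append((label, point))

    def query(self, point):
        # Gather candidates from the matching bucket of every table,
        # then rank them by exact Euclidean distance to the query.
        candidates = {}
        for funcs, buckets in self.tables:
            for label, stored in buckets.get(self._key(funcs, point), []):
                candidates[label] = stored
        return sorted(candidates, key=lambda lbl: math.dist(candidates[lbl], point))

    def bucket_sizes(self):
        # Per-table list of bucket occupancies, useful for inspecting skew.
        return [[len(b) for b in buckets.values()] for _, buckets in self.tables]
```

With non-uniformly distributed data, `bucket_sizes()` in a sketch like this would typically show a few heavily loaded buckets and many nearly empty ones, which is the imbalance the abstract says Bounded LSH is designed to reduce.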
Y. Hua, D. Feng, B. Xiao and B. Yu, "Bounded LSH for Similarity Search in Peer-to-Peer File Systems," 2008 37th International Conference on Parallel Processing (ICPP), pp. 644-651, 2008.