Issue No.01 - Jan. (2014 vol.25)
pp: 93-103
Yan Qiao, University of Florida, Gainesville
Tao Li, University of Florida, Gainesville
Shigang Chen, University of Florida, Gainesville
ABSTRACT
Bloom filters have been extensively applied in many network functions. Their performance is judged by three criteria: query overhead, space requirement, and false positive ratio. Because they are so widely applicable, any improvement to the performance of Bloom filters can have a broad impact in many areas of networking research. In this paper, we study Bloom-1, a data structure that performs a membership check in one memory access, which compares favorably with the $k$ memory accesses of a standard Bloom filter. We also generalize Bloom-1 to Bloom-$g$ and Bloom-$\alpha$, allowing a performance tradeoff between membership query overhead and false positive ratio. We thoroughly examine the variants in this family of filters, and show that they can be configured to outperform standard Bloom filters with a smaller number of memory accesses, a smaller or equal number of hash bits, and a smaller or comparable false positive ratio in practical scenarios. We also perform experiments based on a real traffic trace to support our filter design.
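The one-memory-access idea can be illustrated with a minimal sketch. It assumes a word-partitioned layout in which one part of the hash selects a single machine word and the remaining hash bits select up to k bit positions inside that word, so a query touches only one word. The class name `Bloom1`, the parameter names, and the use of BLAKE2b as the hash are illustrative assumptions and are not taken from the abstract.

```python
import hashlib


class Bloom1:
    """Illustrative sketch: every element maps to bits inside ONE word."""

    def __init__(self, num_words, word_bits=64, k=3):
        self.num_words = num_words    # number of words in the bit array
        self.word_bits = word_bits    # bits per word (one memory access)
        self.k = k                    # bit positions chosen per element
        self.words = [0] * num_words

    def _locate(self, item):
        # Assumption: a 128-bit BLAKE2b digest supplies all hash bits.
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=16).digest()
        h = int.from_bytes(digest, "big")
        word_index = h % self.num_words      # which word holds the element
        h //= self.num_words
        mask = 0
        for _ in range(self.k):              # k positions inside that word
            mask |= 1 << (h % self.word_bits)
            h //= self.word_bits
        return word_index, mask

    def add(self, item):
        index, mask = self._locate(item)
        self.words[index] |= mask

    def __contains__(self, item):
        index, mask = self._locate(item)
        # A single word is read, versus up to k scattered bits
        # in a standard Bloom filter.
        return (self.words[index] & mask) == mask
```

As with a standard Bloom filter, an inserted element is always reported as present, while an element that was never inserted may occasionally be reported as present (a false positive). Confining all k bits to one word is what trades some false positive ratio for fewer memory accesses.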
INDEX TERMS
Arrays, Memory management, Throughput, Hardware, Random access memory, Information filtering, hash requirement, Bloom filter, memory access, false positive
CITATION
Yan Qiao, Tao Li, Shigang Chen, "Fast Bloom Filters and Their Generalization", IEEE Transactions on Parallel & Distributed Systems, vol. 25, no. 1, pp. 93-103, Jan. 2014, doi:10.1109/TPDS.2013.46