

Issue No.05 - September/October (2010 vol.16)

pp: 815-828

Marc Tchiboukdjian, CNRS and CEA/DAM, DIF, ENSIMAG-Antenne de Montbonnot, Montbonnot Saint Martin

Vincent Danjean, Grenoble Universités and LIG, ENSIMAG-Antenne de Montbonnot, Montbonnot Saint Martin

Bruno Raffin, INRIA and LIG, ENSIMAG-Antenne de Montbonnot, Montbonnot Saint Martin

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2010.19

ABSTRACT

One important bottleneck when visualizing large data sets is the data transfer between processor and memory. Cache-aware (CA) and cache-oblivious (CO) algorithms take the memory hierarchy into account to achieve cache efficiency. CO approaches have the advantage of adapting to unknown and varying memory hierarchies. Recent CA and CO algorithms developed for 3D mesh layouts significantly improve on the performance of previous approaches, but they lack theoretical performance guarantees. In this paper, we present an $O(N \log N)$ algorithm to compute a CO layout for unstructured but well-shaped meshes. We prove that a coherent traversal of an $N$-size mesh in dimension $d$ induces fewer than $N/B + O(N/M^{1/d})$ cache misses, where $B$ and $M$ are the block size and the cache size, respectively. Experiments show that our layout computation is faster and consumes significantly less memory than the best known CO algorithm. Performance is comparable to that of this algorithm for classical visualization algorithm access patterns, and better when the BSP tree produced while computing the layout is used as an acceleration data structure adjusted to the layout. We also show that cache-oblivious approaches lead to significant performance increases on recent GPU architectures.
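
To make the idea of a layout derived from recursive binary partitioning more concrete, here is a minimal Python sketch. It is not the authors' algorithm: it swaps in a plain median split along the axis of largest extent, and it re-sorts at every level, so it runs in O(N log² N) rather than the stated bound. The names `bisection_layout` and `miss_bound_terms` are hypothetical and chosen only for illustration. The second helper evaluates the two terms of the cache-miss bound; the constant hidden in the O-notation is not given in the abstract, so none is applied.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def bisection_layout(points: Sequence[Point]) -> List[int]:
    """Order point indices by recursive median bisection (illustrative only).

    Assumption: a median split along the axis of largest extent stands in
    for the partitioning scheme used in the paper.
    """
    order: List[int] = []

    def recurse(idx: List[int]) -> None:
        if len(idx) <= 1:
            order.extend(idx)
            return
        dim = len(points[idx[0]])
        # Choose the axis along which this subset is most spread out.
        axis = max(
            range(dim),
            key=lambda a: max(points[i][a] for i in idx)
            - min(points[i][a] for i in idx),
        )
        # Split at the median; each half is laid out contiguously.
        idx.sort(key=lambda i: points[i][axis])
        mid = len(idx) // 2
        recurse(idx[:mid])
        recurse(idx[mid:])

    recurse(list(range(len(points))))
    return order


def miss_bound_terms(n: int, b: int, m: int, d: int) -> Tuple[float, float]:
    """Return (N/B, N/M^(1/d)), the two terms of the bound without constants."""
    return n / b, n / m ** (1.0 / d)


# Usage: a 4x4 grid of points, indexed row by row.
grid = [(float(x), float(y)) for x in range(4) for y in range(4)]
layout = bisection_layout(grid)
```

In this sketch, points that are close in space end up close in the resulting order at every scale of the recursion, which is the property a cache-oblivious layout relies on: whatever the block size, a block mostly holds spatial neighbors.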

INDEX TERMS

Cache-aware, cache-oblivious, mesh layouts, data locality, unstructured mesh, isosurface extraction.

CITATION

Marc Tchiboukdjian, Vincent Danjean, Bruno Raffin, "Binary Mesh Partitioning for Cache-Efficient Visualization",

*IEEE Transactions on Visualization & Computer Graphics*, vol.16, no. 5, pp. 815-828, September/October 2010, doi:10.1109/TVCG.2010.19