Combining Single and Packet-Ray Tracing for Arbitrary Ray Distributions on the Intel MIC Architecture
Sept. 2012 (vol. 18 no. 9)
pp. 1438-1448
M. Ernst, Intel Corp., Santa Clara, CA, USA
S. Woop, Intel Visual Comput. Inst., Saarbruecken, Germany
I. Wald, Intel Visual Comput. Inst., Saarbruecken, Germany
C. Benthin, Intel Visual Comput. Inst., Saarbruecken, Germany
W. R. Mark, Intel Corp., Santa Clara, CA, USA
Wide-SIMD hardware is power and area efficient, but it is challenging to map ray tracing algorithms efficiently to such hardware, especially when the rays are incoherent. The two most commonly used schemes are packet tracing and keeping a separate traversal stack for each SIMD lane. Both work well for coherent rays but suffer when rays are incoherent: the former experiences a dramatic loss of SIMD utilization once rays diverge; the latter requires large local storage and generates multiple incoherent streams of memory accesses that present challenges for the memory system. In this paper, we introduce a single-ray tracing scheme for incoherent rays that uses just one traversal stack on 16-wide SIMD hardware. It uses a bounding-volume hierarchy with a branching factor of four as the acceleration structure, exploits four-wide SIMD in each box and primitive intersection test, and fills the 16-wide SIMD unit by always performing four such node or primitive tests in parallel. We then extend this scheme to a hybrid tracing scheme that automatically adapts to varying ray coherence by starting out with a 16-wide packet scheme and switching to the new single-ray scheme as soon as rays diverge. We show that on the Intel Many Integrated Core architecture this hybrid scheme consistently outperforms both packet and single-ray tracing over a wide range of scenes and ray distributions.
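
The hybrid idea in the abstract (start with a 16-wide packet, fall back to single-ray traversal once rays diverge) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the 16-bit active-lane mask, the helper names countActiveLanes and chooseMode, and in particular the threshold kMinActiveLanesForPacket, since the abstract does not say how divergence is detected.

```cpp
#include <bitset>
#include <cstdint>

// Which traversal kernel to run next (names are illustrative only).
enum class TraversalMode { Packet, SingleRay };

// Hypothetical threshold: keep packet traversal while at least this many of
// the 16 SIMD lanes still hold an active ray. The value 8 is an assumption.
constexpr int kMinActiveLanesForPacket = 8;

// Count the set bits in a 16-bit mask, one bit per SIMD lane.
inline int countActiveLanes(std::uint16_t activeMask) {
    return static_cast<int>(std::bitset<16>(activeMask).count());
}

// Stay in packet mode while enough rays follow the same path through the
// hierarchy; otherwise switch to single-ray traversal for the remaining rays.
inline TraversalMode chooseMode(std::uint16_t activeMask,
                                int minActive = kMinActiveLanesForPacket) {
    return countActiveLanes(activeMask) >= minActive
               ? TraversalMode::Packet
               : TraversalMode::SingleRay;
}
```

A traversal loop built on this sketch would call chooseMode after each node test with the mask of rays that hit the node. Using lane occupancy as the switch criterion is only one plausible reading of "as soon as rays diverge".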

Index Terms:
ray tracing, multiprocessing systems, parallel architectures, Intel many integrated core architecture, packet-ray tracing, arbitrary ray distributions, Intel MIC architecture, traversal stack, SIMD lane, SIMD utilization, multiple incoherent streams, memory accesses, single-ray tracing scheme, 16-wide SIMD hardware, bounding-volume hierarchy, branching factor, primitive intersection test, hybrid tracing scheme, Kernel, Vectors, Registers, Memory management, Hardware, SIMD processors
Citation:
M. Ernst, S. Woop, I. Wald, C. Benthin, W. R. Mark, "Combining Single and Packet-Ray Tracing for Arbitrary Ray Distributions on the Intel MIC Architecture," IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 9, pp. 1438-1448, Sept. 2012, doi:10.1109/TVCG.2011.277