Issue No.12 - Dec. (2011 vol.17)
pp: 1822-1831
Janine C. Bennett , Sandia Nat. Labs., Albuquerque, NM, USA
Vaidyanathan Krishnamoorthy , Sci. Comput. & Imaging Inst., Univ. of Utah, Salt Lake City, UT, USA
Shusen Liu , Sci. Comput. & Imaging Inst., Univ. of Utah, Salt Lake City, UT, USA
Ray W. Grout , Nat. Renewable Energy Lab., Golden, CO, USA
Evatt R. Hawkes , Univ. of New South Wales, Sydney, NSW, Australia
Jacqueline H. Chen , Sandia Nat. Labs., Albuquerque, NM, USA
Jason Shepherd , Sandia Nat. Labs., Albuquerque, NM, USA
Valerio Pascucci , Sci. Comput. & Imaging Inst., Univ. of Utah, Salt Lake City, UT, USA
Peer-Timo Bremer , Lawrence Livermore Nat. Lab., Livermore, CA, USA
ABSTRACT
We present a new framework for feature-based statistical analysis of large-scale scientific data and demonstrate its effectiveness by analyzing features from Direct Numerical Simulations (DNS) of turbulent combustion. Turbulent flows are ubiquitous and account for transport and mixing processes in combustion, astrophysics, fusion, and climate modeling, among other disciplines. They are also characterized by coherent structures or organized motion, i.e., nonlocal entities whose geometrical features can directly impact molecular mixing and reactive processes. While traditional multi-point statistics provide correlative information, they lack nonlocal structural information and hence fail to capture the mechanistic causality between organized fluid motion and mixing and reactive processes. It is therefore of great interest to capture and track flow features and their statistics together with their correlation with relevant scalar quantities, e.g., temperature or species concentrations. In our approach, we encode the set of all possible flow features by pre-computing merge trees augmented with attributes, such as statistical moments of various scalar fields, e.g., temperature, as well as length-scales computed via spectral analysis. The computation is performed in an efficient streaming manner in a pre-processing step and results in a collection of meta-data that is orders of magnitude smaller than the original simulation data. This meta-data is sufficient to support a fully flexible and interactive analysis of the features, allowing for arbitrary thresholds, providing per-feature statistics, and creating various global diagnostics such as Cumulative Density Functions (CDFs), histograms, or time-series. We combine the analysis with a rendering of the features in a linked-view browser that enables scientists to interactively explore, visualize, and analyze the equivalent of one terabyte of simulation data.
We highlight the utility of this new framework for combustion science; however, it is applicable to many other science domains.
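As a rough illustration of the idea of threshold-based features with per-feature statistics, the following minimal sketch extracts the connected components of a superlevel set of a 1D scalar field and accumulates the mean and variance of a second field over each component in a single sweep. It is not the authors' implementation: the names `Moments` and `features_above`, the 1D grid, and the toy data are assumptions made for illustration. It also handles only one threshold per call, whereas the framework described above pre-computes a merge tree so that any threshold can be queried afterwards.

```python
from dataclasses import dataclass


@dataclass
class Moments:
    """Running count, mean, and sum of squared deviations (M2)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def add(self, x: float) -> None:
        # Single-pass update of mean and M2 (Welford-style).
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "Moments") -> None:
        # Pairwise combination of two partial results.
        if other.n == 0:
            return
        n = self.n + other.n
        delta = other.mean - self.mean
        self.m2 += other.m2 + delta * delta * self.n * other.n / n
        self.mean += delta * other.n / n
        self.n = n

    @property
    def variance(self) -> float:
        # Population variance; 0.0 for an empty accumulator.
        return self.m2 / self.n if self.n else 0.0


def features_above(field, attribute, threshold):
    """Return one Moments of `attribute` per connected component
    of the superlevel set {i : field[i] >= threshold}."""
    parent = {}  # union-find forest over vertices seen so far
    stats = {}   # component root -> accumulated Moments

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Sweep vertices from the highest field value downwards.
    order = sorted(range(len(field)), key=lambda i: field[i], reverse=True)
    for i in order:
        if field[i] < threshold:
            break
        parent[i] = i
        stats[i] = Moments()
        stats[i].add(attribute[i])
        for j in (i - 1, i + 1):  # 1D grid neighbours
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    # Two features join: combine their statistics.
                    stats[ri].merge(stats.pop(rj))
                    parent[rj] = ri
    return list(stats.values())


field = [1, 5, 4, 0, 3, 6, 2]
temperature = [10, 20, 30, 40, 50, 60, 70]
for feature in features_above(field, temperature, threshold=3):
    print(feature.n, feature.mean, feature.variance)
```

Because the accumulators can be merged, statistics never have to be recomputed from the raw samples when two features join during the sweep. The same property is what would let such attributes be stored per tree branch and combined at query time.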
INDEX TERMS
Feature extraction, Information analysis, Data mining, Data models, Statistical analysis, Multi-variate Data, Topology, Statistics, Data analysis, Data exploration, Visualization in Physical Sciences and Engineering
CITATION
Janine C. Bennett, Vaidyanathan Krishnamoorthy, Shusen Liu, Ray W. Grout, Evatt R. Hawkes, Jacqueline H. Chen, Jason Shepherd, Valerio Pascucci, Peer-Timo Bremer, "Feature-Based Statistical Analysis of Combustion Simulation Data", IEEE Transactions on Visualization & Computer Graphics, vol. 17, no. 12, pp. 1822-1831, Dec. 2011, doi:10.1109/TVCG.2011.199