Guest Editor's Introduction: Massive Data Visualization

Michael W. Berry, University of Tennessee

Pages: pp. 16-17

The old proverb, "And out of mind as soon as out of sight," attributed to Lord Brooke (1554-1628), generally referred to the passing of evil or harm. In the context of large-scale data analysis and simulation, however, it might foreshadow the loss of important information. This issue focuses on the use and development of visualization techniques for exploring and managing large-scale scientific and textual information. The four articles that make up this issue demonstrate four different approaches to the visualization of scientific data from applications such as landscape ecology, text retrieval, cosmology, computational chemistry, and material science.

The first article in this issue, by William W. Hargrove and Forrest M. Hoffman, discusses the use of a multivariate geographic clustering methodology to classify ecoregions (regions of similar environmental properties) in landscapes. The volume and heterogeneity of spatially explicit data (map layers) now residing in geographic information systems continue to grow as more and more environmental effects (ecological, social, or economic) must be taken into account for the realistic simulation of land-management scenarios. Hargrove and Hoffman demonstrate how 2D clustering techniques for such GIS-based data can be implemented and interpreted using well-known statistical modeling approaches such as principal component analysis.

The second article, by Andrew Booker, Michelle Condliff, Mark Greaves, Fred B. Holt, Anne Kao, Daniel J. Pierce, Stephen Poteet, and Yuan-Jye Jason Wu, addresses the clustering and presentation of high-dimensional data arising from the association of terms to documents in information modeling and retrieval. Based on a popular vector space representation of terms and documents, these authors demonstrate how computational mathematics and simple 3D graphs can be used in concert to produce a user-driven information-probing environment for text mining. As text-based digital libraries and the WWW continue to grow, manually indexing such repositories will become neither practical nor possible, and interactive information-visualization systems will become of paramount importance.

As part of the NCSA Computational Observatory, Michael L. Norman, John Shalf, Stuart Levy, and Greg Daues at the University of Illinois, Urbana-Champaign, developed a comprehensive workbench for the 3D visualization of large-scale adaptive mesh refinement (AMR) simulations. As discussed in the third article, these researchers have produced an impressive suite of collaborative technologies to advance data analysis in cosmology (such as X-ray galaxy clusters). These technologies include the development of portable file formats, desktop visualization tools, virtual reality applications, and WWW-based tools to archive and analyze AMR data.

The fourth article in this issue, by Hans G. Kaper, Sever Tipei, and Elizabeth Wiebel, explores a completely different approach to visualizing complex data sets: scientific sonification. These authors collaborated on the synthesis of digital sound using their digital instrument for additive sound synthesis and the visualization of sound objects using M4Cave. Multisensory representations of complex data sets can reveal underlying features that go unnoticed when only a single mode of visualization is applied. The challenge lies in mapping data from the sound domain to the visual domain and vice versa.

One common thread that binds all four approaches to visualization is high-performance computing. Whereas we have traditionally viewed the need for parallel and distributed processing in the realm of problem formation (such as mesh generation and refinement) and solution (such as solving linear and nonlinear systems of equations), the need for fast rendering, display, and user interactivity in visual environments is equally important. Certainly this need will focus and stimulate new research in sophisticated data structures, algorithms, and software for large-scale data management and visualization. Recent initiatives by government agencies such as the National Science Foundation to support these efforts will avoid rendering visualization out of sight as soon as out of mind.

About the Authors

Michael W. Berry is an associate professor of computer science at the University of Tennessee, Knoxville. His technical interests lie in the fields of computational science, parallel computation, information retrieval, numerical linear algebra, and performance evaluation. He received his BS in mathematics from the University of Georgia, his MS in applied mathematics from North Carolina State University, and his PhD in computer science from the University of Illinois, Urbana-Champaign. Contact him at the Dept. of Computer Science, Ayres Hall Rm. 107, Univ. of Tennessee, Knoxville, TN 37996-1301.