An Integrated Exploration Approach to Visualizing Multivariate Particle Data
JULY/AUGUST 2008 (Vol. 10, No. 4) pp. 20-29
1521-9615/08/$31.00 © 2008 IEEE

Published by the IEEE Computer Society
Chad Jones, University of California, Davis

Kwan-Liu Ma

Stéphane Ethier, Princeton Plasma Physics Laboratory

Wei-Li Lee, Princeton Plasma Physics Laboratory

The authors describe a data exploration system that visualizes time-varying, multivariate, point-based data from gyrokinetic particle simulations. By using two interaction modes, their system lets researchers explore collections of densely packed particles and discover interesting aspects, such as the location and motion of particles trapped in turbulent plasma flow.

When studying complex phenomena, researchers often use numerical simulations or experiments to isolate and understand possible contributing factors; depending on their simulation needs, they might also collect various types of data (such as scalar, particle, or vector fields). Our system focuses on exploring particle data, which is often used to capture time-dependent changes in simulations from several fields of study, including materials science and physics. In this article, we demonstrate our system's features using data from plasma physics simulations.
Plasma physics is rich in complex, collective phenomena and encompasses major areas of research, including plasma astrophysics and fusion-energy science. As part of their mission to develop practical fusion energy, researchers at the Princeton Plasma Physics Laboratory (PPPL) have made extensive use of particle-in-cell simulations to advance the understanding of energy and particle transport in fusion devices called tokamaks. 1 , 2 With simulations using more than a billion particles, the amount of data produced can truly be overwhelming. Traditional analyses have been limited to the evaluation of macroscopic quantities, such as the heat and particle fluxes in different plasma regions, field energy, flows, and other derived quantities calculated using the moments of the particle-distribution function. The resulting visualizations have been mainly x-y plots, with a few contour plots evolving over time. But modern, multidimensional visualizations of fundamental particle quantities, such as the ones we introduce in this article, can elevate the analysis to a whole new level for fusion scientists. Although initially unfamiliar with these new ways of exploring data, researchers can more effectively confirm or discover data properties that they just couldn't see before.
To solve the challenges that data exploration poses, an application must provide an understanding of particles on global and refined scales. Traditional scientific visualization techniques for rendering particles in physical space provide an understanding of key spatial relationships among variables, but a global understanding of multivariate connections is difficult to convey because no intuitive representation of multidimensional data exists with physical 3D rendering. By integrating multiple data representations, including techniques developed for information visualization, we can alleviate the limitation on expression and control.
From a top-level data view, researchers can use our approach to interact with various parts of the visualization and select groups of particles with specific multidimensional connections. The multivariate aspect of selection is handled via parallel coordinates: selected variables are highlighted, combined with logical operators, or modified across time. Based on the selection lock function's mode, the group of particles examined across time might differ, but researchers gain the ability to examine multiple time-dependent features. All these views culminate in a final data exploration interface in which a modification to any one view updates the appearance of all the other views. This type of linking is essential for revealing particle properties that might be elusive through previous means of particle visualization. Based on initial results, our system has provided PPPL scientists with the ability to explore connections between multiple particle variables, both fundamental and derived. Complicated groupings that would otherwise be difficult to isolate become easy to discover using our multiple-view approach to particle visualization and interaction.
Gyrokinetic Particle Simulations
The main goal of gyrokinetic particle simulations is to study the anomalous energy transport associated with plasma microturbulence. By using 3D gyrokinetic particle-in-cell simulation code, plasma microturbulence studies have dramatically improved the knowledge of instabilities and their effect on plasma confinement. To sustain the required high temperature for fusion reactions, fusion devices must confine the plasma particles using a strong magnetic field, which prevents them from reaching the wall and cooling. The closed geometry containing the necessary properties for plasma confinement is the torus, which is the geometry of all current fusion devices. Due to this geometry and to magnetic confinement, the gradients of temperature, density, and magnetic field generate complex particle motions and drive turbulence in the plasma. PPPL's Gyrokinetic Toroidal Code (GTC) was developed to study these phenomena and has already led to several new insights. 2
The simulation outputs Maxwell potential data in a scalar volume. David Crawford and his coauthors 3 developed a hardware-accelerated volume visualization approach to render the scalar potential data of gyrokinetic simulations. Their key insight was to introduce a transform from the irregular torus shape into an unwrapped square-toroid texture. The ability to explore the time-varying, multivariate particle data, however, is an area of great interest that can benefit from new visualization techniques and user interfaces.
Besides providing a global data view, isolating particle subsets based on multivariate properties is a highly desirable feature when only a small number of particles represent the aspect of interest. For example, the concept of particle trapping in magnetic confinement devices is crucial for understanding energy and particle transport. As particles move along magnetic field lines in a fast spiral motion, they go through regions of increasing magnetic field strength, and energy is transferred from parallel motion to perpendicular motion. When the particle energy is below a certain threshold, the parallel velocity approaches zero in the strong field region and the particle changes direction, exhibiting a back-and-forth motion characteristic of a trapped particle. Above the energy threshold, the so-called passing particles continually circulate around the torus without changing direction. The magnetically trapped particles have a much greater impact on transport than the passing particles. Studying this property in a visualization system is difficult without a way to separate such particles from the larger set.
Particle Exploration System
Unlike analytical and numerical calculations, which are the traditional methods for studying simulation data, a visualization system can provide an intuitive means to explore and understand data. The tool must adapt to changing scientific inquiries by providing a way to isolate data subsets, reduce the visualization's dimensionality, and present it in a more familiar fashion. Our system addresses the multivariate problem by providing variable and physical data views. We use variable visualization to show the relationships and trends among several different variables and also provide an intuitive interface for selecting data items. Physical visualization, however, shows a spatial representation of the data via advanced rendering techniques. In the following, we describe particle data representations and explain the linking and interaction of the views in our system.
Particle Visualization
The physical view is represented with spherical glyphs for a single time step or illuminated path-lines for a range of time steps. In both cases, users choose a single scalar variable, which the system maps to color and opacity for that primitive using a one-dimensional transfer function.
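As a rough illustration of the mapping described above, the following sketch builds a one-dimensional transfer function by linear interpolation between user-defined nodes. The function name `make_transfer_function`, the node format, and the example colors are assumptions made for illustration, not the system's actual code.

```python
from bisect import bisect_right


def make_transfer_function(nodes):
    """Build a lookup from scalar value to (r, g, b, a).

    nodes: iterable of (value, (r, g, b, a)) control points (hypothetical format).
    """
    nodes = sorted(nodes)
    values = [v for v, _ in nodes]

    def lookup(x):
        # Clamp to the end nodes outside the covered value range.
        if x <= values[0]:
            return nodes[0][1]
        if x >= values[-1]:
            return nodes[-1][1]
        # Find the two nodes that bracket x and interpolate linearly.
        i = bisect_right(values, x)
        (v0, c0), (v1, c1) = nodes[i - 1], nodes[i]
        t = (x - v0) / (v1 - v0)
        return tuple(a + t * (b - a) for a, b in zip(c0, c1))

    return lookup


# Example: blue (negative) to transparent white (zero) to orange (positive).
velocity_tf = make_transfer_function([
    (-1.0, (0.0, 0.0, 1.0, 1.0)),
    (0.0, (1.0, 1.0, 1.0, 0.0)),
    (1.0, (1.0, 0.5, 0.0, 1.0)),
])
print(velocity_tf(0.5))
```

Setting the opacity of the middle node to zero hides values near zero, which is one way to reveal only high-magnitude values in a dense view.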


Particle rendering. We've provided the ability to render points with a spherical shape, which improves rendering quality and depth cues over normal point rendering. To sustain interactive rates, we employ view-aligned billboards using hardware-supported point sprites, which other researchers have previously used for flow particles and material point method data. 4 , 5 We still represent particles with a single vertex position, but we can control glyph shape via a 2D texture, which is typically a circle. Hardware shaders provide this circle with a spherical appearance by applying sphere lighting calculations to the surface.
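The sphere lighting step can be sketched on the CPU as follows. In practice this calculation would run per fragment in a hardware shader; the Python function below only illustrates the arithmetic. The name `shade_sprite_texel`, the ambient term, and the default light direction are assumptions, and `light_dir` is assumed to be a unit vector.

```python
import math


def shade_sprite_texel(u, v, light_dir=(0.0, 0.0, 1.0), ambient=0.2):
    """Diffuse intensity for sprite coordinate (u, v), each in [-1, 1].

    Returns None for texels outside the circular glyph.
    """
    r2 = u * u + v * v
    if r2 > 1.0:
        return None  # outside the circle: the fragment would be discarded
    # Treat the circle as the silhouette of a unit sphere facing the viewer
    # and reconstruct the surface normal at this texel.
    normal = (u, v, math.sqrt(1.0 - r2))
    n_dot_l = sum(n * l for n, l in zip(normal, light_dir))
    return ambient + (1.0 - ambient) * max(0.0, n_dot_l)


print(shade_sprite_texel(0.0, 0.0))  # center of the glyph, brightest
print(shade_sprite_texel(1.0, 0.0))  # silhouette edge, ambient only
```

Because the normal is derived from the sprite coordinate alone, each particle still needs only a single vertex position.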

As Figure 1 shows, each particle's color and opacity is controlled by a one-dimensional transfer function. A standard freeform transfer function widget lets users view a data histogram, draw an opacity function, or define multiple color nodes over the data values. Because we use transparency, we must sort the particles before rendering. 4 In terms of memory usage, point sprites don't require any additional storage because we can use the particle positions directly during the rendering stage. Users can increase or decrease the glyph size depending on preference or to reduce overlapping problems. Users can also define particle size to vary with a condition (statistical weight, for example).
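The sorting requirement for semitransparent particles can be illustrated with a minimal back-to-front ordering along the view direction. The function name and argument layout here are hypothetical; the article does not specify which sorting method the system uses.

```python
def sort_back_to_front(positions, eye, view_dir):
    """Return particle indices ordered from farthest to nearest.

    positions: list of (x, y, z) tuples; eye: camera position;
    view_dir: viewing direction (assumed normalized).
    """
    def depth(p):
        # Signed distance from the eye along the view direction.
        return sum((pc - ec) * vc for pc, ec, vc in zip(p, eye, view_dir))

    return sorted(range(len(positions)),
                  key=lambda i: depth(positions[i]),
                  reverse=True)


order = sort_back_to_front(
    [(0.0, 0.0, 1.0), (0.0, 0.0, 5.0), (0.0, 0.0, 3.0)],
    eye=(0.0, 0.0, 0.0),
    view_dir=(0.0, 0.0, 1.0),
)
print(order)
```

Returning indices rather than reordered positions leaves the particle array untouched, so other views can keep referring to particles by their original index.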





Figure 1. Rendered particles using semitransparent, shaded point sprites. (a) Due to particle density, viewing particles of interest is difficult. (b) Our system helps reduce visualization clutter by coloring particles based on parallel velocity from negative (blue) to positive (orange) and by applying a transfer function to reveal high-velocity particles.





Illuminated pathlines. Shaded point sprites provide a detailed view of static particles at a single time step. Animating the particles over time can provide a basic understanding of motion, but tracking individual particles' trajectories beyond a few time steps is difficult. Rendering particle trajectories as pathlines is another potential solution. To be effective, this rendering style must be visually informative and interactive. Consequently, we used current advancements in illuminated line rendering, 6 which is an acceptable trade-off between rendering quality and performance.

To show changes in value over time, we color pathlines with the same transfer function that we used for particles. We can make less important values more transparent to hide unwanted visual information. The rendering method sorts the illuminated pathlines to provide correct blending. 6 We can achieve significant speedups by storing the vertex and color information in vertex buffer objects.

Users can interactively adjust forward and backward line length to help reduce pathline clutter. Animating through the time steps provides further cues on motion and location, letting users visually follow the particle and its trace.
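One simple way to realize adjustable forward and backward line length is to draw only a window of each trajectory around the current time step. The sketch below assumes a trajectory stored as a list with one entry per time step; the function name is hypothetical.

```python
def pathline_window(trajectory, current_step, backward, forward):
    """Return the part of a trajectory visible around current_step.

    backward/forward: number of time steps to show before/after the
    current one; the window is clipped to the available data.
    """
    start = max(0, current_step - backward)
    end = min(len(trajectory), current_step + forward + 1)
    return trajectory[start:end]


steps = list(range(10))  # stand-in for per-step particle positions
print(pathline_window(steps, current_step=5, backward=2, forward=3))
```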

Variable Visualization
The physical view is useful for exploring a single variable, but multivariate exploration requires more than a mere color selection. Users must actively select specific particles based on scientific criteria, which effectively reduces the number of particles rendered and lets users see them. We turn to the popular information visualization technique of parallel coordinates and more general 2D data graphs to assist in finding and selecting particles. Along with an interactive selection scheme, the interface provides valuable information about variable relationships that aren't visible in physical space.


Multivariate selection. Researchers often use parallel coordinates in multivariate studies to display relationships and trends in multiple variables. To improve the quality of our parallel coordinates and avoid oversaturation issues, we use 2D binning to create bin maps, which Matej Novotny and Helwig Hauser 7 used in their work. A 2D bin map exists between every pair of axes on the parallel coordinates plot and acts as a 2D histogram that records the frequency of lines between locations. Our system then renders a global overview of the multivariate data on the parallel coordinates plot, providing more contextual information than typical line rendering. Scalability is another useful consequence of this approach because no matter how many data points we process, the display method uses the bin maps directly to render the parallel coordinates without heavy memory demands. Furthermore, the binning only has to take place when users change to a new time step.
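The bin-map idea can be sketched as one 2D histogram per pair of adjacent axes. This is a simplified illustration under assumed data layouts (particles as tuples with one value per axis, uniform bins), not the implementation from the cited work; `bin_index` and `build_bin_maps` are hypothetical names.

```python
def bin_index(value, lo, hi, bins):
    """Map a value in [lo, hi] to a bin index in [0, bins - 1]."""
    if hi == lo:
        return 0
    i = int((value - lo) / (hi - lo) * bins)
    return min(max(i, 0), bins - 1)


def build_bin_maps(particles, ranges, bins=4):
    """Count lines between bins of each adjacent axis pair.

    particles: list of tuples, one value per axis.
    ranges: list of (lo, hi), one per axis.
    Returns a list of bins x bins matrices; maps[a][i][j] is the number of
    particles falling in bin i on axis a and bin j on axis a + 1.
    """
    n_axes = len(ranges)
    maps = [[[0] * bins for _ in range(bins)] for _ in range(n_axes - 1)]
    for p in particles:
        idx = []
        for a in range(n_axes):
            lo, hi = ranges[a]
            idx.append(bin_index(p[a], lo, hi, bins))
        for a in range(n_axes - 1):
            maps[a][idx[a]][idx[a + 1]] += 1
    return maps


maps = build_bin_maps(
    [(0.0, 1.0), (0.1, 0.9), (0.9, 0.0)],
    ranges=[(0.0, 1.0), (0.0, 1.0)],
    bins=2,
)
print(maps[0])
```

The storage needed is proportional to the number of axis pairs times the square of the bin count, independent of the particle count, which is consistent with the scalability argument above.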

Selecting specific particles now becomes as easy as any brushing operation: from the parallel coordinate plot, users select or deselect regions of each axis using the mouse. As users select these ranges, the corresponding particles become the focus in the parallel coordinates by being rendered as red lines on the plot's forefront. Unlike the global information, which is drawn from bin maps as green quads, the selected particles are drawn with line strips directly from the data values.

The combination of an individual range selection with selections from other axes is controlled by a simple toggle operator, either union or intersection. The default behavior is the union operator, or equivalently, a logical inclusive OR. In union mode, the particles selected by that axis are simply added to the collective selection. If an axis uses the intersection operator, or a logical AND, all selected particles must fall into the intersecting range. Intersecting selections force multiple criteria, which help refine the parameter subset to precise selections. In addition to selection operators, zoom and scale operations can help provide a view that's relevant for every variable.
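The per-axis operators can be expressed with set operations. The text does not spell out the exact evaluation order when union and intersection axes are mixed, so the sketch below adopts one plausible reading: union axes are merged first, and every intersection axis then filters the result. The function name and input format are assumptions.

```python
def combine_axis_selections(axis_selections):
    """Combine brushed selections from several axes.

    axis_selections: list of (operator, particle_ids) pairs, where operator
    is "union" or "intersection" and particle_ids is a set.
    """
    union_sets = [s for op, s in axis_selections if op == "union"]
    inter_sets = [s for op, s in axis_selections if op == "intersection"]

    if union_sets:
        selected = set().union(*union_sets)
    elif inter_sets:
        selected = set(inter_sets[0])
    else:
        return set()

    # Every intersection axis acts as an additional required criterion.
    for s in inter_sets:
        selected &= s
    return selected


high_moment = {1, 2, 3}   # particles brushed on one axis
outer_radius = {3, 4}     # particles brushed on another axis
print(combine_axis_selections([("union", high_moment), ("union", outer_radius)]))
print(combine_axis_selections([("intersection", high_moment),
                               ("intersection", outer_radius)]))
```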

The ability to refine particle selection by specific scientific inquiry is a powerful tool for exploring data. Because the selection meets a certain criterion, we can test hypotheses to see how these particles correlate with other conditions. In addition, users can focus the physical view by choosing to cull away or fade out any unselected particles.



Time-varying variables. Because each parallel coordinate plot only looks at a single time step, we also want to show how particle values change over time. Therefore, users can also view a 2D x-y plot of any variable with our system. As with parallel coordinates, the selected particles are drawn as red lines with semitransparency; the vertical axis represents the variable, and the horizontal axis represents time. Researchers can locate time-dependent patterns, groupings, or any oddities, which might, in turn, reveal a timeframe of interest.

Because the parallel coordinates plot only displays information about a single time step, we provide two ways of locking the current selection before moving to a new timeframe. At each new time step, users can still brush the parallel coordinates to alter the particle collection, but the behavior depends on the current locking mode. Using particle selection lock, the same collection of particles is kept with each time change, regardless of whether the particles' values stay in the previously selected ranges or not. Particle locking mode lets users observe and animate a specific set of particles over time; they can also modify the collection at different time steps by selecting or deselecting particles from the parallel coordinates. Using range selection lock, the collection of particles always reflects the set of variable ranges that were brushed. The number of particles at each time step can change in this mode because particles might leave or enter the locked parameter ranges. Users can alter the locked ranges at any time step using the parallel coordinates plot to create a new selection.
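The difference between the two locking modes can be made concrete with a small sketch. The data layout (a dictionary of per-particle variable values for each time step) and the function names are assumptions; for simplicity, the range test here requires a particle to lie inside every locked range.

```python
def select_in_ranges(step_values, ranges):
    """Return ids of particles whose values lie inside every given range.

    step_values: {particle_id: {variable: value}} for one time step.
    ranges: {variable: (lo, hi)} brushed on the parallel coordinates.
    """
    return {
        pid for pid, vals in step_values.items()
        if all(lo <= vals[var] <= hi for var, (lo, hi) in ranges.items())
    }


def selection_at_step(mode, step_values, locked_ids, locked_ranges):
    """Resolve the selection shown after moving to a new time step."""
    if mode == "particle":
        # Particle selection lock: keep the same particles regardless of
        # whether their values still fall inside the brushed ranges.
        return set(locked_ids)
    if mode == "range":
        # Range selection lock: re-evaluate the brushed ranges, so
        # particles may enter or leave the selection.
        return select_in_ranges(step_values, locked_ranges)
    raise ValueError(f"unknown lock mode: {mode}")


step_1 = {0: {"r": 0.1}, 1: {"r": 0.9}}
step_10 = {0: {"r": 0.8}, 1: {"r": 0.2}}
ranges = {"r": (0.0, 0.5)}
locked = select_in_ranges(step_1, ranges)

print(selection_at_step("particle", step_10, locked, ranges))
print(selection_at_step("range", step_10, locked, ranges))
```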

Interaction with Linked Views
The two types of visualization—physical and variable—are presented as separate windows within the same application, along with a small control panel for various program options. With the ability to view the data in several different views simultaneously, users gain a better understanding of the data itself. The system's real benefit, however, is its ability to interactively explore the data by using the visualizations themselves as user interfaces. Whenever a change is made to any one view, the effects are propagated to other views. In this fashion, users can employ some tools to overcome the shortcomings of others—for example, a transfer function is limited in its capacity to reveal multivariate connections, and as such, users can use the parallel coordinates view to focus on particles with deeper connections. As another example, 2D time plots and 3D colored pathlines can convey a variable's changes over time, but only pathlines can provide an intuitive understanding of complex particle trajectories.
The flexibility of the parallel coordinates to handle additional dimensions with ease is an important aspect of the system. Any scientist is able to continuously modify and expand the set of formulas used within the application, and with each new formula, gain an additional dimension with which to control the exploration. Users can add certain functionality as a variable, such as culling in toroidal coordinate space, and control it by either parallel coordinates or transfer function.
Results
Numerical calculations on plasma particle data have led to many interesting observations, but such calculations don't benefit from visual comprehension or interactive exploration. We discuss several examples of how users can manipulate this system to explore particle data along with some significant findings.
For these examples, we use a numerically simulated plasma data set containing 1 million particles and 1,500 time steps, totaling more than 40 Gbytes of data. A particle is represented by its position in 3D, its velocity parallel to the field, its magnetic moment, and its statistical weight. For each example, the parallel coordinates are toroidal radial distance, trapped condition, parallel velocity, statistical weight, magnetic moment, and distance from center. Part of our future work for this system is performance optimization, although even in its current state, it can render all 1 million particles at more than five frames per second with a 2.33-GHz CPU and an Nvidia GeForce 8800. As the number of data items increases, we will need to place more emphasis on multithreaded and multiprocessor support.
Particles vs. Pathlines
Figure 2 shows the differences between particle glyphs and pathlines. Viewing a single time step is only possible with glyph rendering, but the view also lets users step forward or backward through time to see particle movement. This approach is useful for seeing how a selection at one time step differs from another with the range selection lock function (Figure 2b) and for seeing specific particles' discrete changes at different time steps. Figure 2c shows the change in particle locations between time step 1 (Figure 2a) and time step 10. Pathline rendering, on the other hand, can provide a more general understanding of motion and changes over a specified interval. Using the same selected particles, Figure 2d shows their paths between time steps 1 and 20. We can see how the particles disperse as the simulation begins.




Figure 2. Particle glyphs and pathlines. (a) Using parallel coordinates, we choose particles close to and far from the center. The transfer function colors the particles based on their distance from the center: green (close) to blue (far). After progressing to (b) time step 10, the range selection lock function creates a new particle selection that's close to and far from the center. If we progress to time step 10 with particle selection lock on, (c) we can see the originally selected particles' new locations, and (d) the particle pathlines chosen in (a) between time steps 1 and 20.



Exploring Multiple Variables
Using a 1D transfer function, we can't represent more than one variable on each particle. However, we can restrict this single value by a range of other variables, including derived values. For example, the bottom variable in the parallel coordinate plot in Figure 3a is the projected distance from the torus's center, which is calculated using a combination of the x and y coordinates. In Figure 3b, we select the high values of magnetic moment on the parallel coordinates interface, so the selected particles become visible in both views. Figure 3c shows an additional set of particles, in which the particles that are positioned far from the torus's center are brushed. By changing both axes to the intersection operator in Figure 3d, we obtain a particle selection that has the properties of both previous selections.
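A derived variable such as the projected distance can be added as an extra per-particle value computed from existing ones. The article only says the distance combines the x and y coordinates; the sketch below assumes it is the Euclidean distance in the x-y plane from a torus axis passing through the origin. The function names and the dictionary layout are hypothetical.

```python
import math


def projected_distance(x, y):
    """Distance from the z axis, assuming the torus is centered at the origin."""
    return math.hypot(x, y)


def add_derived_variable(particles, name, formula):
    """Attach a new variable to every particle.

    particles: list of dicts mapping variable names to values.
    formula: function taking one particle dict and returning a number.
    """
    for p in particles:
        p[name] = formula(p)
    return particles


particles = [{"x": 3.0, "y": 4.0, "z": 1.0}]
add_derived_variable(particles, "center_distance",
                     lambda p: projected_distance(p["x"], p["y"]))
print(particles[0]["center_distance"])
```

Once attached, the new value behaves like any other variable and could be shown as an additional parallel coordinates axis or mapped through the transfer function.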




Figure 3. Exploring particles using parallel coordinates. We color the particles by magnetic moment from blue to white to orange for increasing value and brush each axis of the parallel coordinates to select and deselect particles based on specific value ranges. (a) The plot begins without any particles selected, so we select a range of high magnetic moment located on (b) the fifth axis. (c) We add a portion of outer radius particles to the last selection by brushing the bottom axis. We then intersect the two selections (d) using the axis operators. The resulting particles have high magnetic moment and are located at the torus's outer edge.



The interface provides an interactive method of data exploration that users can couple with numerical calculations. Users can study several different conditions at various time steps, or they can specify a single condition and track the particles as they move in and out of it. By using on-the-fly parameter culling, users specifically select a small subset of particles to visualize, which eliminates or reduces the need to preprocess data. This process becomes exceedingly important as the number of particles, and subsequently the data size, increases.
Exploring Pathlines
Pathline rendering aids in the understanding of time-dependent changes because it relates motion, location, and value changes. The pathlines of plasma particles can sometimes become cluttered due to dense particles and overlapping trajectories. However, the system can map any of the defined variables to the pathlines' color and opacity using the transfer function. Consequently, our system gives users some control over interactively finding regions of interest.
Figure 4 illustrates one possible scenario in which the transfer function assists in pathline exploration. The trapped particles in Figure 4a have short, complicated paths; the first image displays all the pathlines using parallel velocity as the coloring variable. The particles seem to exhibit interesting paths, but the high amount of clutter obstructs much of the information. Therefore, we map the transfer function to the radial distance coordinate instead, which lets us effectively see the trajectory patterns at different layers of the torus by gradually increasing the opacity over increasing values. Figure 4b shows the innermost line portions with full opacity, whereas the surrounding areas are semitransparent. This location's trajectories become more apparent, so to further our investigation, we add another layer in Figure 4c. This exploration continues until, finally, in Figure 4d, we see the outer layer in detail while hiding the other ones underneath. This example shows how to find spatially important features, such as pathlines disturbed by turbulent flow, in a dauntingly complex collection of rendered objects.




Figure 4. A dense collection of pathlines using a transfer function mapped to radial distance. We select certain low-velocity particles and generate their paths over a small, interesting time interval. (a) The complex lines are colored by parallel velocity. (b) through (d) By changing this transfer function to some culling parameter, such as the radial coordinate r, we can explore the lines one layer at a time.



Trapped Particles
An interesting feature in plasma simulations is the trapping of particles due to turbulent flow. To better understand this effect, we isolate trapped particles with the parallel coordinates and trace their paths. To identify trapped particles, our system uses a derived formula to classify particles with values between -1 and 1 as trapped. Trapped particles' pathlines exhibit a sudden change in direction as the parallel velocity changes sign.
Figure 5a shows several trapped particles' extended trajectories. Beginning with a large set, the collection of trapped particles is refined over a period of 50 time steps by removing any particles that no longer exhibit the trapped property. Thus, the result is a small set of particles that constantly changes direction during the same interval. Because we color the pathlines by parallel velocity, each sudden change in direction is highlighted in white, indicating near-zero parallel velocity.
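The refinement and the direction changes described above can be sketched as two small functions. The article does not give the derived trapping formula, so the sketch takes precomputed trapped-condition values as input and only applies the stated classification range of -1 to 1 (treated here as an open interval). All names and data layouts are assumptions.

```python
def is_trapped(value):
    """Classify a derived trapped-condition value (formula not shown here)."""
    return -1.0 < value < 1.0


def refine_trapped(trapped_values, steps):
    """Keep only particles that satisfy the trapped condition at every step.

    trapped_values: {particle_id: [trapped-condition value per time step]}.
    steps: iterable of time-step indices to check.
    """
    steps = list(steps)
    return {
        pid for pid, series in trapped_values.items()
        if all(is_trapped(series[t]) for t in steps)
    }


def direction_changes(v_parallel):
    """Return the time steps at which parallel velocity changes sign."""
    return [
        t for t in range(1, len(v_parallel))
        if v_parallel[t - 1] * v_parallel[t] < 0
    ]


values = {0: [0.5, -0.2, 0.9], 1: [0.5, 1.5, 0.2]}
print(refine_trapped(values, range(3)))
print(direction_changes([0.3, 0.1, -0.2, -0.4, 0.1]))
```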




Figure 5. A subset of particles that meet the trapped particle condition with low parallel velocity. (a) We use parallel velocity to draw pathlines of 50 time steps in length in which blue is negative, white is near zero, and orange is positive. We can identify where the parallel velocity changes sign—for example, from blue to orange and vice versa—and where the particles change direction in the simulation. (b) We clamp the trapped particle equation to the desired range (-1, 1) for easier selection. We map the transfer function to the trapped particle condition itself, in which red or yellow indicate trapped particles and green indicates not trapped. We can observe the location and length of several trapped particles over 100 time steps.



An interesting application of the particle selection lock function with the parallel coordinate view is to create an animation that follows a specific particle collection. In the context of trapped particles, it's possible to select the particles trapped at a particular time step within the data, and after turning on the particle selection lock, study how the particles stay correlated and see what waves they're riding by animating the particles and their trailing pathlines. In addition, users can make further changes to the selection at another time step to study other time-varying features, such as trapped particles that stay consistently near zero parallel velocity.
When looking for specific values, it's useful to focus directly on that range of interest. Figure 5b shows a modified trapped particle value on the second axis of the parallel coordinates, which clamps values outside the desired range to the axis's edges. By doing this, the focus of the axis becomes the trapped particles, making it easier to distinguish trapped particles from those that are not trapped. Besides making the selection easier, users can employ the transfer function to color the particles and pathlines by the same trapped condition variable. By observing the pathline color, it's easy to identify when particles are trapped. At the figure's forefront, users can track several particles entering trapped states within a small neighborhood of each other, which indicates a possible commonality.
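Clamping an axis to a range of interest amounts to a simple per-value operation. The sketch below uses the (-1, 1) range mentioned in the Figure 5 caption as the default bounds; the function name is hypothetical.

```python
def clamp_to_axis(values, lo=-1.0, hi=1.0):
    """Clamp each value to [lo, hi] so out-of-range values sit on the axis edges."""
    return [min(max(v, lo), hi) for v in values]


print(clamp_to_axis([-7.5, -0.3, 0.0, 4.2]))
```

After clamping, the whole axis length is devoted to the range of interest, so values inside it spread out and are easier to brush.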
Conclusion
As PPPL researchers continue to improve GTC by adding more physics and new algorithms, they will undertake even more challenging simulations, which will continue to increase the understanding of fusion device properties and could ultimately lead to predictive capability. GTC is also a frontrunner on the road to petascale computing, and the extremely large data sets generated by such simulations will greatly benefit from the advanced visualization tool we describe in this article to navigate particle data's multiple dimensions.
In the future, we plan to optimize and apply our system to much larger examples in an effort to assist more recent simulation runs. In addition, we will investigate new approaches to visualizing multivariate information in the context of 3D exploration. One key issue to improve is particle selection in the time dimension because it might reveal more information about particle properties. An additional aspect of importance is the ability to explore and correlate data from volume scalar fields and particle point data.
Acknowledgments
This work is sponsored in part by the US Department of Energy's SciDAC program and the US National Science Foundation's Information Technology Research (ITR) program.

References

Chad Jones is a PhD student at the University of California, Davis. He's also researching multivariate and multifield visualization with the SciDAC Institute for Ultrascale Visualization. Jones has a BS in computer science and mathematics from the University of Tennessee. He is a student member of the IEEE. Contact him at cejjones@ucdavis.edu.
Kwan-Liu Ma is a professor of computer science at the University of California, Davis, and directs the US Department of Energy's SciDAC Institute for Ultrascale Visualization. His research spans the fields of visualization, high-performance computing, and user-interface design. Ma has a PhD in computer science from the University of Utah. He also serves on the editorial boards of IEEE Computer Graphics and Applications and IEEE Transactions on Visualization and Computer Graphics. He is a senior member of the IEEE. Contact him at ma@cs.ucdavis.edu.
Stéphane Ethier is a computational physicist in the Computational Plasma Physics Group at the Princeton Plasma Physics Laboratory. His current research involves high-performance computing and large-scale gyrokinetic particle-in-cell simulations of microturbulence in magnetic confinement fusion devices. Ethier has a PhD from the Department of Energy and Materials of the Institut National de la Recherche Scientifique (INRS) in Montreal, Canada. Contact him at ethier@pppl.gov.
Wei-Li Lee is a distinguished laboratory fellow at the Princeton Plasma Physics Laboratory. His research specialty is in the area of theory and simulation of magnetic fusion plasmas and high-intensity relativistic beams. Lee has a PhD in physics from Northwestern University. He is a fellow of the American Physical Society. Contact him at wwlee@pppl.gov.