Issue No.12 - Dec. (2011 vol.17)

pp: 2572-2580

ZhenMin Peng, Swansea University, UK

Zhao Geng, Swansea University, UK

Jonathan C. Roberts, Bangor University, UK

Rick Walker, Bangor University, UK

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2011.166

ABSTRACT

Parallel coordinates is a popular and well-known multivariate data visualization technique. However, one of its inherent limitations concerns the rendering of very large data sets. This often causes an overplotting problem, and the goal of the visual information seeking mantra is hampered by a cluttered overview and non-interactive update rates. In this paper, we propose two novel solutions, namely angular histograms and attribute curves. These techniques are frequency-based approaches to large, high-dimensional data visualization. They are able to convey both the density of the underlying polylines and their slopes. Angular histograms and attribute curves offer an intuitive way for the user to explore the clustering, linear correlations and outliers in large data sets without the overplotting and clutter problems associated with traditional parallel coordinates. We demonstrate the results on a wide variety of data sets, including real-world, high-dimensional biological data. Finally, we compare our methods with other popular frequency-based algorithms.

INDEX TERMS

Parallel Coordinates, Angular Histogram, Attribute Curves.

CITATION

ZhenMin Peng, Zhao Geng, Jonathan C. Roberts, Rick Walker, "Angular Histograms: Frequency-Based Visualizations for Large, High Dimensional Data",

*IEEE Transactions on Visualization & Computer Graphics*, vol. 17, no. 12, pp. 2572-2580, Dec. 2011, doi:10.1109/TVCG.2011.166