Visual Data Mining in Large Geospatial Point Sets
September/October 2004 (vol. 24 no. 5)
pp. 36-44
Daniel A. Keim, University of Constance, Germany
Christian Panse, University of Constance, Germany
Mike Sips, University of Constance, Germany
Stephen C. North, AT&T Labs
The information revolution is creating and publishing vast data sets, such as records of business transactions, environmental statistics, and census demographics. In many application domains, this data is collected and indexed by geospatial location. Discovering interesting patterns in such databases through spatial data mining is key to turning this raw data into valuable information. Challenges arise because newly available geospatial data sets often have millions of records or more, and new techniques are needed to cope with this scale. The Wide Area Layout Data Observer (Waldo) is a novel visual data mining system, based on PixelMaps, for analyzing large geospatial data sets. PixelMaps combine density-based distortion of map regions with local pixel repositioning to highlight clusters and avoid data loss from overplotting. To enhance data exploration, Waldo involves the human in cluster discovery.
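
The idea of local pixel repositioning mentioned in the abstract can be illustrated with a minimal sketch. This is not the PixelMaps algorithm itself, only a simple greedy stand-in: each data point that lands on an already occupied pixel is moved to the nearest free pixel, so no point is hidden by overplotting. The function names `ring` and `reposition`, the integer pixel grid, and the ring-by-ring search order are all assumptions made for illustration, and the density-based distortion step is not covered here.

```python
from typing import Dict, Iterable, List, Tuple

Pixel = Tuple[int, int]


def ring(center: Pixel, radius: int) -> List[Pixel]:
    """Return all pixels at Chebyshev distance `radius` from `center`."""
    cx, cy = center
    if radius == 0:
        return [center]
    cells = []
    for dx in range(-radius, radius + 1):
        for dy in range(-radius, radius + 1):
            if max(abs(dx), abs(dy)) == radius:
                cells.append((cx + dx, cy + dy))
    return cells


def reposition(points: Iterable[Pixel], width: int, height: int) -> Dict[Pixel, int]:
    """Greedy repositioning (hypothetical helper, not the published method).

    Points are assumed to already be in pixel coordinates inside the
    width x height grid. Returns a mapping from each occupied pixel to the
    index of the point drawn there. Points are dropped only if the grid is full.
    """
    occupied: Dict[Pixel, int] = {}
    max_radius = max(width, height)
    for index, (x, y) in enumerate(points):
        # Search outward, ring by ring, until a free in-bounds pixel is found.
        for radius in range(max_radius):
            free = [
                p for p in ring((x, y), radius)
                if 0 <= p[0] < width and 0 <= p[1] < height and p not in occupied
            ]
            if free:
                # Prefer the free pixel closest in Euclidean distance;
                # ties are broken by coordinate order to stay deterministic.
                target = min(free, key=lambda p: ((p[0] - x) ** 2 + (p[1] - y) ** 2, p))
                occupied[target] = index
                break
    return occupied


if __name__ == "__main__":
    # Three records at the same location each end up on their own pixel.
    layout = reposition([(1, 1), (1, 1), (1, 1)], width=3, height=3)
    for pixel, point_index in sorted(layout.items()):
        print(pixel, point_index)
```

A greedy search like this is order dependent, since earlier points claim their true position first. It is meant only to show how repositioning trades a small positional error for keeping every record visible.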


Daniel A. Keim, Christian Panse, Mike Sips, Stephen C. North, "Visual Data Mining in Large Geospatial Point Sets," IEEE Computer Graphics and Applications, vol. 24, no. 5, pp. 36-44, Sept.-Oct. 2004, doi:10.1109/MCG.2004.41