2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC) (2012)
Salt Lake City, UT
Nov. 10, 2012 to Nov. 16, 2012
Emerging scientific simulations on leadership-class systems are generating huge amounts of data. However, the increasing gap between computation and disk I/O speeds makes traditional data analytics pipelines based on post-processing cost-prohibitive and often infeasible. In this paper, we investigate an alternate approach that aims to bring the analytics closer to the data using data staging and in-situ execution of data analysis operations. Specifically, we present the design, implementation, and evaluation of a framework that supports in-situ feature-based object tracking on distributed scientific datasets. Central to this framework is the scalable decentralized and online clustering (DOC) and cluster tracking algorithm, which executes in-situ (on different cores) and in parallel with the simulation processes, and retrieves data from the simulations directly via on-chip shared memory. The results of our experimental evaluation demonstrate that the in-situ approach significantly reduces the cost of data movement, that the presented framework supports scalable feature-based object tracking, and that it can be used effectively for in-situ analytics for large-scale simulations.
data analysis, feature extraction, microprocessor chips, object tracking, pattern clustering, shared memory systems
F. Zhang, S. Lasluisa, T. Jin, I. Rodero, H. Bui and M. Parashar, "In-situ Feature-Based Objects Tracking for Large-Scale Scientific Simulations," 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC), Salt Lake City, UT, 2012, pp. 736-740.