2012 IEEE 8th International Conference on E-Science (2012)
Chicago, IL, USA
Oct. 8, 2012 to Oct. 12, 2012
ISBN: 978-1-4673-4467-8
pp: 1-8
Jong Youl Choi, Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Hasan Abbasi, Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
David Pugmire, Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Norbert Podhorszki, Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Scott Klasky, Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Cristian Capdevila, Electrical Engineering and Computer Science, The University of Tennessee, Knoxville, Tennessee, USA
Manish Parashar, Electrical and Computer Engineering, Rutgers University, Piscataway, New Jersey, USA
Matthew Wolf, School of Computer Science, Georgia Institute of Technology, Atlanta, Georgia, USA
Judy Qiu, School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA
Geoffrey Fox, School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA
ABSTRACT
Predictive pre-fetchers, which predict future data access events and load the data before users request it, have been widely studied, especially for file systems and web content servers, as a way to reduce data load latency. In scientific data visualization in particular, pre-fetching can reduce I/O waiting time. To increase prediction accuracy, we apply a data mining technique that discovers the hidden contexts in data access patterns and make predictions based on the inferred context. Specifically, we use Probabilistic Latent Semantic Analysis (PLSA), a mixture-model-based algorithm popular in text mining, to mine hidden contexts from collected user access patterns, and then run a predictor within the discovered context. We further improve PLSA by applying the Deterministic Annealing (DA) method to overcome the local optimum problem. In this paper, we demonstrate how PLSA and DA optimization can be applied to mine hidden contexts from users' data access patterns and improve the performance of a predictive pre-fetcher.
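The abstract does not give the model equations, so the following is only a minimal sketch of the general idea, under stated assumptions: textbook PLSA fitted with a tempered EM loop, where the exponent beta is raised toward 1 as a simple stand-in for deterministic annealing. The function names (plsa_da, predict_next), the beta schedule, the toy access counts, and the mixture-based ranking used as the "predictor" are all hypothetical illustrations, not the method implemented in ADIOS-P.

```python
import random


def plsa_da(counts, n_contexts, betas=(0.6, 0.8, 1.0), iters_per_beta=30, seed=0):
    """Fit PLSA to counts[d][w] (times session d accessed item w).

    Returns (p_z_d, p_w_z): P(context | session) and P(item | context).
    The beta schedule is an assumed annealing schedule, not one from the paper.
    """
    rng = random.Random(seed)
    n_docs, n_items = len(counts), len(counts[0])

    def normalized(row):
        total = sum(row)
        return [v / total for v in row]

    # Random positive initialisation of both probability tables.
    p_z_d = [normalized([rng.random() + 0.1 for _ in range(n_contexts)])
             for _ in range(n_docs)]
    p_w_z = [normalized([rng.random() + 0.1 for _ in range(n_items)])
             for _ in range(n_contexts)]

    for beta in betas:
        for _ in range(iters_per_beta):
            # Small positive floor keeps every probability strictly positive.
            new_zd = [[1e-12] * n_contexts for _ in range(n_docs)]
            new_wz = [[1e-12] * n_items for _ in range(n_contexts)]
            for d in range(n_docs):
                for w in range(n_items):
                    n = counts[d][w]
                    if n == 0:
                        continue
                    # Tempered E-step: P(z | d, w) proportional to
                    # (P(z | d) * P(w | z)) ** beta.
                    post = normalized([(p_z_d[d][z] * p_w_z[z][w]) ** beta
                                       for z in range(n_contexts)])
                    # Accumulate expected counts for the M-step.
                    for z in range(n_contexts):
                        new_zd[d][z] += n * post[z]
                        new_wz[z][w] += n * post[z]
            # M-step: renormalise expected counts into probabilities.
            p_z_d = [normalized(row) for row in new_zd]
            p_w_z = [normalized(row) for row in new_wz]
    return p_z_d, p_w_z


def predict_next(p_z_d_row, p_w_z, exclude=()):
    """Rank items by sum_z P(z | session) * P(item | z); return the best one
    that is not in `exclude` (for example, items already loaded)."""
    n_items = len(p_w_z[0])
    scores = [sum(pz * p_w_z[z][w] for z, pz in enumerate(p_z_d_row))
              for w in range(n_items)]
    candidates = [w for w in range(n_items) if w not in exclude]
    return max(candidates, key=lambda w: scores[w])


# Invented toy data: 4 sessions, 6 data items, two apparent usage patterns.
toy_counts = [
    [3, 2, 1, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [0, 0, 0, 1, 2, 3],
    [0, 0, 0, 3, 1, 2],
]
p_z_d, p_w_z = plsa_da(toy_counts, n_contexts=2)
print("next item to pre-fetch for session 0:",
      predict_next(p_z_d[0], p_w_z, exclude={0}))
```

In this sketch the inferred context enters the prediction only through the mixture weights P(context | session); a real pre-fetcher would more likely run a dedicated access predictor within each discovered context, as the abstract describes.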
INDEX TERMS
Context, Prediction algorithms, Clustering algorithms, Accuracy, Data mining, Data visualization, Algorithm design and analysis, hidden context mining, prefetch
CITATION
Jong Youl Choi, Hasan Abbasi, David Pugmire, Norbert Podhorszki, Scott Klasky, Cristian Capdevila, Manish Parashar, Matthew Wolf, Judy Qiu, Geoffrey Fox, "Mining hidden mixture context with ADIOS-P to improve predictive pre-fetcher accuracy", 2012 IEEE 8th International Conference on E-Science, pp. 1-8, 2012, doi:10.1109/eScience.2012.6404418