Automatic ARIMA Time Series Modeling for Adaptive I/O Prefetching
April 2004 (vol. 15 no. 4)
pp. 362-377

Abstract: Inadequate I/O performance remains a major challenge in using high-end computing systems effectively. To address this problem, the paper presents TsModeler, an automatic time series modeling and prediction framework for adaptive I/O prefetching that uses ARIMA time series models to predict the temporal patterns of I/O requests. Its online pattern analysis techniques and cutoff indicators for autocorrelation patterns enable multistep online predictions suitable for multiblock prefetching. The work also combines time series predictions with spatial Markov model predictions to determine when to prefetch, which blocks to prefetch, and how many. Experimental results show reductions in execution time compared to the standard Linux file system across various hardware configurations.
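
As a rough illustration of how a temporal forecast and a spatial Markov prediction might be combined into a prefetch schedule, the sketch below may help. It is not TsModeler and makes several simplifying assumptions: the temporal model is a plain AR(1) fit by least squares (that is, ARIMA(1,0,0), with no differencing or moving-average terms and no automatic order selection), the spatial model is a first-order Markov chain over block numbers, and all function names (`fit_ar1`, `forecast_ar1`, `fit_markov`, `predict_blocks`, `prefetch_plan`) are hypothetical.

```python
from collections import Counter, defaultdict


def fit_ar1(series):
    """Least-squares fit of (x[t] - mean) = phi * (x[t-1] - mean)."""
    mean = sum(series) / len(series)
    centered = [x - mean for x in series]
    num = sum(a * b for a, b in zip(centered[1:], centered[:-1]))
    den = sum(a * a for a in centered[:-1])
    phi = num / den if den else 0.0  # constant series: no correlation to fit
    return mean, phi


def forecast_ar1(series, steps):
    """Multistep forecast: each step shrinks the last deviation by phi."""
    mean, phi = fit_ar1(series)
    deviation = series[-1] - mean
    forecasts = []
    for _ in range(steps):
        deviation *= phi
        forecasts.append(mean + deviation)
    return forecasts


def fit_markov(blocks):
    """Count observed block-to-block transitions (first-order Markov chain)."""
    transitions = defaultdict(Counter)
    for current, following in zip(blocks, blocks[1:]):
        transitions[current][following] += 1
    return transitions


def predict_blocks(transitions, start, steps):
    """Follow the most frequent transition from `start` for up to `steps` hops."""
    path, current = [], start
    for _ in range(steps):
        successors = transitions.get(current)
        if not successors:
            break  # block never seen as a source: stop predicting
        current = successors.most_common(1)[0][0]
        path.append(current)
    return path


def prefetch_plan(interarrival_times, blocks, steps):
    """Pair each predicted block with the predicted time offset of its request."""
    gaps = forecast_ar1(interarrival_times, steps)
    targets = predict_blocks(fit_markov(blocks), blocks[-1], steps)
    plan, elapsed = [], 0.0
    for gap, block in zip(gaps, targets):
        elapsed += max(gap, 0.0)  # a negative forecast gap is clamped to zero
        plan.append((elapsed, block))
    return plan
```

In this sketch, the time series forecast supplies the "when" (the predicted offsets of upcoming requests), the Markov chain supplies the "what" (the likely next blocks), and the `steps` argument stands in for the "how many". A real system would also have to choose the model order and the prefetch depth adaptively, which this example does not attempt.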

Index Terms:
Input/output, adaptive prefetching, access patterns, performance modeling and prediction, time series analysis, pattern analysis, least squares methods, wavelets.
Citation:
Nancy Tran, Daniel A. Reed, "Automatic ARIMA Time Series Modeling for Adaptive I/O Prefetching," IEEE Transactions on Parallel and Distributed Systems, vol. 15, no. 4, pp. 362-377, April 2004, doi:10.1109/TPDS.2004.1271185