ADMiRe: An Algebraic Data Mining Approach to System Performance Analysis
July 2005 (vol. 17 no. 7)
pp. 888-901
Performance analysis of computing systems is an increasingly difficult task due to growing system complexity. Traditional tools rely on ad hoc procedures. With these, determining which of the many system and workload parameters to examine is often a lengthy and highly speculative process. The analysis is often incomplete and is therefore prone to faulty conclusions and to missing useful tuning knowledge. We address this problem by introducing a data mining approach called ADMiRe (Analyzer for Data Mining Results). In this scheme, regression analysis is first applied to performance data to discover correlations between various system and workload parameters. The results of this analysis are summarized in sets of regression rules. The user can then formulate intuitive algebraic expressions that manipulate these sets of rules to capture critical information. To demonstrate this approach, we use ADMiRe to analyze an Oracle database system running the Transaction Processing Performance Council's TPC-C benchmark. The results generated by ADMiRe were confirmed by Oracle experts. We also show that by applying ADMiRe to Microsoft Internet Information Server performance data, we can improve system performance by 20 percent.
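The two-stage idea in the abstract (mine correlation rules from performance data, then combine rule sets algebraically) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a rule is simplified to a (parameter, parameter, direction) triple kept when the Pearson correlation exceeds a threshold, plain set difference and intersection stand in for ADMiRe's algebraic operators, and the parameter names and sample values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length samples (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def mine_rules(samples, threshold=0.9):
    """Return a set of (param_a, param_b, direction) rules.

    A pair of parameters yields a rule when |correlation| >= threshold.
    This is a simplified stand-in for the regression rules the abstract
    mentions, not the paper's actual rule format.
    """
    rules = set()
    for a, b in combinations(sorted(samples), 2):
        r = pearson(samples[a], samples[b])
        if abs(r) >= threshold:
            rules.add((a, b, "+" if r > 0 else "-"))
    return rules


# Hypothetical measurements from two runs of the same system.
baseline = {
    "cpu_util": [10, 20, 30, 40, 50],
    "throughput": [100, 200, 300, 400, 500],
    "disk_queue": [3, 1, 4, 1, 5],
}
loaded = {
    "cpu_util": [50, 60, 70, 80, 90],
    "throughput": [500, 510, 490, 505, 495],
    "disk_queue": [2, 4, 6, 8, 10],
}

baseline_rules = mine_rules(baseline)
loaded_rules = mine_rules(loaded)

# Set algebra over rule sets: what changed between the two runs?
appeared = loaded_rules - baseline_rules      # rules that hold only under load
disappeared = baseline_rules - loaded_rules   # rules that no longer hold
stable = baseline_rules & loaded_rules        # rules common to both runs

print("appeared:", appeared)
print("disappeared:", disappeared)
print("stable:", stable)
```

In this toy data, CPU utilization stops tracking throughput under load and starts tracking disk queue length, which is the kind of shift an analyst might flag for tuning. Treating rule sets as ordinary sets keeps the comparison expressions short and composable.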

Index Terms:
Data mining, performance of systems, algorithms for data and knowledge management.
Citation:
Ning Jiang, Roy Villafane, Kien A. Hua, Abhijit Sawant, Kiran Prabhakara, "ADMiRe: An Algebraic Data Mining Approach to System Performance Analysis," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 7, pp. 888-901, July 2005, doi:10.1109/TKDE.2005.103