Issue No. 07 - July (2005 vol. 17)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2005.103
Ning Jiang, IEEE
Roy Villafane, IEEE
Kien A. Hua, IEEE
Performance analysis of computing systems is an increasingly difficult task due to growing system complexity. Traditional tools rely on ad hoc procedures. With these, determining which of the many system and workload parameters to examine is often a lengthy and highly speculative process. The analysis is often incomplete and is therefore prone to producing faulty conclusions and to missing useful tuning knowledge. We address this problem by introducing a data mining approach called ADMiRe (Analyzer for Data Mining Results). In this scheme, regression analysis is first applied to performance data to discover correlations between various system and workload parameters. The results of this analysis are summarized in sets of regression rules. The user can then formulate intuitive algebraic expressions that manipulate these sets of rules to capture critical information. To demonstrate this approach, we use ADMiRe to analyze an Oracle database system running the Transaction Processing Performance Council's TPC-C benchmark. The results generated by ADMiRe were confirmed by Oracle experts. We also show that by applying ADMiRe to Microsoft Internet Information Server performance data, we can improve system performance by 20 percent.
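The two-stage idea in the abstract (mine regression rules, then combine rule sets algebraically) can be illustrated with a minimal Python sketch. It is not the ADMiRe implementation: the abstract does not specify the rule format or the algebraic operators, so the sketch assumes a rule is a simple linear fit between two parameters with a high coefficient of determination, and it stands in plain set intersection and difference for the algebra. The counter names, sample values, `fit_line`, `mine_rules`, and the `min_r2` threshold are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept; returns (slope, intercept, r2)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0, my, 0.0  # a constant column carries no correlation
    slope = sxy / sxx
    return slope, my - slope * mx, (sxy * sxy) / (sxx * syy)


def mine_rules(samples, min_r2=0.9):
    """Return {(x_name, y_name): (slope, intercept)} for strongly correlated pairs."""
    rules = {}
    for x_name, y_name in combinations(sorted(samples), 2):
        slope, intercept, r2 = fit_line(samples[x_name], samples[y_name])
        if r2 >= min_r2:  # assumed cutoff, not taken from the paper
            rules[(x_name, y_name)] = (slope, intercept)
    return rules


# Hypothetical performance counters sampled under two workloads.
light = {
    "users": [10, 20, 30, 40],
    "cpu_pct": [12, 22, 32, 42],
    "disk_queue": [3, 1, 4, 2],
}
heavy = {
    "users": [100, 200, 300, 400],
    "cpu_pct": [55, 70, 85, 100],
    "disk_queue": [5, 10, 15, 20],
}

light_rules = mine_rules(light)
heavy_rules = mine_rules(heavy)

# Set algebra over rule keys (an assumed stand-in for the paper's operators).
common = light_rules.keys() & heavy_rules.keys()      # hold under both workloads
only_heavy = heavy_rules.keys() - light_rules.keys()  # appear only under heavy load
```

In this toy data, the rules that appear only under the heavy workload involve `disk_queue`, which is the kind of difference an analyst might examine first when looking for a bottleneck.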
Index Terms: Data mining, performance of systems, algorithms for data and knowledge management.
R. Villafane, K. Prabhakara, A. Sawant, K. A. Hua and N. Jiang, "ADMiRe: An Algebraic Data Mining Approach to System Performance Analysis," in IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 7, pp. 888-901, 2005.