Issue No. 06 - June (2012 vol. 23)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2011.257
Zhen Jia, Chinese Academy of Sciences, Beijing
Lei Wang, Chinese Academy of Sciences, Beijing
Jianfeng Zhan, Chinese Academy of Sciences, Beijing
Bo Sang, Purdue University, West Lafayette
Haining Wang, College of William and Mary, Williamsburg
Zhihong Zhang, Chinese Academy of Sciences, Beijing
Gang Lu, Chinese Academy of Sciences, Beijing
Dongyan Xu, Purdue University, West Lafayette
As more and more multitier services are built from commercial off-the-shelf components or heterogeneous middleware whose source code is unavailable, both developers and administrators need a request tracing tool to 1) know exactly how a user request of interest travels through black-box services and 2) obtain macrolevel user request behaviors of services without manually analyzing massive logs. This need is heightened by IT system “agility,” which requires the tracing tool to provide online performance data, since offline approaches cannot reflect system changes in real time. Moreover, given the large scale of deployed services, a pragmatic tracing approach should be scalable in terms of the cost of collecting and analyzing logs. In this paper, we introduce a precise, scalable, and online request tracing tool for multitier services of black boxes. Our contributions are threefold. First, we propose a precise request tracing algorithm for multitier services of black boxes, which uses only application-independent knowledge. Second, we present a microlevel abstraction, the component activity graph, to represent the causal paths of each request. On the basis of this abstraction, we use dominated causal path patterns to represent repeatedly executed causal paths that account for significant fractions of requests, and we further present a derived performance metric of causal path patterns, latency percentages of components, to enable debugging performance-in-the-large. Third, we develop two mechanisms, tracing on demand and sampling, to significantly increase system scalability. We implement a prototype of the proposed system, called PreciseTracer, and release it as open source code. In comparison with WAP5, a black-box tracing approach, PreciseTracer achieves higher tracing accuracy and faster response time. Our experimental results also show that PreciseTracer has low overhead and still achieves high tracing accuracy even if an aggressive sampling policy is adopted, indicating that PreciseTracer is a promising tracing tool for large-scale production systems.
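To make the two macrolevel ideas in the abstract more concrete, the following is a minimal Python sketch of how frequently repeated causal path patterns and per-component latency percentages might be computed once causal paths have already been reconstructed. It is an illustration under stated assumptions, not PreciseTracer's actual algorithm or data model: the representation of a path as a list of (component, latency) pairs, the function names, the `min_fraction` threshold, and the component names and numbers in the sample data are all hypothetical.

```python
from collections import Counter, defaultdict

# Assumption: each traced request is a list of (component, latency_ms) hops
# in causal order. This is a simplified stand-in for a component activity
# graph, not the paper's data structure.


def path_pattern(path):
    """Reduce a causal path to its pattern: the ordered tuple of components."""
    return tuple(component for component, _ in path)


def frequent_patterns(paths, min_fraction):
    """Return {pattern: fraction of requests} for patterns at or above min_fraction."""
    counts = Counter(path_pattern(p) for p in paths)
    total = len(paths)
    return {
        pattern: n / total
        for pattern, n in counts.items()
        if total and n / total >= min_fraction
    }


def latency_percentages(paths, pattern):
    """Share of total latency (in percent) spent in each component of one pattern."""
    totals = defaultdict(float)
    for p in paths:
        if path_pattern(p) == pattern:
            for component, latency in p:
                totals[component] += latency
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {c: 100.0 * t / grand_total for c, t in totals.items()}


# Hypothetical sample data: three requests traverse all tiers, one stops early.
sample_paths = [
    [("web", 2.0), ("app", 6.0), ("db", 12.0)],
    [("web", 2.0), ("app", 4.0), ("db", 8.0)],
    [("web", 1.0), ("app", 5.0), ("db", 10.0)],
    [("web", 3.0)],
]

patterns = frequent_patterns(sample_paths, min_fraction=0.5)
shares = latency_percentages(sample_paths, ("web", "app", "db"))
```

With this sample data, only the three-tier pattern clears the 0.5 threshold, and the database tier accounts for the largest share of its latency. That is the kind of observation a latency-percentage metric is meant to surface when debugging performance across many requests.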
Multitier service, black boxes, precise request tracing, micro- and macrolevel abstractions, online analysis, performance debugging, scalability.
Zhen Jia, Lei Wang, Jianfeng Zhan, Bo Sang, Haining Wang, Zhihong Zhang, Gang Lu, Dongyan Xu, "Precise, Scalable, and Online Request Tracing for Multitier Services of Black Boxes", IEEE Transactions on Parallel & Distributed Systems, vol. 23, no. 6, pp. 1159-1167, June 2012, doi:10.1109/TPDS.2011.257