Issue No.12 - Dec. (2012 vol.23)
pp: 2315-2329
Shicong Meng , Georgia Institute of Technology, Atlanta
Chitra Venkatramani , IBM T.J. Watson Research Center, Hawthorne
Ling Liu , Georgia Institute of Technology, Atlanta
ABSTRACT
The increasing popularity of large-scale distributed applications in datacenters has led to a growing demand for distributed application state monitoring. These monitoring tasks often involve collecting the values of various status attributes from a large number of nodes. One challenge in such large-scale application state monitoring is organizing nodes into a monitoring overlay that is both scalable and cost effective. In this paper, we present REMO, a REsource-aware application state MOnitoring system, to address the challenge of monitoring overlay construction. REMO distinguishes itself from existing work in several key aspects. First, it jointly considers intertask cost-sharing opportunities and node-level resource constraints. Furthermore, it explicitly models the per-message processing overhead, which can be substantial but is often ignored in previous work. Second, REMO produces a forest of optimized monitoring trees by iterating over two phases. One phase explores cost-sharing opportunities between tasks, and the other refines the trees with resource-sensitive construction schemes. Finally, REMO employs an adaptive algorithm that balances the benefits and costs of overlay adaptation, which is particularly useful for large systems with constantly changing monitoring tasks. Moreover, we enhance REMO in terms of both performance and applicability with a series of optimization and extension techniques. We perform extensive experiments, including deploying REMO on a BlueGene/P rack running System S, IBM's large-scale distributed streaming system. When REMO collects over 200 monitoring tasks for an application deployed across 200 nodes, the percentage error of collected attributes decreases by 35-45 percent compared to existing schemes.
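To make the role of per-message overhead and intertask cost sharing concrete, the following minimal Python sketch compares the cost a single node pays when it reports each task in its own message against the cost when tasks share one message. It is an illustration only and not REMO's algorithm: the linear cost model (a fixed overhead per message plus a cost per attribute value), the constants, the capacity budget, and all function names are assumptions introduced here.

```python
# Hypothetical linear cost model (an assumption, not taken from the abstract):
# every message pays a fixed overhead plus a cost per attribute value carried.
PER_MESSAGE_OVERHEAD = 10.0  # assumed fixed cost of handling one message
PER_VALUE_COST = 1.0         # assumed cost of each attribute value in a message


def message_cost(num_values: int) -> float:
    """Cost of one message carrying num_values attribute values."""
    return PER_MESSAGE_OVERHEAD + PER_VALUE_COST * num_values


def node_cost_separate(tasks: dict[str, set[str]]) -> float:
    """Each task uses its own tree, so the node sends one message per task."""
    return sum(message_cost(len(attrs)) for attrs in tasks.values())


def node_cost_shared(tasks: dict[str, set[str]]) -> float:
    """Tasks share one tree: one message, each distinct attribute sent once."""
    distinct = set().union(*tasks.values())
    return message_cost(len(distinct))


def fits_budget(cost: float, capacity: float) -> bool:
    """Node-level resource constraint: monitoring cost must not exceed capacity."""
    return cost <= capacity


# Three illustrative tasks that overlap on the attributes they collect.
tasks = {"t1": {"cpu", "mem"}, "t2": {"cpu", "queue_len"}, "t3": {"mem"}}
separate = node_cost_separate(tasks)
shared = node_cost_shared(tasks)
print(f"separate trees: {separate}, shared tree: {shared}")
```

Under these assumed constants, sharing removes two of the three per-message overheads and the duplicate attribute values. The sketch ignores the cost of relaying values for child nodes, which is why merging tasks into fewer trees is not always beneficial and why an overlay planner has to weigh sharing against each node's resource limit.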
INDEX TERMS
Monitoring, Resource management, Vegetation, Database systems, Optimization, Scalability, data-intensive, Resource-aware, state monitoring, distributed monitoring, datacenter monitoring, adaptation
CITATION
Shicong Meng, Srinivas Raghav Kashyap, Chitra Venkatramani, Ling Liu, "Resource-Aware Application State Monitoring", IEEE Transactions on Parallel & Distributed Systems, vol.23, no. 12, pp. 2315-2329, Dec. 2012, doi:10.1109/TPDS.2012.82