Distributed Garbage Collection Algorithms for Timestamped Data
October 2006 (vol. 17 no. 10)
pp. 1057-1071

Abstract: An important class of interactive multimedia applications deals with stream data from distributed sources. Indexing the data temporally facilitates ordering individual streams as well as correlating items from different streams. The Stampede programming system organizes stream data into channels, which are distributed, synchronized data structures containing timestamped items. A Stampede program is a data flow graph of threads and channels. Stampede semantics for channels allow concurrent access from multiple threads for input and output. Although a channel holds timestamped items, the semantics place no restriction on either the production or the consumption order of these items. Furthermore, timestamps of items in a channel need not be contiguous. This flexibility is required by the dynamic and parallel structure of the stream-oriented applications targeted by the Stampede system. Under such circumstances, a key issue is the "garbage collection" (GC) of channel items. In this paper, we present and compare three different GC algorithms: 1) REF is a simple algorithm that keeps a reference count on individual items; 2) TGC is a distributed algorithm that computes a global low watermark for timestamp values of interest in the entire application; 3) DGC is another distributed algorithm that uses information about the dependencies between the producers and consumers of data streams to compute a low watermark local to each node of the data flow graph. DGC can simultaneously eliminate garbage from channels and unneeded computations from threads. In tests performed using an interactive application, DGC achieves a nearly 30 percent reduction in the application memory footprint compared to TGC and REF. DGC and REF are also shown to be more scalable than TGC.
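
As a rough illustration of the low-watermark idea that the abstract attributes to TGC, the following Python sketch drops channel items whose timestamps fall below the smallest timestamp any thread still cares about. It is not the Stampede API and not the paper's algorithm: the names `Channel`, `collect_below`, and `global_low_watermark` are hypothetical, everything runs in a single process rather than across nodes, and it assumes each thread can report the lowest timestamp it may still request.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Toy channel mapping timestamps to items (hypothetical, not Stampede)."""
    items: dict = field(default_factory=dict)

    def put(self, timestamp, value):
        # Items may arrive in any order, and timestamps may have gaps.
        self.items[timestamp] = value

    def collect_below(self, watermark):
        """Delete items with timestamp < watermark; return the dropped timestamps."""
        dead = sorted(ts for ts in self.items if ts < watermark)
        for ts in dead:
            del self.items[ts]
        return dead


def global_low_watermark(thread_interest):
    """Smallest timestamp still of interest to any thread.

    thread_interest is an assumed input: thread name -> lowest timestamp
    that thread may still request.
    """
    return min(thread_interest.values())


ch = Channel()
for ts in (9, 1, 5, 2):  # out-of-order, non-contiguous timestamps
    ch.put(ts, f"frame-{ts}")

wm = global_low_watermark({"tracker": 5, "display": 3})
dead = ch.collect_below(wm)
print(wm, dead, sorted(ch.items))
```

A per-node variant in the spirit of DGC would compute a separate watermark for each channel from its own consumers only, instead of one value for the whole application; the sketch above does not attempt that.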

Index Terms:
Garbage collection, distributed programming, logical timestamps, virtual time, soft real-time systems, performance evaluation, cluster computing, multimedia systems, ubiquitous computing.
Citation:
Umakishore Ramachandran, Kathleen Knobe, Nissim Harel, Hasnain A. Mandviwala, "Distributed Garbage Collection Algorithms for Timestamped Data," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 10, pp. 1057-1071, Oct. 2006, doi:10.1109/TPDS.2006.138