Issue No. 10 - October (2006 vol. 17)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TPDS.2006.138
Umakishore Ramachandran, IEEE
<p><b>Abstract</b>—There is an important class of interactive multimedia applications that deals with stream data from distributed sources. Indexing the data temporally facilitates ordering individual streams as well as correlating items from different streams. The <it>Stampede</it> programming system organizes stream data into <it>channels</it>, which are distributed and synchronized data structures containing timestamped items. A Stampede program is a data flow graph of threads and channels. Stampede semantics for channels allow concurrent access from multiple threads for input and output. While a channel holds timestamped items, the semantics place no restriction on either the production or the consumption order of these items. Furthermore, timestamps of items in a channel need not be contiguous. This flexibility is required by the dynamic and parallel structure of the stream-oriented applications targeted by the Stampede system. Under such circumstances, a key issue is the "garbage collection" (GC) of channel items. In this paper, we present and compare three different GC algorithms: 1) REF is a simple algorithm that keeps a reference count on individual items; 2) TGC is a distributed algorithm for computing a <it>global</it> low watermark for timestamp values of interest in the entire application; 3) DGC is another distributed algorithm that uses information about the dependencies between the producers and consumers of data streams to compute a low watermark <it>local</it> to each node of the data flow graph. DGC can simultaneously eliminate garbage from channels and unneeded computations from threads. In tests performed using an interactive application, DGC achieves a nearly 30 percent reduction in the application memory footprint compared to TGC and REF. DGC and REF are also shown to be more scalable than TGC.</p>
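To make the contrast between per-item reference counting and watermark-based collection concrete, the following is a minimal single-process sketch in Python. It is not the Stampede API or the paper's implementation: the names `Channel`, `consume`, `collect_below`, and `global_low_watermark` are hypothetical, the number of consumers per channel is assumed to be fixed and known in advance, and the distributed computation of the watermark as well as the dependency analysis used by DGC are not modeled.

```python
class Channel:
    """Toy channel: timestamp -> item. Timestamps may be sparse and arrive in any order."""

    def __init__(self, num_consumers):
        # Assumption: every item is consumed by a fixed, known number of threads.
        self.num_consumers = num_consumers
        self.items = {}  # timestamp -> payload
        self.refs = {}   # timestamp -> consumers that have not yet finished with the item

    def put(self, ts, payload):
        self.items[ts] = payload
        self.refs[ts] = self.num_consumers

    def consume(self, ts):
        """REF-style collection: free the item once its last consumer is done with it."""
        payload = self.items[ts]
        self.refs[ts] -= 1
        if self.refs[ts] == 0:
            del self.items[ts]
            del self.refs[ts]
        return payload

    def collect_below(self, low_watermark):
        """Watermark-style collection: drop every item older than the watermark."""
        dead = sorted(ts for ts in self.items if ts < low_watermark)
        for ts in dead:
            del self.items[ts]
            del self.refs[ts]
        return dead


def global_low_watermark(thread_times):
    """Smallest timestamp any thread may still request (computed locally in this sketch)."""
    return min(thread_times.values())


ch = Channel(num_consumers=2)
for ts in (5, 1, 9):  # out-of-order, non-contiguous timestamps
    ch.put(ts, f"frame-{ts}")

ch.consume(1)
ch.consume(1)         # second consumer finishes: item 1 is freed by reference count

wm = global_low_watermark({"tracker": 7, "display": 9})
freed = ch.collect_below(wm)  # item 5 is older than every thread's time of interest
```

In this sketch the reference count frees an item only after each consumer has explicitly touched it, whereas the watermark can reclaim items that no thread will ever request, such as a skipped frame.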
Garbage collection, distributed programming, logical timestamps, virtual time, soft real-time systems, performance evaluation, cluster computing, multimedia systems, ubiquitous computing.
Hasnain A. Mandviwala, Nissim Harel, Kathleen Knobe, Umakishore Ramachandran, "Distributed Garbage Collection Algorithms for Timestamped Data", IEEE Transactions on Parallel & Distributed Systems, vol. 17, no. 10, pp. 1057-1071, October 2006, doi:10.1109/TPDS.2006.138