Reliable Distributed Systems, IEEE Symposium on (2009)
Niagara Falls, New York
Sept. 27, 2009 to Sept. 30, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/SRDS.2009.37
Event Stream Processing (ESP) systems are widely used in monitoring applications. Algorithmic trading, network monitoring, and sensor networks are good examples of applications that rely on ESP systems. As these systems become larger and more widely deployed, they must meet increasingly stringent requirements that are often difficult to satisfy. Fault tolerance is a good example of such a non-trivial requirement: making ESP operators fault-tolerant can add considerable performance overhead to the application. In this paper, we focus on active replication as an approach to providing fault tolerance for ESP operators. More precisely, we address the performance costs of active replication for operators in distributed ESP applications. We use a speculation mechanism based on Software Transactional Memory (STM) to achieve the following goals: (i) enable replicas to make progress using optimistic delivery; (ii) enable early forwarding of speculative computation results; and (iii) enable active replication of multi-threaded operators using transactional executions. Experimental evaluation shows that, using this combination of mechanisms, one can implement highly efficient fault-tolerant ESP operators.
active replication, event processing, fault-tolerance, distributed systems, speculation, parallel computing
Andrey Brito, Christof Fetzer, Pascal Felber, "Multithreading-Enabled Active Replication for Event Stream Processing Operators", Reliable Distributed Systems, IEEE Symposium on, pp. 22-31, 2009, doi:10.1109/SRDS.2009.37