June 25, 2007 to June 27, 2007
Alex Delis, University of Athens, Greece
Vassil Kriakov, Polytechnic University, Brooklyn, NY 11201
The emergence of applications producing continuous high-frequency data streams has brought forth a large body of research in the area of distributed stream processing. In the presence of high volumes of data, efforts have primarily concentrated on providing approximate aggregate or top-k results. Scalable solutions for answering window join queries in distributed stream processing systems have received limited attention to date. We provide a solution for the window join in a distributed stream processing system that reduces inter-node communication through automatic throughput handling based on resource availability. Our approach is based on incrementally updated discrete Fourier transforms (DFTs). Furthermore, we provide formulae for computing DFT compression factors in order to achieve information reduction. We perform WAN-based prototype experiments to ascertain the viability and establish the effectiveness of our method. Our experimental results reveal that our method scales in terms of throughput and error rates, achieving sub-linear message complexity in domains that exhibit a geographic skew in the joining attributes.
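The idea of an incrementally updated DFT can be illustrated with the standard sliding-DFT recurrence. The sketch below is not the authors' implementation. It assumes a fixed-size window of numeric values, and the names `dft`, `SlidingDFT`, and `num_coeffs` are hypothetical. Retaining only the first `num_coeffs` coefficients stands in for compression; the compression-factor formulae from the paper are not reproduced here.

```python
import cmath
from collections import deque


def dft(window, num_coeffs):
    """Directly compute the first num_coeffs DFT coefficients of window."""
    n = len(window)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(window))
        for k in range(num_coeffs)
    ]


class SlidingDFT:
    """Keeps the first num_coeffs DFT coefficients of a sliding window."""

    def __init__(self, initial_window, num_coeffs):
        self.window = deque(initial_window)
        self.n = len(self.window)
        # One full DFT at start-up; later updates cost O(num_coeffs) per value.
        self.coeffs = dft(list(self.window), num_coeffs)
        # Per-coefficient phase factor exp(2*pi*i*k/n) applied on every slide.
        self.twiddles = [
            cmath.exp(2j * cmath.pi * k / self.n) for k in range(num_coeffs)
        ]

    def push(self, x_new):
        """Slide the window by one value and update each coefficient."""
        x_old = self.window.popleft()
        self.window.append(x_new)
        delta = x_new - x_old
        # Sliding-DFT recurrence: X_k' = (X_k - x_old + x_new) * exp(2*pi*i*k/n)
        self.coeffs = [(c + delta) * w for c, w in zip(self.coeffs, self.twiddles)]
        return self.coeffs
```

In a distributed setting, a node could ship such a truncated coefficient vector to its peers instead of the raw window contents, which is one way a reduction in inter-node communication might be obtained; how the paper actually does this is not stated in the abstract.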
Alex Delis, Vassil Kriakov, "Approximate Data Stream Joins in Distributed Systems", 27th International Conference on Distributed Computing Systems (ICDCS '07), 2007, pp. 5, doi:10.1109/ICDCS.2007.104