Range Aggregation With Set Selection
Issue No.05 - May (2014 vol.26)
pp: 1
Yufei Tao, Chinese University of Hong Kong, Hong Kong
Cheng Sheng, Google, Zürich, Switzerland
Chin-Wan Chung, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
Jong-Ryul Lee, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea
ABSTRACT
In the classic range aggregation problem, we have a set $S$ of objects such that, given an interval $I$, a query counts how many objects of $S$ are covered by $I$. Besides COUNT, the problem can also be defined with other aggregate functions, e.g., SUM, MIN, MAX, and AVERAGE. This paper studies a novel variant of range aggregation, where an object can belong to multiple sets. A query (at runtime) picks any two sets and aggregates on their intersection. More formally, let $S_{1},\ldots,S_{m}$ be $m$ sets of objects. Given distinct set ids $i$, $j$ and an interval $I$, a query reports how many objects in $S_{i}\cap S_{j}$ are covered by $I$. We call this problem range aggregation with set selection (RASS). Its hardness lies in the fact that the pair $(i, j)$ can take $\binom{m}{2}$ values, rendering effective indexing a non-trivial task. The RASS problem can also be defined with other aggregate functions, and generalized so that a query chooses more than two sets. We develop a system called RASS to support this type of query. The system is efficient in both theory and practice. Theoretically, it consumes linear space and achieves nearly optimal query time. Practically, it outperforms existing solutions on real datasets by up to an order of magnitude. The paper also features a rigorous theoretical analysis of the hardness of the RASS problem, which sheds light on its characteristics.
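To make the query semantics concrete, the following is a minimal brute-force sketch of a RASS COUNT query. It is not the index developed in the paper, and it offers none of the stated space or query-time guarantees. The class name `NaiveRASS`, the dictionary input format, and the assumption that each object is identified by a distinct one-dimensional key are all hypothetical choices made for illustration.

```python
from bisect import bisect_left, bisect_right


class NaiveRASS:
    """Brute-force baseline for RASS COUNT queries (illustrative only)."""

    def __init__(self, sets):
        # sets: dict mapping a set id to an iterable of 1-D object keys.
        # Assumption: an object is identified by its key, so two sets share
        # an object exactly when they share a key.
        self.members = {sid: set(keys) for sid, keys in sets.items()}
        self.sorted_keys = {sid: sorted(m) for sid, m in self.members.items()}

    def _range(self, sid, lo, hi):
        # Positions of the keys of set `sid` that fall in [lo, hi].
        keys = self.sorted_keys[sid]
        return bisect_left(keys, lo), bisect_right(keys, hi)

    def count(self, i, j, lo, hi):
        """Count objects of S_i ∩ S_j whose key lies in the closed interval [lo, hi]."""
        if i == j:
            raise ValueError("set ids must be distinct")
        si, ei = self._range(i, lo, hi)
        sj, ej = self._range(j, lo, hi)
        # Scan whichever set has fewer keys inside the interval,
        # and probe the other set by hashing.
        if ei - si > ej - sj:
            i, j, si, ei = j, i, sj, ej
        other = self.members[j]
        return sum(1 for k in self.sorted_keys[i][si:ei] if k in other)


index = NaiveRASS({1: [1, 3, 5, 7, 9], 2: [3, 4, 5, 10], 3: [5, 9]})
print(index.count(1, 2, 2, 6))  # keys 3 and 5 are in both sets and in [2, 6] -> 2
```

The query cost of this baseline grows with the number of objects of the smaller set inside the interval, which is the kind of dependence a dedicated index would aim to avoid without precomputing an answer structure for each of the $\binom{m}{2}$ set pairs.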
INDEX TERMS
Silicon, Aggregates, Arrays, Facebook, Aging, Indexing, Theory, Range Aggregation, Index
CITATION
Yufei Tao, Cheng Sheng, Chin-Wan Chung, Jong-Ryul Lee, "Range Aggregation With Set Selection", IEEE Transactions on Knowledge & Data Engineering, vol.26, no. 5, pp. 1, May 2014, doi:10.1109/TKDE.2013.125