2010 IEEE 26th International Conference on Data Engineering (ICDE 2010) (2010)
Long Beach, CA, USA
Mar. 1, 2010 to Mar. 6, 2010
Marcos Antonio Vaz Salles , Cornell University, USA
Jens Dittrich , Saarland University, Germany
Lukas Blunschi , ETH Zurich, Switzerland
Dataspace applications necessitate the creation of associations among data items over time. For example, once information about people is extracted from sources on the Web, associations among them may emerge as a consequence of different criteria, such as their city of origin or their elected hobbies. In this paper, we advocate a declarative approach to specifying these associations. We propose that each set of associations be defined by an association trail. An association trail is a query-based definition of how items are connected by intensional (i.e., virtual) association edges to other items in the dataspace. We study the problem of processing neighborhood queries over such intensional association graphs. The naive approach to neighborhood query processing over intensional graphs is to materialize the whole graph and then apply previous work on dataspace graph indexing to answer queries. We present in this paper a novel indexing technique, the grouping-compressed index (GCI), that has better worst-case indexing cost than the naive approach. In our experiments, GCI is shown to provide an order of magnitude gain in indexing cost over the naive approach, while remaining competitive in query processing time.
J. Dittrich, L. Blunschi and M. A. Vaz Salles, "Intensional associations in dataspaces," 2010 IEEE 26th International Conference on Data Engineering (ICDE 2010)(ICDE), Long Beach, CA, USA, 2010, pp. 984-987.