Query Languages for Sequence Databases: Termination and Complexity
May/June 2001 (vol. 13 no. 3)
pp. 519-525

Abstract—This paper develops a query language for sequence databases, such as genome databases and text databases. Unlike relational data, queries over sequential data can easily produce infinite answer sets since the universe of sequences is infinite, even for a finite alphabet. The challenge is to develop query languages that are both highly expressive and finite. This paper develops such a language as a subset of a logic for string databases called Sequence Datalog. The main idea is to use safe recursion to control and limit unsafe recursion. The main results are the definition of a finite form of recursion, called domain-bounded recursion, and a characterization of its complexity and expressive power. Although finite, the resulting class of programs is highly expressive since its data complexity is complete for the elementary functions.
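
The termination problem described in the abstract can be illustrated with a toy sketch. The Python below is not Sequence Datalog and not the paper's definition of domain-bounded recursion; the names `naive_fixpoint`, `double`, `tail`, and the `keep` length cap are hypothetical, chosen only to show why a rule that constructs longer sequences never reaches a fixpoint, while a rule that only extracts subsequences, or a constructive rule kept inside a finite domain, does.

```python
from typing import Callable, Optional, Set, Tuple


def naive_fixpoint(
    facts: Set[str],
    step: Callable[[str], str],
    max_rounds: int,
    keep: Optional[Callable[[str], bool]] = None,
) -> Tuple[Set[str], bool]:
    """Bottom-up evaluation of one recursive rule of the form p(step(X)) :- p(X).

    Returns the derived facts and a flag that is True when a fixpoint
    was reached within max_rounds.
    """
    derived = set(facts)
    for _ in range(max_rounds):
        new = {step(x) for x in derived}
        if keep is not None:
            # Assumption: a simple filter standing in for "stay inside a finite domain".
            new = {s for s in new if keep(s)}
        if new <= derived:
            return derived, True  # nothing new was derived: fixpoint
        derived |= new
    return derived, False  # gave up: the answer set was still growing


def double(x: str) -> str:
    """Constructive step: builds a sequence longer than its input."""
    return x + x


def tail(x: str) -> str:
    """Structural step: only extracts a suffix of its input."""
    return x[1:]


# Unrestricted construction keeps producing new, longer sequences.
unsafe, unsafe_done = naive_fixpoint({"ac"}, double, max_rounds=5)

# Extraction alone terminates: a sequence has finitely many suffixes.
safe, safe_done = naive_fixpoint({"acgt"}, tail, max_rounds=10)

# Construction under a length cap (an illustrative bound, not the paper's) terminates.
bounded, bounded_done = naive_fixpoint(
    {"ac"}, double, max_rounds=10, keep=lambda s: len(s) <= 8
)
```

In this sketch the cap is fixed by hand. The abstract's approach instead uses safe recursion to control and limit the unsafe recursion, which this toy example does not attempt to reproduce.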

Index Terms:
Sequence databases, deductive databases, query languages, Datalog, termination, complexity.
Citation:
Giansalvatore Mecca, Anthony J. Bonner, "Query Languages for Sequence Databases: Termination and Complexity," IEEE Transactions on Knowledge and Data Engineering, vol. 13, no. 3, pp. 519-525, May-June 2001, doi:10.1109/69.929906