Database Management with Sequence Trees and Tokens
January-February 1997 (vol. 9 no. 1)
pp. 186-192

Abstract—An approach to organizing storage in database systems is presented that, under a wide range of conditions, saves both storage space and processing time. Text values in a database are replaced by short, fixed-length, rank-preserving numeric tokens. The actual values are stored once each in separate, nonredundant storage. Database operations that depend only on the relative magnitude of data values can be performed directly on the tokens. Tokenization is shown to improve database performance most when there are many ad hoc queries and a low volume of database insertions relative to other operations.
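
The idea of rank-preserving tokens can be illustrated with a minimal sketch. It is not the paper's sequence-tree structure: the class name `TokenDictionary` and its methods are hypothetical, and a plain sorted list stands in for the separate value storage. Each distinct value is stored once, and its token is simply its rank in sorted order, so comparing two tokens gives the same result as comparing the underlying strings.

```python
from bisect import bisect_left


class TokenDictionary:
    """Hypothetical helper: maps distinct text values to integer tokens
    whose numeric order matches the sort order of the values."""

    def __init__(self, values):
        # Each distinct value is stored exactly once, in sorted order.
        self.values = sorted(set(values))

    def encode(self, value):
        # The token is the value's rank (its index in the sorted list).
        i = bisect_left(self.values, value)
        if i == len(self.values) or self.values[i] != value:
            raise KeyError(value)
        return i

    def decode(self, token):
        # Look the original text back up only when it must be displayed.
        return self.values[token]


names = ["Wagner", "Goldstein", "Knuth", "Goldstein", "Date"]
dictionary = TokenDictionary(names)

# The table column holds small integers instead of repeated strings.
column = [dictionary.encode(n) for n in names]

# Sorting and range selection use only the tokens.
sorted_names = [dictionary.decode(t) for t in sorted(column)]
knuth = dictionary.encode("Knuth")
before_knuth = [dictionary.decode(t) for t in column if t < knuth]
```

In this simplified form, inserting a new value can change the rank of existing values and so force tokens to be reassigned, which is one way to see why a low insertion volume favors the technique. The sketch makes no attempt to handle insertions efficiently.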


Index Terms:
Abstract data types, database management, design, file organization, performance, tokenization.
Robert C. Goldstein, Christian Wagner, "Database Management with Sequence Trees and Tokens," IEEE Transactions on Knowledge and Data Engineering, vol. 9, no. 1, pp. 186-192, Jan.-Feb. 1997, doi:10.1109/69.567062