HYTREM-A Hybrid Text-Retrieval Machine for Large Databases
January 1990 (vol. 39 no. 1)
pp. 111-123

The design of a text-retrieval machine, called HYTREM (hybrid text-retrieval machine), for the support of large unformatted text databases is described. A signature file is used as an access method to reduce the amount of data that need to be searched directly. Accordingly, HYTREM consists of two major subsystems: a signature processor and a text processor. The signature processor is based on a word-parallel, bit-serial organization, which is faster, more efficient, and more flexible than the word-serial, bit-parallel organization proposed by S. R. Ahuja and C. S. Roberts (1980). The text processor, called ALTEP (associative linear text processor), is a linear array of logic cells capable of matching regular expressions at a much higher speed than previous designs. Since both the signature processor and ALTEP are highly parallel processors, a high-speed multiple-response resolver is provided to facilitate data transfer between the processors and the controllers over a single common bus. The design of a cost-effective mass-storage system and the performance and implementation issues for HYTREM are also discussed.
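
The two-stage idea in the abstract (a signature filter followed by a direct text search) can be illustrated in software. The following is a minimal sketch under stated assumptions, not the HYTREM hardware design: it uses superimposed coding, the signature width (64 bits), the number of bits set per word (3), the SHA-256-based hash, and all function names are hypothetical choices made for illustration only.

```python
import hashlib
import re

SIG_BITS = 64       # assumed signature width (illustrative only)
BITS_PER_WORD = 3   # assumed number of bit positions set per word


def word_bits(word):
    """Return a mask with up to BITS_PER_WORD bits set for one word."""
    mask = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        mask |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return mask


def signature(words):
    """Superimpose (bitwise OR) the word masks into one signature."""
    sig = 0
    for w in words:
        sig |= word_bits(w)
    return sig


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(docs, signatures, query_words):
    """Stage 1: keep documents whose signature covers the query signature.
    Stage 2: scan only those candidates to discard false drops."""
    q = signature(query_words)
    candidates = [i for i, s in enumerate(signatures) if s & q == q]
    wanted = {w.lower() for w in query_words}
    return [i for i in candidates if wanted <= set(tokenize(docs[i]))]


DOCS = [
    "Signature files reduce the data to be searched",
    "Regular expressions match text patterns",
    "Text search with a signature file",
]
SIGNATURES = [signature(tokenize(d)) for d in DOCS]

if __name__ == "__main__":
    print(search(DOCS, SIGNATURES, ["signature", "text"]))
```

In this sketch the signature test can admit false drops (documents whose signature covers the query bits by coincidence) but never misses a true match, which is why the second stage rescans only the surviving candidates. In HYTREM these two stages correspond to the signature processor and the text processor, respectively.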

[1] S. R. Ahuja and C. S. Roberts, "An associative/parallel processor for partial match retrieval using superimposed codes," in Proc. Seventh Annu. Symp. Comput. Architecture, May 1980, pp. 218-227.
[2] R. M. Bird, J. B. Newsbaum, and J. L. Trefftzs, "Text file inversion: An evaluation," in Proc. 4th Workshop Comput. Architecture Non-Numeric Processing, Syracuse, NY, Aug. 1978, pp. 42-50.
[3] H. Boral and D. J. DeWitt, "Database machines: An idea whose time has passed? A critique of the future of database machines," in Database Machines, M. Missikoff, Ed. Berlin, Germany: Springer-Verlag, 1983, pp. 166-187.
[4] J. C. Browne, A. G. Dale, C. Leung, and R. Jenevein, "A parallel multi-stage I/O architecture with self-managing disk cache for database management applications," in Database Machines Fourth Int. Workshop, H. Boral, Ed. Berlin, Germany: Springer-Verlag, 1985, pp. 331-346.
[5] T. C. Chen and H. Chang, "Magnetic bubble memory and logic," in Advances in Computers, Vol. 17, M. C. Yovits, Ed. New York: Academic, 1978, pp. 223-282.
[6] S. Christodoulakis and C. Faloutsos, "Design considerations for a message file server," IEEE Trans. Software Eng., vol. SE-10, no. 2, pp. 201-210, Mar. 1984.
[7] D. J. DeWitt et al., "Implementation techniques for main memory databases," in Proc. ACM SIGMOD (Boston, MA), June 18-21, 1984, pp. 1-8.
[8] C. Faloutsos, "Access methods for text," ACM Comput. Surveys, vol. 17, pp. 49-74, Mar. 1985.
[9] C. Faloutsos and S. Christodoulakis, "Signature files: An access method for documents and its analytical performance evaluation," ACM Trans. Office Inform. Syst., vol. 2, Oct. 1984.
[10] C. M. Gravina, "National Westminster Bank mass storage archiving," IBM Syst. J., vol. 17, no. 4, pp. 344-358, 1978.
[11] L. A. Hollaar et al., "Architecture and operation of a large, fulltext information-retrieval system," in Advanced Database Machine Architecture, D. K. Hsiao, Ed. Englewood Cliffs, NJ: Prentice-Hall, 1983, pp. 256-299.
[12] J. E. Hopcroft and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation. Reading, MA: Addison-Wesley, 1979.
[13] M. Jayasooriah and R. M. Colomb, "Attached index machine: A new form of processor and its application to partial match data retrieval," in Proc. 9th Australian Comput. Sci. Conf., Sydney, Jan. 1986, pp. 347-355.
[14] R. Katz, J. Ousterhout, D. Patterson, and M. Stonebraker, "A project on high performance I/O subsystems," Database Eng., vol. 11, no. 1, pp. 40-47, Mar. 1988.
[15] D. E. Knuth, The Art of Computer Programming, Vol. 3, Reading, MA: Addison-Wesley, 1973.
[16] D. L. Lee, "A distributed multiple-response resolver for value-ordered retrieval," in Proc. 12th Int. Symp. Comput. Architecture, Boston, MA, June 1985, pp. 258-265.
[17] D. L. Lee, "A word-parallel, bit-serial signature processor for superimposed coding," in Proc. 2nd Int. Conf. Data Eng., Los Angeles, CA, Feb. 1986, pp. 352-359.
[18] D. L. Lee, "ALTEP--A cellular processor for high-speed pattern matching," New Generation Computing, vol. 4, pp. 225-244, Sept. 1986.
[19] D. L. Lee, "A word-parallel, bit-serial signature processor and its implementation with magnetic bubbles," OSU-CISRC-5/87-TR14, Dep. Comput. Inform. Sci., Ohio State Univ., Columbus, OH, June 1987.
[20] D. L. Lee and F. H. Lochovsky, "Text retrieval machines," in Office Automation, D. C. Tsichritzis, Ed. New York: Springer-Verlag, 1985, pp. 339-375.
[21] D. L. Lee, "The design and evaluation of a text retrieval machine for large databases," Ph.D. dissertation, Dep. Comput. Sci., Univ. of Toronto, Toronto, Ont., Canada, Sept. 1985 (also published as Tech. Rep. CSRI-172, Comput. Syst. Res. Instit., Univ. of Toronto).
[22] G. Z. Qadah and K. B. Irani, "A database machine for very large relational databases," in Proc. Int. Conf. Parallel Processing, Bellaire, MI, Aug. 1983, pp. 307-314.
[23] C. V. Ramamoorthy, J. C. Turner, and B. W. Wah, "A design of a cellular associative memory for ordered retrieval," IEEE Trans. Comput., vol. C-27, no. 9, pp. 800-815, Sept. 1978.
[24] D. C. Roberts, "A specialized computer architecture for text retrieval," in Proc. 4th Workshop Comput. Architecture Non-Numeric Processing, Syracuse, NY, Aug. 1978, pp. 51-59.
[25] R. Sacks-Davis and K. Ramamohanarao, "A two level superimposed coding scheme for partial match retrieval," Inform. Syst., vol. 8, no. 4, pp. 273-280, 1983.
[26] A. J. Smith, "On the effectiveness of buffered and multiple arm disks," in Proc. 5th Annual Symp. Comput. Architecture, Palo Alto, CA, Apr. 1978, pp. 242-248.
[27] W. H. Stellhorn, "An inverted file processor for information retrieval," IEEE Trans. Comput., vol. C-26, pp. 1258-1267, Dec. 1977.
[28] D. M. Taub, "Arbitration and control acquisition in the proposed IEEE 896 Futurebus," IEEE Micro, vol. 4, no. 4, pp. 28-41, Aug. 1984.
[29] J. C. Wu and F. B. Humphrey, "Computer simulation of magnetic bubble logic devices," J. Appl. Phys., vol. 55, no. 6, pp. 2581-2583, Mar. 1984.

Index Terms:
HYTREM; hybrid text-retrieval machine; large databases; large unformatted text databases; signature file; signature processor; text processor; bit-parallel organization; ALTEP; associative linear text processor; logic cells; regular expressions; multiple-response resolver; data transfer; controllers; single common bus; mass-storage system; database management systems; special purpose computers.
D.L. Lee, F.H. Lochovsky, "HYTREM-A Hybrid Text-Retrieval Machine for Large Databases," IEEE Transactions on Computers, vol. 39, no. 1, pp. 111-123, Jan. 1990, doi:10.1109/12.46285