Issue No.04 - October-December (2009 vol.31)
pp: 6-25
Thomas Haigh , University of Wisconsin, Milwaukee
<p>Generalized report generation and file maintenance programs were widely used in the 1950s, standardized by the Share user group with 9PAC and Surge. By the 1960s the first recognizable DBMS systems, such as IMS and IDS, had evolved to address the challenges of disk drives and MIS projects. Finally, in the late 1960s Codasyl's Data Base Task Group formulated the DBMS concept itself.</p>
database management system (DBMS), data processing, Codasyl, Share, IMS, IDS, 9PAC, DBTG, Surge, report generation, and file maintenance
Thomas Haigh, "How Data Got its Base: Information Storage Software in the 1950s and 1960s", IEEE Annals of the History of Computing, vol.31, no. 4, pp. 6-25, October-December 2009, doi:10.1109/MAHC.2009.123
1. The development of the mainframe DBMS market is explored in M. Campbell-Kelly, From Airline Reservations to Sonic the Hedgehog: A History of the Software Industry, MIT Press, 2003, pp. 145–149, 184–191. A short history focusing on the role of public funding in the emergence of the relational model is found in National Research Council, Funding A Revolution: Government Support for Computing Research, chapt. 6, Nat'l Academy Press, 1999.
2. Many database textbooks include a few pages on the development of database theory along with their introductory definitions (for example, R. Elmasri and S.B. Navathe, Fundamentals of Database Management Systems, Benjamin/Cummings, 1989, does this well), but this can mean little when stripped of its historical context. The closest thing to a detailed history is a quarter-century-old technical primer by J.P. Fry and E.H. Sibley, "Evolution of Data-Base Management Systems," ACM Computing Surveys, vol. 8, no. 1, 1976, pp. 7–42, 19–29. On the technical side, detailed comparisons of early systems are given in C.J. Byrnes and D.B. Steig, "File Management Systems: A Current Summary," Datamation, vol. 15, no. 11, 1969, pp. 138–142; Codasyl Systems Committee, "Feature Analysis of Generalized Data Base Management Systems," ACM Press, 1971; L. Welke, "A Review of File Management Systems," Datamation, vol. 18, no. 10, 1972, pp. 52–54; and D.B. Steig, "File Management Systems Revisited," Datamation, vol. 18, no. 10, 1972, pp. 48–51.
3. T. Pinch and R. Swedberg eds., Living in a Material World, MIT Press, 2008.
4. T. Hughes, Networks of Power: Electrification in Western Society, 1880–1930, Johns Hopkins Univ. Press, 1983.
5. Administrative computing during this era is discussed in T. Haigh, "The Chromium-Plated Tabulator: Institutionalizing an Electronic Revolution, 1954–1958," IEEE Annals of the History of Computing, vol. 23, no. 4, 2001, pp. 75–104.
6. R.F. Osborn, "GE and UNIVAC: Harnessing the High-Speed Computer," Harvard Business Rev., vol. 32, no. 4, 1954, pp. 99–107.
7. A.D. Meacham and V.B. Thompson eds., Total Systems, American Data Processing, 1962, p. 153.
8. By the 1940s, most punched cards included 80 columns of data, each one of which coded a single number or letter. Information within each card was grouped into fields, each occupying a fixed width within each record card. Consider a factory using punched cards to process its payroll. Some fields needed only one column, for example, sex (M or F). Other fields, such as last name, might be assigned a dozen columns. Each record would be punched onto one, or in some cases several, of the cards in the deck. The complete deck representing all the factory workers was known as a file, by analogy with conventional paper records. Each record card within the file had to follow exactly the same layout of fields, and to process a particular job, the machine operators had to rewire each machine's control panel (such as sorter, collator, or tabulator) to reflect this specific field layout. Many jobs involved "merging" information from several files, for example, combining wage information from the master file of personnel cards with the attendance information punched onto a weekly punched card by an IBM time clock. See J.J. McCaffrey, "From Punched Cards to Personal Computers," John J. McCaffrey Memoirs, CBI 47, Charles Babbage Inst., Univ. of Minnesota, 10 June 1989.
9. IBM, "IBM 705 Generalized Sorting Program Sort 51 Bitsavers," 1956; Sorting_Pgm_1956.pdf . The logic behind a total of 17 runs is 15 sorting runs of half tapes plus final and initial runs of a full tape. The program was designed to store a maximum of four records in memory at once, which explains both its dismal performance and the relatively large maximum record size of up to 2,494 characters. A more complex program presented with a smaller maximum record size would have been able to sort short record sequences within core memory, reducing the number of sorting passes required through the tape. Sorting methods of this era are explored in E.H. Friend, "Sorting on Electronic Computer Systems," J. ACM, vol. 3, no. 3, 1956, pp. 134–168.
10. D.D. McCracken, H. Weiss, and T.-h. Lee, Programming Business Computers, John Wiley & Sons, 1959, pp. 178–204.
11. IBM, "IBM Electronic Data-Processing Machines Type 705 Preliminary Manual of Operations (22-6627-4) Bitsavers," 1957; Oper_Jun57.pdf . Other specialized control units hooked up to the same tape drives could transfer data between tapes and cards or print without involving the main computer.
12. For contemporary system documentation, see W.C. McGee, "Generalization: Key To Successful Electronic Data Processing," J. ACM, vol. 6, no. 1, 1959, pp. 1–23, and R.C. McGee and H. Tellier, "A Re-Evaluation of Generalization," Datamation, vol. 6, no. 4, 1960, pp. 25–29.
13. R.C. McGee, My Adventures with Dwarfs: A Personal History in Mainframe Computers, CBI, 2004.
14. "Share Reference Manual for the IBM 704," Share records, CBI 21, 1958.
15. Russell McGee is not to be confused with W.C. McGee who headed the scientific computing group at Hanford during the same period. I quote both in this article.
16. "Verbatim Transcript of the 9th Meeting of Share, October 3, 1957, Morning Session," Share records, CBI 21, 1957, p. 49.
17. This group's contributions included a version of the powerful Autocoder assembler. C.J. Bashe et al., IBM's Early Computers, MIT Press, 1986, pp. 345–347, 355–356.
18. Share is discussed, with particular reference to SOS, in A. Akera, "Voluntarism and the Fruits of Collaboration," Technology and Culture, vol. 42, no. 4, 2001, pp. 710–736.
19. Progress on implementation of the I/O package is discussed in J. King, "Progress Report on 709 Input-Output, SSD 017," Share records, 1955–86, NMAH 567, Archives Center, Nat'l Museum of Am. History (NMAH), Behring Center, Smithsonian Inst., 19 Aug. 1957. SOS eventually included a macro-based input system known as INTRAN and an output system known as OUTRAN as well as support for buffering and transmission of I/O in its memory resident monitor programs. IBM, "SOS Reference Manual (incl Distributions 1 to 5) Bitsavers," 1961; Reference_Manual_Jun61.pdf .
20. C. Mock, "The MockDonald Multiphase System for the 709, SSD 19," Share records, 1955–86, NMAH 567, 9 Sept. 1957.
21. C.W. Bachman, "Report of the Share Data Processing Committee, October 2, SSD 020," Share records, 1955–86, NMAH 567, 1957.
22. This yielded the DP glossary in C.W. Bachman, "Share Data Processing Committee: Selected Glossary, SSD 022," Share records, 1955–86, NMAH 567, 18 Nov. 1957. This glossary included the concept of a key as a unique identifier used to locate a record.
23. A revised version of this material was published in W.C. McGee, "Generalization: Key To Successful Electronic Data Processing," J. ACM, vol. 6, no. 1, 1959, pp. 1–23, which remains an excellent introduction to the practices of this era.
24. W. Orchard-Hays, "Letter to Bill Dobrusky, March 16, SSD 049," Share records, 1955–86, NMAH 567, 1959. In the letter, Orchard-Hays says, "The 704 Data Processing Sort Routine … is now known as SURGE." Surge has rather a low historical profile, and citations in surveys are generally to an undated document, "SURGE: A Data Processing Compiler for the IBM 704," issued by North Am. Aviation. This might be the same document archived as R. Paul and W. Davenport, "Untitled SURGE Manual for IBM 704," Share records, CBI 21, ~1959.
25. Early objectives of and specification for the 709/7090 Surge project are discussed in B. Went, "Minutes of 709-7090 SURGE Meeting, September 14–16, 1959 - Columbus Ohio, SSD 59," Share records, 1955–86, NMAH 567, 1959, and E. Austin, "Minutes of 709-90 SURGE Subcommittee Meeting, December 7–9," Share records, 1955–86, NMAH 567, 1959.
26. L.F. Longo, "SURGE: A Recoding of the COBOL Merchandise Control Agreement," Comm. ACM, vol. 5, no. 2, 1962, pp. 98–100.
27. C.W. Bachman, "Share Data Processing Committee Meeting, November 21–22, 1957, Midland, Michigan, SSD 023."
28. Share minutes report that Russell McGee made another progress report and also note his chairmanship of a newly formed "709 Report and File Maintenance Generator Subcommittee." C.W. Bachman, "Report of Share Data Processing Committee," Share, ed., Proc. 10th Meeting of Share, 1958, pp. 51–55.
29. R.C. McGee, "Letter to Jerry Koory, SSD 054," Share records, 1955–86, NMAH 567, 23 June 1959.
30. Specifications for the file-maintenance system are in R.C. McGee, "Preliminary Manual for the Generalized File Maintenance System, SSD 046," Share records, 1955–86, NMAH 567, 23 Dec. 1958. Their shared file structure is described in R.C. McGee, "The Structure of a Standard File, SSD 045," Share records, 1955–86, NMAH 567, 1958.
31. R.C. McGee, "Progress Report on Generalize Routines, SSD 040," Share records, 1955–86, NMAH 567, 7 Nov. 1958.
32. R.C. McGee, "Letter to Charles E Thoma, January 21, SSD 046," Share records, 1955–86, NMAH 567, 1959.
33. With conventional punched-card machines, as long as the same part of the card was always used to code for a customer number, it was easy to use sort and merge operations to create a card file in which each customer record was followed immediately by cards giving information on each order placed by that customer.
34. R.C. McGee, "The Structure of a Standard File, SSD 045," Share records, 1955–86, NMAH 567, 1958.
35. J.T. Horner, "709 Data Processing Package, SSD 035," Share records, 1955–86, NMAH 567, 1958.
36. C.W. Bachman, "INGEN Proposal," Charles W. Bachman Papers, CBI 125, May 1959 or 1960. Bachman clarifies INGEN's fate in C.W. Bachman, "Oral History Interview by Thomas Haigh," ACM Oral History Interviews collection, 25–26 Sept. 2004.
37. IBM, "IBM 7090 Programming Systems Share 7090 9PAC Part 1: Introduction and General Principles (J28-6166) Bitsavers," 1961; 9PAC_Part1_1961.pdf .
38. W.C. McGee, "Book Review: Installing Electronic Data Processing Systems," Computing News, vol. 5, no. 115, 15 Dec. 1957, pp. 12–14.
39. The discussion of RPG's capabilities is based on IBM, "Report Program Generator: IBM 1401 Card and Tape Systems, J24-0215-2 Bitsavers," 1965, and IBM, "Report Program Generator (On Disk) Specifications, IBM 1401, 1440, and 1460, C24-3261-1 Bitsavers," 1964. Because of its apparently mundane nature, RPG has received a lot less historical attention than its usage would justify. Discussion in the secondary literature seems limited to M. Campbell-Kelly and W. Aspray, Computer: A History of the Information Machine, Basic Books, 1996, p. 133, and Bashe et al.'s IBM's Early Computers (MIT Press, 1986, pp. 479–480).
40. On Mark IV, see J.A. Postley and H. Jackobsohn, "The Third Generation Computer Language: Parameters Do the Programming Job," Data Processing, vol. 11, Data Processing Management Assoc., 1966, pp. 408–415; R.L. Forman, Fulfilling the Computer's Promise: The History of Informatics, 1962–1982, Informatics General, 1984; and J.A. Postley, "Mark IV: Evolution of the Software Product, a Memoir," IEEE Annals of the History of Computing, vol. 20, no. 1, 1998, pp. 43–50.
41. E. Webster and N. Statland, "Instant Data Processing," Business Automation, vol. 7, no. 6, 1962, pp. 34–36, 38; N. Statland and J.R. Hillegass, "Random Access Storage Devices," Datamation, vol. 9, no. 12, 1963, pp. 34–45; Anonymous, "Disc File Applications: Reports Presented at the Nation's First Disc File Symposium," Am. Data Processing, 1964.
42. A.L. Norberg, Computers and Commerce: A Study of Technology and Management at Eckert-Mauchly Computer Company, Engineering Research Associates, and Remington Rand, 1946–1957, MIT Press, 2005.
43. W.L. Jerome and L. Hartford, "RAMAC at Work," Systems and Procedures, vol. 8, no. 4, 1957, pp. 30–38, and Anonymous, "New Accounting Concept Based on 'Assembly-line' Processing," Management and Business Automation, 1961. Use of the RAMAC is also discussed in R.H. Gregory, "Preparation for Logic—An Orderly Approach to Automation," Management Control Systems, D.G. Malcolm and A.J. Rowe eds., John Wiley & Sons, 1960, pp. 169–183, and Anonymous, "Programming An Information Explosion," Business Automation, vol. 14, no. 5, 1967, pp. 47–50.
44. On the programming of real-time systems in this period and its relationship to hardware features, see W.L. Frank, W.H. Gardener, and G.L. Stock, "Programming On-Line Systems. Part Two: A Study of Hardware Features," Datamation, vol. 9, no. 6, 1963, pp. 28–32.
45. Sabre is discussed in R.W. Parker, "The SABRE System," Datamation, vol. 11, no. 9, 1965, pp. 49–52, and D.G. Copeland, R.O. Mason, and J.L. McKenney, "SABRE: The Development of Information-Based Competence and Execution of Information-Based Competition," IEEE Annals of the History of Computing, vol. 17, no. 3, 1995, pp. 30–57.
46. E.W. Pugh, L.R. Johnson, and J.H. Palmer, IBM's 360 and Early 370 Systems, MIT Press, 1991.
47. A good idea of the techniques available to ordinary programmers faced with early random-access storage systems is provided in D.D. McCracken, H. Weiss, and T.H. Lee's Programming Business Computers, John Wiley, 1959, chapter 19. For its evolution, see R.H. Buegler, "Random Access File System Design," Datamation, vol. 9, no. 12, 1963, pp. 31–33, and R.G. Canning, "New Views on Mass Storage," EDP Analyzer, vol. 4, no. 2, 1966.
48. "Management and the Computer: A Wall Street Journal Study of the Management Men Responsible for their Companies' Purchases of Computer Equipments and Services," Wall Street J., 1969, Data Processing Management Assoc. records, CBI 88.
49. Anonymous, "EDP Salary Survey–1969," Business Automation, vol. 16, no. 6, 1969, pp. 48–59.
50. Developments in this area were analyzed in Canning's "New Views on Mass Storage" and R.G. Canning, "Data Management: File Organization," EDP Analyzer, vol. 5, no. 12, 1967.
51. IDS is often, and with some justification, called the first DBMS, but the initial version of IDS lacked some of the features in the Codasyl definition of a DBMS. Record definitions were punched directly onto cards in a special format rather than being defined and modified via a data-definition language. It did not provide an interface for ad-hoc querying, or support for online access, because it was created purely to support MIACS. It did not provide different views or subsets of the overall database to different users. Neither did it support multiple databases simultaneously.
52. An early description of IDS in the context of the integrated systems project is given in C.W. Bachman, "GEICS - General Electric Integrated Computer System," Charles W. Bachman papers, CBI 125, n.d.
53. T. Haigh, "Inventing Information Systems: The Systems Men and the Computer, 1950–1968," Business History Rev., vol. 75, no. 1, 2001, pp. 15–61.
54. C.W. Bachman, "Oral History Interview by Thomas Haigh," ACM Oral History Interviews collection, 25–26 Sept. 2004.
55. E.F. Codd, "A Relational Model of Data for Large Shared Data Banks," Comm. ACM, vol. 13, no. 6, 1970, pp. 377–387.
56. K.R. Blackman, "IMS Celebrates Thirty Years as an IBM Product," IBM Systems J., vol. 37, no. 4, 1998, pp. 596–603.
57. An excellent contemporary introduction to the first public version of IMS is given in J.W. Adams, "IMS - Information Management System/360," Proc. Share 31, vol. 2, Share, 1968, pp. 231–263.
58. R.L. Patrick, "Oral History Interview with Thomas Haigh," Computer History Museum, 2006.
59. Many sessions were devoted to GIS at Share 27 in 1966 as well as 28 and 29 in 1967. By 1969, it had at least some actual users; see J.F. Fry, "Recent User Applications with GIS," Proc. Share 33, Share, 1969, pp. 295–296. Its early capabilities are summarized in Codasyl Systems Committee, "Survey of Generalized Data Base Management Systems," sect. 4, 1969.
60. M. Campbell-Kelly's From Airline Reservations (MIT Press, 2003, p. 147) claims a 1968 date for Total's introduction, based on an analyst report. However, the Computer History Museum's online Software Histories Collection suggests that Total was developed during 1969 and formally released in Jan. 1970 with advertisements in Computerworld.
61. T. Haigh, "'A Veritable Bucket of Facts': Origins of the Data Base Management System," Proc. 2nd Conf. History and Heritage of Scientific Information Systems, M.E. Bowden and B. Rayward eds., Information Today Press, 2004, pp. 73–78.
62. SAGE is discussed in P. Edwards, The Closed World: Computers and the Politics of Discourse in Cold War America, MIT Press, 1996, and T.P. Hughes, Rescuing Prometheus, Pantheon Books, 1998.
63. Anonymous, "A Panel Discussion on Time-Sharing," Datamation, vol. 10, no. 11, 1964, pp. 38–44, and System Development Corp., "Preprint for Second Symposium on Computer-Centered Data Base Systems, Sponsored by SDC, ARPA, and ESD," Burroughs Corp. Records, CBI 90, 1 Sept. 1965.
64. C. Baum, The System Builders: The Story of SDC, System Development, 1981, pp. 116–121, and A.H. Vorhaus, "TDMS: A New Approach to Data Management," Systems & Procedures J., vol. 18, no. 4, 1967, pp. 32–35.
65. R.V. Head, "Data Base Symposium," Datamation, vol. 11, no. 11, 1965, p. 41.
66. The phrase "data base management system" was used at least once before the renaming of the DBTG to describe IBM's forthcoming GIS. J.H. Bryant and P. Semple, "GIS and File Management," Proc. ACM 21st Nat'l Conf., ACM Press, 1966, pp. 97–107. List processing seems in retrospect an odd choice; it was perhaps fashionable from its association with work in artificial intelligence.
67. Codasyl Data Description Language Committee, "Codasyl Data Description Language: J. Development," US Govt. Printing Office, 1974.
68. Anonymous, "Codasyl Data Management Report in Print," Data Base, vol. 1, no. 4, 1969, p. 4. The report was widely circulated, and large excerpts were published as Codasyl Data Base Task Group, "Data Base Task Group Report to the CODASYL Programming Language Committee," Data Base, vol. 2, no. 2, 1970, pp. 11–18.
69. Codasyl Data Base Task Group, "Codasyl Data Base Task Group: April 1971 Report," ACM, 1971.
70. For example, see T.W. Olle, "Recent CODASYL Reports on Data Base Management," Data Base Systems, R. Rustin ed., Prentice-Hall, 1972, pp. 175–184, and a series of other articles as well as participation in a number of discussion panels, such as V.C. Hare, Jr., "A Special Report on the SIGBDP Forum: 'The New Data Base Task Group Report,'" Data Base, vol. 3, no. 3, 1971, p. 1. Olle tells the story of this era in T.W. Olle, "Nineteen Sixties History of Data Base Management," Int'l Federation of Information Processing History of Computing and Education 2 (HCE2), vol. 215, J. Impagliazzo ed., Springer, 2006, pp. 67–75.
71. Codasyl Systems Committee, "Survey of Generalized Data Base Management Systems," 1969. This report was until recently hard to obtain outside the archival holdings of the Charles Babbage Inst. but has now been added along with other Codasyl materials to the ACM Digital Library.
72. Codasyl Systems Committee, "Feature Analysis of Generalized Data Base Management Systems," currently distributed by ACM, 1971.
73. V.C. Hare, Jr., "A Special Report on the SIGBDP Forum: 'The New Data Base Task Group Report,'" Data Base, vol. 3, no. 3, 1971, p. 1.
74. Guide/Share Data Base Requirements Project, "Resolution 69-001-00, November 4, 1969," Proc. Share 34, Share, 1970, pp. 697–712.
75. See material in the proceedings of Share 38 and 40 (March 1973). A good summary of the work of this group through 1972 is given in R.G. Canning, "The Debate on Data Base Management," EDP Analyzer, vol. 10, no. 3, 1972, pp. 1–16. I am not sure whether the group eventually produced actual specifications, but its work does not appear to have had much influence on product developments.
76. The Codasyl Data Description Language Committee's "Codasyl Data Description Language" dates this decision to Codasyl's 10th Anniversary Meeting. It is reflected in the DBTG report issued later that year.
77. The committee's work on the DML first appeared as Codasyl Database Language Task Group, Codasyl Cobol Database Facility Proposal. Ottawa: Dept. of Supply and Services, Government of Canada, Technical Services Branch, 1973 and was subsequently issued in the Cobol J. of Development. Work on these standards continued into the 1980s, first through a new committee set up within Codasyl, and later at ANSI. This included a FORTRAN DML, to complement the Cobol DML in the earlier reports. Codasyl Fortran Data Base Manipulation Language Committee, "CODASYL Fortran Data Base Facility," J. Development, 110-GP-2, 1977.
78. Performance Development Corp., "An Interview With Charles W Bachman (Part II)," Data Base Newsletter, vol. 8, no. 5, 1980, Charles W. Bachman papers, CBI 125.
79. C.W. Bachman, "The Programmer as Navigator," Comm. ACM, vol. 16, no. 11, 1973, pp. 653–658.