Edgar (Ted) Codd, the mathematician and former IBM Fellow best known for creating the “relational” model for representing data that led to today’s $12 billion database industry, died Friday, April 18, at his home in Florida at age 79.
Most of the innumerable data transactions we routinely make today — using bank accounts and credit cards, trading stock, making travel reservations and participating in online auctions — use relational databases based on the abstract and sophisticated mathematical theory that Codd first published in 1970 when he worked at IBM’s San Jose Research Laboratory, the forerunner to today’s Almaden Research Center.
It didn’t come easily, however. The computing landscape in the early 1970s was a far cry from the gigahertz, terabyte and petaflop scene today.
Computer calculations cost hundreds of dollars a minute, so great human effort was spent to make programs as efficient as possible before they were run. Early databases used either a rigid hierarchical structure or a complex navigational plan of pointers to the physical locations of the data on magnetic tapes. Teams of programmers were needed to express queries to extract meaningful information. While such databases could be efficient in handling the specific data and queries they were designed for, they were absolutely inflexible. New types of queries required complex reprogramming, and adding new types of data forced a total redesign of the database itself.
In his landmark paper, “A Relational Model of Data for Large Shared Data Banks,” Codd proposed replacing the hierarchical or navigational structure with simple tables containing rows and columns.
“Ted’s basic idea was that relationships between data items should be based on the items’ values, and not on separately specified linking or nesting. This notion greatly simplified the specification of queries and allowed unprecedented flexibility to exploit existing data sets in new ways,” said Don Chamberlin, co-inventor of SQL, the industry-standard language for querying relational databases, and a research staff member at Almaden. “He believed that computer users should be able to work at a more natural-language level and not be concerned about the details of where or how the data was stored.”
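To make the idea of value-based relationships concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration only, not a reconstruction of any IBM system: the table names (employee, department), the column names and the sample rows are all hypothetical. The two tables are linked solely by a shared value in the dept_name column, and the query says what is wanted without saying where or how the data is stored.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE department (dept_name TEXT, city TEXT)")
    conn.execute("CREATE TABLE employee (name TEXT, dept_name TEXT)")
    conn.executemany(
        "INSERT INTO department VALUES (?, ?)",
        [("Research", "San Jose"), ("Sales", "New York")],
    )
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?)",
        [("Alice", "Research"), ("Bob", "Sales"), ("Carol", "Research")],
    )
    return conn


def employees_in_city(conn, city):
    """Return the names of employees whose department is in the given city.

    The only link between the tables is the matching dept_name value;
    no pointers or physical storage locations appear in the query.
    """
    query = (
        "SELECT e.name FROM employee e "
        "JOIN department d ON e.dept_name = d.dept_name "
        "WHERE d.city = ? ORDER BY e.name"
    )
    return [row[0] for row in conn.execute(query, (city,))]


conn = build_demo_db()
print(employees_in_city(conn, "San Jose"))  # prints ['Alice', 'Carol']
```

Asking a new question, such as which cities have employees named Bob, needs only a different query over the same tables. The stored data does not have to be reorganized.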
At a 1995 reunion of IBM’s early relational database scientists, Chamberlin recalled having an epiphany as he first heard Codd describe his relational model at an internal seminar.
“Codd had a bunch of fairly complicated queries,” Chamberlin said. “And since I’d been studying CODASYL (the language used to query navigational databases), I could imagine how those queries would have been represented in CODASYL by programs that were five pages long that would navigate through this labyrinth of pointers and stuff. Codd would sort of write them down as one-liners. … (T)hey weren’t complicated at all. I said, ‘Wow.’ This was kind of a conversion experience for me. I understood what the relational thing was about after that.”
The idea of relying only on value-based relationships was quite a radical concept at that time, and many people were skeptical. They didn’t believe that machine-made relational queries would be able to perform as well as hand-tuned programs written by expert human navigators. But the increasing power of newer computers, the random-access nature of magnetic hard-disk drives, and a long string of software innovations enabled scientists to bring the advantages of Codd’s idea to customers.
IBM Research News intranet article published on April 23, 2003.