George Washington University
Pages: pp. 74-81
Abstract—Edward Feigenbaum, the recipient of the 2013 Pioneer Award, is one of the founders of the field of artificial intelligence and has been one of its leaders for more than 50 years. He became aware of the birth of AI when he was an undergraduate at Carnegie Tech (later, Carnegie Mellon University). Continuing to graduate school there, he was mentored by Herbert Simon, and he also collaborated with Allen Newell. His career has taken him from Carnegie to the University of California, Berkeley, Stanford University, RAND, and the Pentagon. Feigenbaum has also authored and coedited several seminal AI books: Computers and Thought (1963), The Fifth Generation (1983), and the four-volume Handbook of Artificial Intelligence (1980s). In this interview, David Alan Grier and Feigenbaum talk about his career, his mentor Herbert Simon, and the development of AI.
Keywords—history of computing; Edward Feigenbaum; artificial intelligence; Herbert Simon; Allen Newell; expert systems; Fifth Generation Project; machine learning; knowledge engineering; computer-human interfaces; human-computer interaction
In May 2013, Edward Feigenbaum, Kumagai Professor Emeritus at Stanford University, came to the IEEE Computer Society offices to talk about his career as one of the leaders of artificial intelligence. For his work in the field, Feigenbaum had just received the Computer Society's 2013 Pioneer Award. In this interview, part of which is printed here, we talked about his career, his mentor Herbert Simon, and the development of AI.
Feigenbaum is not only one of the founders of the field of AI; he has also been one of its leaders for more than 50 years. He became aware of the birth of AI when he was an undergraduate at Carnegie Tech (later, Carnegie Mellon University). Continuing to graduate school there, he was mentored by Herbert Simon, and he also collaborated with Allen Newell. In 1960, he took his first academic job at the University of California at Berkeley. There he coedited an important anthology of early AI research, Computers and Thought (McGraw-Hill, 1963), which is often called “AI's first book.”
Figure Computers and Thought book cover, circa 1963.
In 1965, he moved to Stanford University's new Computer Science Department. There he began a collaboration with Nobel laureate geneticist Joshua Lederberg, chemist Carl Djerassi, and Bruce Buchanan on work that reshaped the paradigm of AI—expert systems, knowledge engineering methods, and the concepts of knowledge-based systems.
From 1965 to 1968, he led Stanford's Computer Center. The early 1980s was a busy time for him. He studied and wrote widely about Japan's Fifth Generation Project, which attempted to marry AI with high-speed computing; helped to found the Association for the Advancement of Artificial Intelligence (AAAI); served as AAAI's second president; and led and coedited the landmark four-volume Handbook of Artificial Intelligence (Addison-Wesley).
In the 1990s, he turned to public service, serving at the Pentagon as chief scientist of the Air Force from 1994 to 1997.
We began our discussions by talking about the time when Herbert Simon introduced him to the ideas of a “thinking machine.”
Feigenbaum: As an undergraduate senior at Carnegie Tech in the fall of the 1955–1956 academic year, I took a graduate-level seminar called “Mathematical Models in the Social Sciences” from Herbert Simon: polymath, social scientist, behavioral scientist, later Nobel Prize winner in economics, and cofounder of AI.
In the first seminar session after the Christmas and New Year's holiday break, January 1956, Herb opened the seminar by reporting, “Over the Christmas holiday, Allen Newell and I invented a thinking machine.” This comment startled us six students. What did that mean? Thinking? Machine? Those words did not seem to go together.
What Simon was talking about was the first fully working AI program. It was named LT, the Logic Theory program. AI scientists know this program as the first heuristic search problem-solving program. It proved theorems of the propositional calculus from chapter 2 of Whitehead and Russell's Principia Mathematica (three volumes, 1910–1913).
To allow the six of us to understand computing machines, Simon gave us copies of the IBM 701 manual. I took that manual home with me and read it straight through the night. By morning, I was a born-again computer scientist, except there was no such phrase then as “computer scientist.”
I knew two things: I was going to go into this field involving AI and computers, and I was going to come back to Carnegie Tech for graduate school to work with Herb Simon and Al Newell on these problems.
Fortunately for me, I was able to attend a graduate student program on computers that summer, 1956, held by IBM—their first such summer program. IBM staff taught me programming for the IBM 650 (which Carnegie Tech was about to acquire) and the new IBM 704.
When I returned to Carnegie, I walked into Herb Simon's office and said, “Okay, here I am. What do I do?” Having cracked open the issue of computer simulation of human information processing, Herb was interested in the modeling of human cognitive processes using computer language as a modeling language rather than mathematical language or English.
He was interested in more than just problem solving—for example, the processes of human memory. That was the problem he posed to me for my graduate student research. This work later became my doctoral thesis. I created a computer simulation model called EPAM (Elementary Perceiver and Memorizer) that dealt with the simulation of phenomena that were well understood by the experimental psychologists of the first half of the 20th century. The experiments constituted a stable paradigm with many stable results that I could use as fixed targets. EPAM hit a lot of those fixed targets with a simple model. The EPAM model structure has lasted more than 50 years and has been active and productive in psychology.
Feigenbaum: Yes, Herb and I collaborated in detail on this EPAM model until 1965, when I moved to Stanford. At that point, I decided to change my focus. I moved from the part of AI that we call computer simulation of cognition—the “psychology side” of AI—to the side we call artificial intelligence—the engineering side. This part of AI aims at programming computers to be not only as smart as people, but much smarter than people.
Figure Feigenbaum at Stanford's Computer Center, in 1966.
When interviewing me for one of her books, the writer Pamela McCorduck asked me, “Can a machine ever be as smart as a person?” I said, “No.” She was startled at that answer because she thought I really did believe that a program could be far smarter than a person. I replied that there will never be a moment at which a machine is as smart as a person because as soon as we know how to make it as smart as a person, we engineers will make it smarter. So there's no stopping point right there; it's an unstable point in the engineering.
Feigenbaum: My RAND affiliation started in the summer of 1957. During the 1956–1957 school year, RAND began working on public versions of their list processing languages. These Information Processing Languages, or IPLs, preceded Lisp by three years. The IPLs were languages that ran only on RAND's JOHNIAC computer, a copy of the Institute for Advanced Study computer.
Allen Newell and a few graduate students, including me, began work on a version of IPL that could be used on a wider basis. My job in the summer of 1957 was to go preach those list processing ideas (what in Silicon Valley language would now be called “being an evangelist”). I was evangelizing at a division of RAND that had become its own corporation, System Development Corporation, preaching list processing languages as the most flexible way of writing programs for air defense.
In the summer of 1958, I went back to RAND to program the IBM 704 and 709 versions of IPL, called IPL-V. By that time Newell had decreed that we would not write code until we had published the manual; then we'd write the code specifically for that manual. That's what I did in the summer of 1958. In the summer of 1959, I did some more of that and also did some coding of EPAM. I was making my final runs on EPAM because I was going to take my doctoral thesis exam when I got back to Carnegie in September.
Feigenbaum: In the early part of the 1960s, I was looking for a project that would put me on the path toward the “ultra-intelligent computer.” I became interested in inductive reasoning as compared with deductive reasoning. In 1963, I described inductive reasoning as the next step in AI in the introduction to Computers and Thought.
I was looking for an experimental vehicle—sometimes I call it a playpen—in which to examine ideas about inductive thinking. In particular, I decided to look at the inductive thinking of those people who are professionals at doing that kind of thinking, namely scientists. That's what they do. They induce hypotheses from empirical data. Fortunately, with tremendous good luck, I met Joshua Lederberg, a Nobel Prize winner in medicine for his breakthrough work in molecular genetics.
Lederberg was interested in exactly the same subject, the computer modeling of scientific thinking. He was changing interests from mainline genetics to instrumentation and computation. He said, let's try it. The problem he suggested involved the inductive interpretation of organic molecular structure. Our goal was to write a program that could interpret experimental data in the Stanford Mass Spectrometry Laboratory, headed by our chemistry collaborator Carl Djerassi (yet another genius).
We really had quite a team working on the DENDRAL project: excellent PhD students and post-docs and excellent young computer scientists, foremost among whom was Bruce Buchanan. By 1970, we had made tremendous progress on the problem and pretty much formulated how to do knowledge engineering, which is the fundamental set of concepts and methods behind expert systems.
Around 1972 came our first application to clinical medicine: the Mycin conversational system for diagnosing blood infections (Ted Shortliffe's thesis). Throughout the 1970s we challenged ourselves in many other areas. I did a military application to interpret coastal sonar data to infer (induce) what Soviet submarines were patrolling the West Coast and what they were doing. That application, HASP, was classified at the time.
We built engineering applications for civil engineering and x-ray crystallography. In Silicon Valley style, we started several companies, beginning with IntelliGenetics and later IntelliCorp. Then came Teknowledge and, later, Design Power, specializing in applications of expert systems to the engineering design of boilers.
Figure Japanese poster announcing a Feigenbaum lecture.
Feigenbaum: My interest in the Fifth Generation Project began in 1979 when I was teaching at the University of Tokyo for one school term. But my interest in Japan began much earlier, in 1970, when I met my wife, who's Japanese.
I was much taken with the scope of the Japanese government's vision of what they wanted to achieve in artificial intelligence in a 10-year project. When my book with Pamela McCorduck, The Fifth Generation (Addison-Wesley Longman, 1983), was published, it was translated immediately into Japanese, so I became a celebrity in Japan because I was saying a lot of nice things about their work.
Figure Feigenbaum with Tom Rindfleisch, circa 1978.
Feigenbaum: Very far back. When I arrived at Berkeley I was in the School of Business Administration. Julian Feldman and I were both in the Management Science Research Center. We both got grant money to pursue our research.
One of the things you do when you get a little grant money is to hire a secretary to help you. In this case, we needed an especially good and literate one because we were putting together Computers and Thought. We hired Pamela in her junior year as an English major at Berkeley. She worked for us until after the book was published and then left to work somewhere else.
When I moved to Stanford I invited her to come to Stanford to work as my secretary there, which she did. Later, she decided she wanted to be a professional writer and went off to do a master's degree at Columbia. So we've had a long collaboration, which included The Fifth Generation.
The Fifth Generation was well received partly because it had an interesting narrative—Japan trying very hard to catch up. Japan was starting a new project to focus on AI and parallel computing, both of which were hot subjects in the United States.
The Fifth Generation actually tells two important stories, one about ARPA (the Advanced Research Projects Agency, later renamed DARPA) and one about Japan. ARPA began funding artificial intelligence research in 1963. My first substantial support at Berkeley came from ARPA. I thought the ARPA story was marvelous—how ARPA support evolved and was so important to the US—and the American public didn't know about that. The Japanese public basically didn't know that scientists in their government, at MITI's Electrotechnical Lab, were doing such good thinking in computer science. So it was interesting all the way around.
Feigenbaum: The success was that they trained a generation or two of young researchers in AI and computer design. They learned how to do parallel computing, how to do AI, and what it means to build an AI system. All of that penetrated right back into industry because the project work was being done by engineers who came from industry: Hitachi, Fujitsu, Toshiba, and other industrial giants.
I can tell you first where the project fell short and then why. They worked hard on building a high-speed Prolog or “logic programming” machine and never developed application-based methodologies. I spoke to them twice about this issue. The first time was when the project began. The second was after they had been working for five years. I told them that they had to learn knowledge engineering and knowledge representation. They had to work on knowledge representation schemes and acquiring knowledge from expert humans, tailoring all this to the needs of specific knowledge-based applications. They needed to be more empirical.
For the first five years they didn't do that. They just went along doing their engineering at a level too abstract. What is Prolog? How can I make the Prolog language better? How can I make a super Prolog? How can I cast a super Prolog into machine logic? And how can I make it run fast (the parallel computing work)?
By the end of year five they didn't have much to demo. They really started on their demo six months before the “big demo show” was supposed to occur. Later rather than earlier they got the message, and they worked hard on those demos. They did get some good results, by the end of year 10! It was too late by that time. They had consumed the amount of government money that the rest of the science and engineering community would allow them to consume because everyone else wanted the government money for other initiatives.
Feigenbaum: I'm so happy to talk about Harry, who is now close to 100 years old. Harry had become a professor in the Electrical Engineering Department at Berkeley. Remember I was in the Business School, but there were few people interested in computing in the Business School, so I had to find my connections elsewhere. Harry was one of them.
When J.C.R. Licklider started the Information Processing Techniques Office at ARPA, he talked to Herb Simon and asked who he should give research money to. Herb said Ed Feigenbaum and Julian Feldman, among others. Licklider asked the same question of his contacts in the computer hardware community, and they said Harry Huskey.
Licklider wanted a computer halfway between the small time-shared minicomputers, such as the DEC PDP-1, and the huge time-sharing systems like Project MAC at MIT. He wanted one built out of a next-generation, more powerful minicomputer. He gave that task to Huskey. Huskey gave that task to his younger colleague David Evans and went on sabbatical to India. So for a year I, as coprincipal investigator with Dave Evans, learned how to design a computer system architecture!
Feigenbaum: To be honest I haven't read my Turing Award speech since then, but I do remember a couple of key things that I wanted to say in that paper. I think that those ideas are correct, but alas, I must admit that I also failed to predict in any way, shape, or form the direction in which AI really went in the first decade of the 21st century—namely, to statistical machine learning. That change was not predicted in my Turing Award speech nor in Raj Reddy's speech on the same day.
What I did say was that a good way to think about where AI fits into the entire spectrum of information technology and computer science is what I call the “What to How Spectrum.” The How end of the spectrum is about the way a computer really works. It deals with little instructions, like Clear and Add, Shift Left N—dozens of tiny steps, even in the firmware. As we moved from how to what, we got Fortran, which allowed us to express computational needs in formulas—algebraic language. This moved us a little bit away from the how end of the spectrum. Then we got business-oriented languages. Then we got domain-specific languages, like ICE, which was a civil engineering language. Later, we got object-oriented languages.
Far from there, at the What end of the spectrum, sits AI. At that end, you, the user, tell the computer what it is that you want it to do, what your goals are, in free-flowing natural language, with all of its ambiguities and subtleties. The AI programs must have the knowledge, reasoning power, and heuristics to employ to achieve these goals for you so you don't have to be a programmer.
Recently, we've seen examples of this. These days you can pick up an iPhone and ask a knowledge-based AI program called Siri to do something for you on the iPhone, like set an alarm for a certain time tomorrow morning. Of course you don't need Siri to do that. You could read the manual and figure out the screen touches to get the alarm to come up and how to set it and all that. But instead you just say what it is you want, and you say it in whatever natural languages Siri understands.
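[Editor's illustration, not part of the interview: a minimal Python sketch of two points along the What-to-How spectrum described above. The function names are hypothetical and chosen only for this example. The first function spells out each small step, in the spirit of Clear and Add; the second states the desired result and leaves the steps to the language. Neither reaches the What end that Feigenbaum assigns to AI, where goals are stated in natural language.]

```python
# Illustrative only: one task written nearer the "How" end and nearer the "What" end.

def total_how(values):
    """'How' style: spell out every small step, much like Clear and Add."""
    accumulator = 0                                  # clear
    index = 0
    while index < len(values):
        accumulator = accumulator + values[index]    # add
        index = index + 1
    return accumulator


def total_what(values):
    """Closer to 'What': state the result wanted and let the language do the steps."""
    return sum(values)


readings = [3, 5, 8]
# Both styles compute the same total; only the level of description differs.
print(total_how(readings) == total_what(readings))
```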
Expert systems aren't at the What end because most expert systems don't support conversation in a free-flowing style about our specific goals requiring expertise.
The other point I wanted to make in the Turing Award lecture was that AI programs behave intelligently to the extent that they know a great deal about the domain of discourse in which they are being asked to perform. For the most part, cognition is not based on deep reasoning; it's based on broad knowledge. In the case of expertise as captured in expert systems, it is both broad and deep knowledge in a particular domain.
Here's another way to say it: it is more important for an AI program to know a lot than to think deeply. This is the Knowledge Principle in AI. With essentially one major exception, it's proved to be universally true.
The one major exception occurred when a high-speed chess machine, IBM's Deep Blue, beat Kasparov, the world's chess champion, largely by brute-force searching with only a small amount of specialized chess knowledge. Deep Blue examined approximately 200 or 300 million paths per move whereas Kasparov examined perhaps 200 to 2,000 well-selected paths per move.
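[Editor's illustration, not part of the interview: a short Python sketch of the arithmetic behind the comparison above, using only the per-move path counts quoted by Feigenbaum. The constant and function names are hypothetical. It computes the smallest and largest ratio that the quoted ranges allow.]

```python
# Per-move path counts quoted in the interview, as (low, high) ranges.
DEEP_BLUE_PATHS = (200_000_000, 300_000_000)
KASPAROV_PATHS = (200, 2_000)


def search_ratio_range(machine, human):
    """Return the (smallest, largest) machine-to-human ratio the ranges allow."""
    smallest = machine[0] // human[1]   # fewest machine paths vs. most human paths
    largest = machine[1] // human[0]    # most machine paths vs. fewest human paths
    return smallest, largest


low, high = search_ratio_range(DEEP_BLUE_PATHS, KASPAROV_PATHS)
print(f"Deep Blue examined roughly {low:,} to {high:,} times as many paths per move")
```

On these figures, the machine examined somewhere between one hundred thousand and one and a half million times as many paths per move as the human champion.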
So we have a data point “off the curve” that I choose not to ignore: a chess program that doesn't have a lot of knowledge but does a great deal of computing. It's a place to examine the possibilities for intelligent search using the extremely high-speed computers of today and tomorrow. What other problems can we solve the way that the IBM researchers solved the problem of playing chess? Can we elicit creative ideas from a program by searching huge combinatorial spaces, spaces that people are not good at exploring?
You can't treat creativity as an abstract concept. A creative act is just a behavior with special characteristics. It is a behavior that is novel and perhaps startling to people. It may even be “new to mankind,” never before seen. It may be elegant or beautiful, as evaluated by people. Programs have already produced several instances of such behavior.
For example, take the AI program we talked about, the Logic Theory program. AI's first program proved a theorem in chapter 2 of Principia Mathematica much more elegantly than Whitehead and Russell proved it. The exchange of letters between Simon and Russell is remarkable, as exhibited in Herb Simon's autobiography. Russell says he was prepared to believe that a computer can do anything, and why did he and Whitehead waste all the time doing this work? This level of behavioral excellence (creativity?) was done with AI's first program that was not able to do a lot of combinatorial search because it was running on a 40,000 operations per second computer with 4,000 words of memory.
Feigenbaum: Watson found ways to deal with an immense amount of “surface-level” knowledge—the immensity of knowledge that is available over the Internet. I have heard the claim, and find it plausible, that Watson had access to 1 trillion individual “items” of knowledge.
The great thing about the software architecture of Watson is that it does not rely on any one algorithmic or heuristic method to make its decisions. Watson's builders used what I like to call a “hybrid architecture.” I did the same thing when my team and I designed the expert system for detecting Soviet submarines. Marvin Minsky has an imaginative name for hybrid architectures. He calls them the “kitchen sink model of the mind.” Anything you can think of that will work, put it to work, and the Watson people did.
It took them a few years to arrive at a high level of performance, but the performance was slow. Finally David Ferrucci said, “Okay that's good enough. Now let's just make Watson respond within three seconds.” Watson engineers threw 2,800 high-speed boards at the speed-up problem. That is, IBM said, “We'll build a supercomputer for Watson so that it could press the Jeopardy button with a good answer within three seconds.” The developers used a beautiful engineering method. The AI and system engineering was very analytic, very systematic, using lots of imaginative techniques. Great work!
Feigenbaum: Let me just mention a few. But first, keep in mind that intelligence, hence AI, is a multidimensional concept. It goes from perceptual things like speech and vision, on the one hand, all the way to deep thinking—serious quality thinking, best expertise in the world—on the other hand.
If you look at just the Stanford AI lab working on these things in the mid-to-late 1960s, you'll find the robotics work included the first vision system, the first mobile system, the first assembly system, and a system that could assemble a water pump. At Stanford and CMU you'll find Raj Reddy's work on speech understanding. You'll find the early work on expert systems.
By now, in the second decade of this century, speech understanding is a major triumph. AI's work on vision—what computer vision systems are doing these days—is spectacular. The work on expert systems was very influential in IT. There were tens of thousands, maybe hundreds of thousands of them built. If you want to check it out, do a Google search with the phrase “business rules” and see the many business rule systems that have been built. (“Business rules” is a phrase invented by expert systems software companies to market their products.)
IBM's Watson, as I said, is remarkable. So was the 2005 work of Sebastian Thrun and his group at Stanford (later Google) on the Stanford self-driving car. The car's performance was utterly, unexpectedly awesome, winning the $2 million DARPA prize. They built a great integrated system using both AI and other engineering methods.
Heuristic problem solving is a major concept. That is an excellent model of how everyone thinks. Last but not least is the surprising surge of statistically based machine learning algorithms and systems. These have outperformed anyone's expectations, for reasons that no one really understands deeply. There's a great set of doctoral theses to be written about why this works at all. But as engineers know, there were a lot of good radios built before engineers figured out radio theory.
Feigenbaum: I'll state it and then I'll explain why. I already made the case, at a private conference that was convened in the early part of this century by DARPA, for what AI research DARPA should be supporting. My idea was the one that was voted by the group of eminent AI researchers as the most important.
We need software for knowledge acquisition for AI programs by reading text, reading books, reading the Web. Not by painstakingly doing knowledge engineering as in the past. One knowledge engineer, one expert going over the individual cases? That's not the way people built their culture.
We have succeeded so well as a species because we found a way to use language to record our experiences and thoughts, to write them, to pass them on to the next generation. Two of the greatest inventions of all time were writing and printing. We move our culture to the next generation mostly by reading text. These days the text is in either “atoms” or “bits.”
Why do we have to get programs to read from text? Because in knowledge lies the power. It's that Knowledge Principle I talked about before. AI systems will not become intelligent until they are widely knowledgeable.
One of the things that we don't know how to do well yet is to accumulate immense amounts of what Douglas Lenat calls “common sense knowledge,” the knowledge of ordinary things. This is the “glue” that helps the knowledge of specific domains to work well and robustly. We'll get that from reading text, just as we will read and acquire domain knowledge.
Figure Edward Feigenbaum and David Alan Grier at IEEE Computer Society offices in Los Alamitos, California, May 2013.
Feigenbaum: Recently I was involved in a study for the Air Force on future technologies. I once served as chief scientist of the Air Force so I get involved in long-range forecasts of scientific things for the Air Force.
In scanning all the different science and technology horizons that the Air Force is looking at for the next 10, 20, and 30 years, I noticed that the people writing those forecasts, the current generation of scientists, engineers, and users, were not extrapolating far enough the great change that the enormous amounts of computation coming in the next 10 to 30 years will bring to all fields. It was almost as if they were giving five-year projections. They weren't absorbing the fact that computing changes everything, often profoundly.
I think your wider audience needs to consider profound change. For example, the primary tool of the new physics today is computers, not mathematics. Physics is no longer the mathematics that gets scribbled on big yellow pages like Einstein did.
What is not understood is that there will be coming, in the next 10 to 20 years, some really excellent AI-enabled computer-human interfaces in which computers can do vastly better things than they are currently doing in the service of human work. And people can do whatever residual work there is that people do best.
These interfaces will allow that mixture of human-computer interaction, not just where the machine is serving the person but where the human and the computer are cooperating on a task. This will have much greater consequences than most people today understand.
Feigenbaum: Thank you for allowing me to express my views.