Pages: pp. 4-7
Shimon Edelman was an electrical engineer until the early 1980s, when he read Gödel, Escher, Bach: An Eternal Golden Braid (Basic Books, 1979). Even today, Douglas Hofstadter's Pulitzer Prize-winning examination of mathematical patterns in human cognition remains a recruiting manual for those who would pursue artificial intelligence (AI) research.
While on sabbatical, Edelman recounted how the book inspired him to follow a path to his current role as computer scientist and psychology professor at Cornell University. He was enthralled by the idea that the creative works of the mathematician Gödel, the artist Escher, and the composer Bach were all really manifestations of the same underlying rules that govern human consciousness. The book suggests that intelligence could, to some extent, be programmed if those underlying rules were known. "There are any number of people in my field who claim that they are there just as I am—because of that book," he says. "So I got hooked on computer science and took the roundabout route to psychology."
Edelman started working on machine vision, then decided to develop language-learning computer models because it seemed like a more challenging pursuit.
It's also a more controversial pursuit. His latest software program, Automatic Distillation of Structure (Adios), uses statistical algorithms to infer the underlying grammar rules in text, and then uses those rules to generate new and meaningful sentences. Still, some linguists reject Edelman's claim that he and his colleagues have taken the most basic steps toward replicating human language acquisition. Many linguists believe that we are born with grammar already hardwired in our brains: that we all hold a kind of universal knowledge base for language, and the words we learn as we grow are just refinements to that base. In contrast, Adios starts with a blank slate.
Even if Adios's methods can't be called human-like, the program could still be an important building block on the way to AI. It composes sentences in English and Chinese. It dabbles in bioinformatics by detecting patterns in the letters that spell out genetic code. It's even learning to write music.
Adios maps out sentences in a tree structure, with each word represented as a node. Given an entire body of text (what linguists call a corpus), the software detects when the same phrase is used in different contexts and aligns the sentences that contain it. It connects new words by adding branches to the tree.
"If you have two sentences, and one contains the phrase 'a green shirt' and the other contains [the phrase] 'a red shirt,' these phrases align in two places, and there's a mismatch in one place. You can think of the mismatches as slots in an emerging data structure," Edelman says. "Then the structure 'the ___ shirt' is a unit with an open slot, and what can go in the slot is either 'green' or 'red,' but not both." The result is a forest of tree-shaped diagrams of different shapes and sizes, all united under the same lexicon.
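The slot idea in Edelman's shirt example can be illustrated with a few lines of Python. This is only a toy sketch, not the Adios alignment algorithm: it assumes two equal-length, whitespace-tokenized phrases that differ in exactly one position, and the function name `align_with_slot` and the `___` slot marker are hypothetical.

```python
from __future__ import annotations


def align_with_slot(a: list[str], b: list[str]) -> tuple[list[str], set[str]] | None:
    """Align two token lists position by position.

    Returns (pattern, fillers) when the lists have the same length and
    differ in exactly one position; otherwise returns None.
    """
    if len(a) != len(b):
        return None
    mismatches = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(mismatches) != 1:
        return None
    i = mismatches[0]
    # The mismatch becomes an open slot; the two words that occupied it
    # form a small equivalence class.
    pattern = a[:i] + ["___"] + a[i + 1:]
    return pattern, {a[i], b[i]}


pattern, fillers = align_with_slot("a green shirt".split(), "a red shirt".split())
print(" ".join(pattern), sorted(fillers))  # a ___ shirt ['green', 'red']
```

The real system works over a whole corpus and over phrases of different lengths; the sketch only shows how one mismatch between two aligned phrases yields a pattern plus a set of interchangeable words.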
He says the algorithm that performs the alignment is very simple. What makes Adios special is that a second algorithm, called the Motif Extraction Procedure (Mex), gauges which phrase alignments are statistically significant enough to warrant inclusion in the expanding lexicon. As each branch is added to a tree, Mex assigns it a probability, and Adios uses that information to compute new word sequences that are more likely to fit within the grammatical structure of the original corpus. Adios builds its own grammar from the bottom up, based on whatever corpus it's given.
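The article does not describe how Mex computes its probabilities, so the following is not Mex. As a rough illustration of the general idea of scoring word sequences against a corpus, here is a minimal sketch that assumes plain bigram transitional probabilities, that is, the fraction of times one word is followed by another. The function names, the `<s>` and `</s>` boundary markers, and the two-sentence corpus are all made up for the example.

```python
from collections import Counter, defaultdict


def transition_probs(sentences):
    """Estimate P(next word | current word) from raw bigram counts."""
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for cur, nxt in zip(tokens, tokens[1:]):
            pair_counts[cur][nxt] += 1
    return {
        cur: {nxt: n / sum(counts.values()) for nxt, n in counts.items()}
        for cur, counts in pair_counts.items()
    }


def sequence_prob(probs, sentence):
    """Multiply the transitional probabilities along a candidate sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for cur, nxt in zip(tokens, tokens[1:]):
        p *= probs.get(cur, {}).get(nxt, 0.0)  # unseen transitions score 0
    return p


corpus = ["a red shirt", "the red hat"]  # toy corpus, hypothetical
probs = transition_probs(corpus)
print(sequence_prob(probs, "a red hat"))  # 0.25
print(sequence_prob(probs, "hat red a"))  # 0.0
```

Here "a red hat" never appears in the toy corpus, yet it receives a nonzero score because every step in it was observed, while the scrambled "hat red a" scores zero. A system like Adios would need far more than this, including a significance test for deciding which patterns to keep.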
In a recent issue of the Proceedings of the US National Academy of Sciences (vol. 102, no. 33), Edelman and his Tel Aviv University colleagues, Zach Solan, David Horn, and Eytan Ruppin, reported testing Adios on three databases. One database contained conversational sentences about air travel, another contained transcripts of conversations with children, and the last contained Bible passages in several languages. Depending on the corpus size, Adios was able to construct original sentences that fit within the appropriate grammar structure anywhere from 40 to 99 percent of the time. The more text it had to work from, the better it did. "We don't have something that performs as well as a baby at acquiring language, but I think we're on the right track," Edelman says.
Adele Goldberg, professor of linguistics at Princeton University, says these findings come at a turning point for the linguistics field. Over the past 10 years, a growing number of researchers have suggested that children aren't hardwired for language, but rather learn by listening to people and assigning probabilities to words, much as Adios does. "People are changing their point of view," Goldberg says, "so that instead of assuming that language is unlearnable, they're looking at how it could be learned." Edelman's software demonstrates that transitional probabilities can be used to build grammatical structures within a hierarchical structure, she says. "That's what language is."
Adios has more immediate applications for data mining and bioinformatics. Given six translations of Bible text, it correctly mapped the relationships among the languages; it found, for instance, that Latin-based languages such as Spanish and French are more closely related to each other than they are to a Germanic language such as English. Then it mapped the structural relationships among nearly 7,000 enzymes in a protein database. In that case, the "words" Adios gathered were each protein's building blocks. Adios was then able to predict relationships that mapped the enzyme function with 95 percent accuracy.
Like languages and protein sequences, musical compositions also contain repeating patterns. Hofstadter's book, which wooed Edelman away from electrical engineering so long ago, revealed that Bach enjoyed weaving complicated rhythmic and tonal patterns into his works. Edelman confirms that Adios should be able to sort musical notes as well as nucleotides.
In fact, he says, undergraduates at Tel Aviv University have fed Israeli folk songs to Adios, and the program is beginning to write its own music. Here the corpus consists of very simple data. The notes are stored in musical instrument digital interface (MIDI) format, the same protocol that many cell phones use to play ringtones. "It sounds kind of impoverished," Edelman admits, "but the tunes are there." Early reviews say the software's compositions are derivative—the new songs sound very similar to the old ones—and the arrangements are very simple. Bach it isn't, but it's a start.
Mex enables Adios to operate without any assistance—it doesn't need a dictionary or any grammatical ground rules to start building a lexicon, just sentences—and Edelman says that's what sets it apart from other language-learning software. Not that he thinks that his program is particularly smart.
"By definition, all algorithms are stupid, right? What's smart is what's left out," he says. Whereas true AI assigns meanings to words, Adios can't. "Whatever meaning it gleans from the text comes in the form of equivalences. So it wouldn't know what 'red' means, but it knows that the meaning of 'red' is related to that of 'green' because they're in the same equivalence class [in the shirt example]. And that can take you a long way." Because Adios doesn't need detailed inputs of text before it starts learning, it should be able to easily adapt to different languages and contexts, Edelman says.
Alex Waibel, a professor at Carnegie Mellon University's Language Technologies Institute, takes the opposite approach when he and his team build software that translates human speech (see the "Lunch in Translation" sidebar). He pre-trains his programs on massive amounts of bilingual text to make the translations more accurate. He suspects that more training is better for these kinds of applications. "But who knows … an interesting possibility for unsupervised or lightly supervised algorithms may be given for languages for which less training data is available," Waibel says.
Edelman and his partners are patenting Adios, and they're conscious of the possible commercial applications in speech recognition, automated translation, and human-machine interfaces. People-friendly computerized travel agents and other service-oriented systems are a strong possibility. And if scientists have better tools for mapping biological organisms, as Adios did with the protein sequences, they could find relationships among plant and animal species that could lead to new drug therapies (see the May/June 2005 issue of CiSE for a discussion on the challenges involved in building phylogenetic trees).
Ultimately, Edelman would like to see programs like Adios gain sufficient capacity to run continuously and soak up data from any number of language sources. "Then the optimal thing would be to feed it sensory inputs such as vision, and that would pave the way someday to real AI," he says. "It's a long-term goal."
Sidebar: Lunch in Translation
On a bright fall day in Pittsburgh, Penn., Alex Waibel is realizing that he can't talk technology and eat lunch at the same time. He's passing around an iPAQ pocket PC that translates speech into different languages. Waiters try to take away his soup—and then his filet—before he's able to take a bite. He manages to hold onto his food, but it sits untouched because everyone at the table wants a demonstration. Waibel holds the device to his mouth and says, "I'm hungry."
In a few seconds, the words scroll across the top of the screen in English. After a few more seconds, a curly script scrolls along the bottom. A synthesized voice gives the translation in Thai: "Phom hue."
Teaching the program grammar would take too much time, Waibel says, so he and his team at Carnegie Mellon University (CMU) train the speech-recognition algorithms with voice data, and the translation algorithms with bilingual texts that are available on the Internet. "It generalizes from the experience of past translations," he says. "That still leaves a challenge, because it only works for languages for which there is a lot of data."
The translator is the lowest-tech device to come out of Waibel's lab. Another device enables "silent" speaking in front of a crowd; while a person mouths words, electrodes attached to his or her facial muscles transmit data to software that determines the words. The audience hears only the computer translation. He's also developed ultrasonic speakers that translate and broadcast only to specific individuals who need to hear the translation.
In October 2005, Waibel demonstrated his most sophisticated communication system yet: as he gave a lecture in English at CMU, an audience at the University of Karlsruhe heard the speech in German. Performing both speech recognition and translation is difficult, even for software backed by a small PC cluster, so the translation wasn't simultaneous. But it worked, and Waibel hopes that his technology will someday provide simultaneous real-time translations in several languages for organizations such as the United Nations. He sees applications for handheld devices in tourism, medicine, and the military, as well.
To him, this work is all about bridging cultures. "I was born in Germany, raised in Spain, studied in the US, met my wife in Japan, and travel a lot in France," he says. "I think helping people communicate with each other is a good way to have a positive impact."