Google. According to Google's Och, the company has many potential applications for its translation technology. For example, he noted, it already provides Web-page-translation services. Google says it has several other applications in mind but won't comment further.
Google has several advantages in developing translation technology, said Och. First, Google's large search database contains documents, many in languages other than English, on which to train its translation system. In addition to its own resources, Google has used United Nations documents, which are published in multiple languages.
Also, Google's server farm, generally acknowledged as one of the world's largest, provides the computing power necessary to effectively train and use its statistical system.
Because of these resources, Google says it can develop a more sophisticated analysis approach. For example, Och explained, while most systems translate based on trigrams (sequences of three words), Google can also work with larger word groupings. The ability to collect statistics for larger groupings lets the company's application recognize and predict word patterns more accurately than other systems, according to Google.
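To illustrate why larger word groupings can sharpen predictions, here is a minimal sketch of N-gram counting over a toy corpus. It is not Google's system; the corpus and the function names (`ngram_counts`, `next_word_probs`) are assumptions made up for this example. It compares what follows a two-word context (a trigram view) with what follows a six-word context.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count every contiguous run of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def next_word_probs(tokens, context):
    """Relative frequency of each word seen right after `context`."""
    context = tuple(context)
    counts = ngram_counts(tokens, len(context) + 1)
    followers = {gram[-1]: c for gram, c in counts.items() if gram[:-1] == context}
    total = sum(followers.values())
    return {word: c / total for word, c in followers.items()}

# Toy corpus, invented for illustration only.
corpus = (
    "the bank of the river flooded . "
    "the bank of the city closed . "
    "she sat on the bank of the river ."
).split()

# Trigram view: two words of context leave the next word ambiguous.
short_context = next_word_probs(corpus, ["of", "the"])

# Longer grouping: six words of context narrow the prediction.
long_context = next_word_probs(corpus, ["sat", "on", "the", "bank", "of", "the"])

print(short_context)
print(long_context)
```

The trade-off is data sparsity: a long word sequence appears far less often than a short one, so reliable statistics for larger groupings require a much larger training corpus, which is where a large document collection and server farm would matter.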
National Research Council of Canada. The NRC's Portage system develops statistically likely translations by working with matching sentences in various language pairs, said Roland Kuhn, research officer in the Interactive Language Technologies Group of the council's Institute for Information Technology.
Statistical translation systems typically work from sentence to sentence, handling the N-grams within each sentence. Portage uses an algorithm that looks at material outside a particular sentence being processed to clear up problematic translations within the sentence, particularly for words that can have multiple meanings, Kuhn explained.
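The idea of looking outside the current sentence can be sketched as follows. This is a toy illustration, not Portage's algorithm: the cue-word table `BANK_SENSES` and the function `pick_translation` are hypothetical, and a real statistical system would learn such associations from training data rather than from a hand-written list.

```python
import re

def words(text):
    """Lowercased set of alphabetic tokens in a sentence."""
    return set(re.findall(r"[a-z]+", text.lower()))

def pick_translation(candidates, surrounding_sentences):
    """Pick the candidate whose cue words overlap most with nearby sentences.

    `candidates` maps each possible translation to a set of cue words
    (a hypothetical, hand-written table used only for this sketch).
    """
    context = set()
    for sentence in surrounding_sentences:
        context |= words(sentence)
    # max() returns the first candidate on ties, so the first entry is the default.
    return max(candidates, key=lambda t: len(candidates[t] & context))

# Two French translations of the ambiguous English word "bank".
BANK_SENSES = {
    "banque": {"money", "loan", "account"},
    "rive": {"river", "water", "shore"},
}

# Sentence being translated: "They walked along the bank."
before = ["The river had risen overnight."]
after = ["The water was still muddy."]

choice = pick_translation(BANK_SENSES, before + after)
print(choice)
```

The sentence being translated gives no hint about which sense of "bank" is meant; only the neighboring sentences do, which is the kind of evidence a strictly sentence-by-sentence system cannot see.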
University of Maryland. Unlike other statistical translation applications, associate professor Resnik said, the university's Hiero system models hierarchical relationships in a language—such as the parsing of sentences into phrases, parts of speech, and even words—rather than simply stringing together phrases sequentially. This lets Hiero capture linguistically rich aspects of syntactic behavior, he explained.