October 2009 (Vol. 42, No. 10) pp. 20-23
0018-9162/09/$26.00 © 2009 IEEE

Published by the IEEE Computer Society
News Briefs
System Takes New Approach to Speech Search
A new system promises to make searching audio clips for specific phrases or names easier than it is with traditional speech-recognition software.
The AudioMining software that the Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS) developed would let, for example, a TV reporter quickly and accurately search a video clip for a specific part in which an interview subject said something important.
AudioMining software would let users search for terms within audio files without depending on a database of words that must be updated regularly, as is the case with standard speech-recognition systems.




The Fraunhofer Institute for Intelligent Analysis and Information Systems has developed a system for searching audio clips for specific phrases or words. When a user submits a search query, the system breaks down the query's words into syllables. It then segments the audio data based on different speakers, words, and syllables. It uses a syllable dictionary to find the syllables in the speech that match those in the query. Traditional speech recognition uses a word dictionary, which is more difficult to work with.



Typically, automatic speech recognition (ASR) systems use a word dictionary and a statistical model for the typical usage of word sequences to produce a transcription, which then can be searched, said IAIS scientist Daniel Schneider.
The word dictionary contains only a limited number of words and names, so there are many that the ASR system won't recognize. Specialists must regularly update the systems' name, word, and phrase databases, a time-consuming and expensive process.
Schneider said his research team built AudioMining with a dictionary based on syllables, rather than words. The system breaks down word queries into syllable sequences and searches for matches in the syllable dictionary.
With about 10,000 stored syllables, the system can recognize any word, Schneider explained.
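The syllable-matching idea can be illustrated with a minimal Python sketch. It is not the Fraunhofer implementation: the tiny `SYLLABLES` table, the function names, and the timestamped `stream` data are all hypothetical, and the word-to-syllable table merely stands in for a real syllabification step. The sketch only shows how a query, once reduced to a syllable sequence, can be located in a stream of recognized syllables.

```python
from typing import Dict, List, Tuple

# Hypothetical toy table standing in for a real syllabification step.
# The article says the actual system stores about 10,000 syllables.
SYLLABLES: Dict[str, List[str]] = {
    "election": ["e", "lec", "tion"],
    "campaign": ["cam", "paign"],
}


def query_to_syllables(query: str) -> List[str]:
    """Break a query into one flat syllable sequence using the toy table."""
    sequence: List[str] = []
    for word in query.lower().split():
        # Raises KeyError for words missing from the toy table.
        sequence.extend(SYLLABLES[word])
    return sequence


def find_matches(stream: List[Tuple[str, float]], target: List[str]) -> List[float]:
    """Return the start time of every exact occurrence of target in stream."""
    if not target:
        return []
    recognized = [syllable for syllable, _ in stream]
    hits: List[float] = []
    for i in range(len(recognized) - len(target) + 1):
        if recognized[i:i + len(target)] == target:
            hits.append(stream[i][1])
    return hits


# Made-up recognizer output: (syllable, start time in seconds).
stream = [
    ("the", 0.0), ("e", 0.4), ("lec", 0.5), ("tion", 0.8),
    ("cam", 1.2), ("paign", 1.5), ("e", 2.0), ("lec", 2.1), ("tion", 2.4),
]

election_hits = find_matches(stream, query_to_syllables("election"))
phrase_hits = find_matches(stream, query_to_syllables("election campaign"))
```

Here `election_hits` holds the two positions where "election" was spoken, and `phrase_hits` holds the single position of the full phrase. A production system would match probabilistically rather than require exact equality.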
The approach analyzes an audio stream and uses efficient probabilistic algorithms, as well as "knowledge" about parts of speech, to segment the file. This lets the system identify different speakers and break down words into syllables, explained Schneider.
AudioMining uses n-gram language modeling, a probabilistic model for predicting language sequences, to more accurately identify syllables.
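As a rough illustration of n-gram modeling applied to syllables, the following Python sketch trains a bigram (n = 2) model with add-one smoothing. The article does not specify the model order, the smoothing method, or the training data, so all three are assumptions here, as are the function names.

```python
from collections import Counter
from typing import List, Set, Tuple


def train_bigrams(sequences: List[List[str]]) -> Tuple[Counter, Counter, Set[str]]:
    """Count syllable bigrams and their left contexts over training sequences."""
    contexts: Counter = Counter()
    bigrams: Counter = Counter()
    for seq in sequences:
        padded = ["<s>"] + seq  # "<s>" marks the start of a sequence
        for prev, cur in zip(padded, padded[1:]):
            contexts[prev] += 1
            bigrams[(prev, cur)] += 1
    vocab = {syllable for seq in sequences for syllable in seq}
    return contexts, bigrams, vocab


def bigram_prob(prev: str, cur: str, contexts: Counter,
                bigrams: Counter, vocab_size: int) -> float:
    """P(cur | prev) with add-one smoothing, so unseen pairs keep nonzero mass."""
    return (bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size)


# Made-up training data: syllable sequences for three words.
training = [["e", "lec", "tion"], ["e", "lec", "tric"], ["se", "lec", "tion"]]
contexts, bigrams, vocab = train_bigrams(training)

p_tion = bigram_prob("lec", "tion", contexts, bigrams, len(vocab))
p_tric = bigram_prob("lec", "tric", contexts, bigrams, len(vocab))
```

When the acoustic evidence is ambiguous between two candidate syllables, a recognizer can prefer the one the model rates more probable after the preceding syllable. In this toy data, "tion" follows "lec" more often than "tric" does, so `p_tion` exceeds `p_tric`.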
Users can add traditional word-recognition technology to improve accuracy.
AudioMining could be used to either analyze already-recorded speech—such as recordings of lectures and conferences—or monitor ongoing audio, such as a TV broadcast.
In testing, Schneider said, the system took only milliseconds to find 85 out of 100 searched-for utterances in audio files, with 99 percent of results being correct.
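One way to read these figures is as a recall of 85 percent (utterances found) and a precision of 99 percent (results that were correct). The article does not use those terms, so that mapping is an assumption. Under it, the two numbers can be combined into a single F1 score, as this short Python sketch shows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recall = 85 / 100   # assumed reading: 85 of 100 searched-for utterances found
precision = 0.99    # assumed reading: 99 percent of returned results correct

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")
```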
The Fraunhofer system doesn't perform speech recognition, as it doesn't identify what words mean, explained University of Calgary professor emeritus David Hill, a speech-recognition expert.
The system thus could be useful for searching for phrases in a library of spoken material but not for transcribing streaming audio, he said.
According to Schneider, his team is currently commercializing its technology. For a project the researchers are developing with Germany's ARD broadcast network, they built a system—with a public interface—for searching and comparing spoken quotes from German politicians during the 2009 election campaign.
Program Uses Mobile Technology to Help with Crises
A nongovernmental organization has released open source software tools for collaboration and communication that let government agencies or humanitarian organizations quickly report, share, aggregate, and analyze important data via a cell phone.
This is part of the InSTEDD (Innovative Support to Emergencies, Diseases, and Disasters) NGO's project to use mobile technology to improve governments' and humanitarian agencies' ability to respond to health problems, disasters, and other regional crises.
InSTEDD's tools provide information on disease-outbreak epidemiology and natural disasters, and help humanitarian organizations collaborate to improve the services they offer.
Many health workers are isolated by distance or terrain, potentially making communications time-consuming and difficult. Thus, the ability to use fast, simple communications technology that works with basic cell phones is valuable, according to InSTEDD president and CEO Eric Rasmussen.
The GeoChat short-message-service tool lets NGO field workers or first responders to disasters use cell phones to send information via an SMS message to the groups for which they work, government agencies, or even international bodies such as the United Nations' World Health Organization.
For example, in GeoChat's first testbed, in Cambodia, eight district health officers use the tool to share observations, diagnoses, and disease-outbreak epidemiology reports with a provincial hospital, which can use it to report a rapid-response team's estimated arrival times in areas with problems.
Field workers include their locations with each message, which pops up as a conversation thread on an interactive map. Messages go either directly to an NGO or to the InSTEDD website, where users with the correct passwords can pick them up.
GeoChat works via SMS because it's convenient and because, in many areas, Internet communications aren't available, according to Rasmussen.
SMS requires payment by users. InSTEDD works with the PayPal online payment system and makes it convenient for organizations by, for example, letting the groups prepay for messages so that field workers don't have to worry about such matters.
Combined with InSTEDD's Mesh4X synchronous-communication tool, GeoChat lets users transmit data between established applications (such as Access, Excel, Google Earth, MySQL, and the Oracle Database) and between devices (such as laptops, smart phones, PDAs, and servers) reliably, selectively, and securely in a distributed data mesh.
InSTEDD's Riff analytics tool can help governmental agencies and humanitarian organizations examine incoming data and make decisions, Rasmussen said.
InSTEDD is helping humanitarian organizations and government agencies deploy its free tools worldwide. Beta versions are already deployed in countries such as Bangladesh, Denmark, Ghana, Tanzania, and the US.
Rasmussen said the technologies could also be used for specialized commercial software applications. InSTEDD is currently negotiating agreements with several companies.
Google's philanthropic arm, Google.org, contributed $5 million to InSTEDD through 2008 and has just started funding a three-year, $6.67 million grant.
There is a huge need to improve collaboration and communications for humanitarian relief organizations responding to natural disasters and health crises, explained Google.org director Frank Rijsberman. He said InSTEDD is perhaps the only nonprofit organization able to accomplish this by combining the necessary software-engineering skills with a deep understanding of humanitarian relief organizations.
Researcher Develops System for Distributed Debugging
Austrian researchers have developed an approach for the challenging task of debugging distributed systems.
Debugging is a critical process in the creation of any computer system. This has become increasingly important as systems have become more complex. For example, debugging is crucial for real-time, distributed systems, in which disparate machines must operate quickly and correctly in unison.
Debugging is significantly different for such systems than for conventional ones. Finding bugs or reproducing problematic scenarios in a distributed system frequently requires complex coordination, said professor Roland Höller with the University of Applied Sciences Technikum Wien.
Typical approaches to distributed debugging have linked systems' disparate elements with cables to an embedded control unit that connects via Ethernet or USB to one or more PCs running debugging software.
These approaches are often impractical because the elements of many distributed systems that must be debugged are embedded inside equipment such as industrial machinery or an automobile and are not easily accessible.
Höller's approach, which works with conventional debugging software, puts the debugging circuitry on a system element's chip rather than on an external device. The user issues the command to begin the debugging via a wireless or wired network, enabling the process to work even with difficult-to-access system elements.
According to Höller, his approach addresses distributed debugging by synchronizing the system clocks within each of the elements.
The synchronization enables the technique to coordinate even complex debugging activities across the distributed system by having each step run at the same time. This lets users know how the system being debugged will operate when it is actually running, with all elements functioning simultaneously.
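The role of clock synchronization can be sketched in Python. The article does not say which synchronization protocol Höller's system uses, so this example substitutes the standard two-way timestamp exchange (the offset estimate used by protocols such as NTP). All names and timestamps are hypothetical. The idea shown is that once each element knows its offset from a reference clock, a single global "halt" time can be translated into each element's local time so that all elements stop together.

```python
from dataclasses import dataclass


@dataclass
class Exchange:
    """Timestamps from one request/reply round trip, in seconds."""
    t1: float  # request sent, read from the node's clock
    t2: float  # request received, read from the reference clock
    t3: float  # reply sent, read from the reference clock
    t4: float  # reply received, read from the node's clock


def estimate_offset(x: Exchange) -> float:
    """Reference clock minus node clock, assuming symmetric network delay."""
    return ((x.t2 - x.t1) + (x.t3 - x.t4)) / 2


def local_halt_time(global_halt: float, offset: float) -> float:
    """Translate a halt time on the reference clock into the node's local time."""
    return global_halt - offset


# Made-up exchange: the node's clock runs 5 ms behind the reference,
# and each network hop takes 2 ms.
exchange = Exchange(t1=10.000, t2=10.007, t3=10.008, t4=10.005)
offset = estimate_offset(exchange)

# Every node halts when its own clock reads its translated value.
halt_at = local_halt_time(20.000, offset)
```

With these numbers the estimated offset is about 5 ms, so the node halts when its local clock reads roughly 19.995 seconds, which corresponds to 20.000 seconds on the reference clock.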
Höller said the challenge in developing his system was tightly integrating the clock synchronization with the debugging system.
The research team is running a prototype of its approach using both the Eclipse Foundation's Eclipse debugging software and the GNU Debugger.
His team plans to begin working with several organizations to make its approach a debugging standard, including the Nexus 5001 Forum, which develops embedded-processor debug-interface standards, and the Open Core Protocol International Partnership.
Höller said his team has patented and wants to commercialize its technique but will have to convince chip makers to provide valuable space on their processors for the debugging circuitry.
News Briefs written by Linda Dailey Paulson, a freelance technology writer based in Portland, Oregon. Contact her at ldpaulson@yahoo.com.