Computer and Information Technology, International Conference on (2004)
Wuhan, China
Sept. 14, 2004 to Sept. 16, 2004
ISBN: 0-7695-2216-5
pp: 123-127
Yoshiyuki Toyoda , University of Aizu
Jie Huang , University of Aizu
Shuxue Ding , University of Aizu
Yong Liu , University of Aizu
ABSTRACT
Environmental sound recognition is an important function of robotic audition. Although HMM- or TDNN-based methods can also be used for environmental sound recognition, unlike speech recognition, it is not possible to create a complete database covering all kinds of environmental sounds. Environmental sound recognition depends more on the task of the robot system. From this point of view, methods for environmental sound recognition must also be task-dependent and be evaluated on accuracy, speed, and simplicity. In this research, we applied a multilayered perceptron neural network to environmental sound recognition. The input data is a one-dimensional combination of the instantaneous spectrum at the power peak and the power pattern in the time domain. The spectrum of environmental sounds does not change as remarkably as that of speech or voice, so the combination of power and frequency patterns retains the major features of environmental sounds with drastically reduced data. Two experiments were conducted, using an original database and a database created by the RWCP. The recognition rate for 45 environmental sound data sets was about 92%. The new method is fast and simple compared to HMM-based methods, and suitable for the on-board system of a robot for home use, e.g. a security-monitoring robot or a home-helper robot.
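The abstract does not give the exact feature computation, but the described input (the instantaneous spectrum at the frame of peak power, concatenated with a fixed-length time-domain power pattern) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the function name, frame sizes, windowing, and normalization are all assumptions:

```python
import numpy as np

def extract_features(signal, frame_len=512, hop=256, n_power=32):
    """Hypothetical sketch of the feature vector described in the abstract:
    spectrum at the power-peak frame + downsampled power pattern."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    power = (frames ** 2).mean(axis=1)        # per-frame power (time domain)
    peak = int(power.argmax())                # frame with maximum power
    # instantaneous spectrum at the power peak (Hann window assumed)
    spectrum = np.abs(np.fft.rfft(frames[peak] * np.hanning(frame_len)))
    # downsample the power envelope to a fixed-length pattern
    idx = np.linspace(0, n_frames - 1, n_power).astype(int)
    power_pattern = power[idx]
    # normalize each part and join into one 1-D vector for the MLP input
    return np.concatenate([spectrum / (spectrum.max() + 1e-12),
                           power_pattern / (power_pattern.max() + 1e-12)])
```

The resulting fixed-length vector (here 257 spectral bins + 32 power samples) would be the input layer of the multilayered perceptron; this keeps the input far smaller than a full spectrogram, matching the abstract's point about drastically reduced data.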
INDEX TERMS
Environmental sound recognition; Combination of spectrum and power pattern; Robotic audition
CITATION

Y. Liu, J. Huang, S. Ding and Y. Toyoda, "Environmental Sound Recognition by Multilayered Neural Networks," Computer and Information Technology, International Conference on (CIT), Wuhan, China, 2004, pp. 123-127.
doi:10.1109/CIT.2004.1357184