Seventh IEEE International Symposium on Multimedia (ISM'05)
A Pitch-Based Rapid Speech Segmentation for Speaker Indexing
Irvine, California
December 12-December 14
ISBN: 0-7695-2489-3
Min Yang, Zhejiang University, China
Yingchun Yang, Zhejiang University, China
Zhaohui Wu, Zhejiang University, China
Segmentation of continuous audio is an important processing step in many applications. In speaker indexing, the reliability of the speaker model depends heavily on segmentation. Commonly used methods are based on the Bayesian Information Criterion (BIC), which performs poorly on short utterances. In this paper, we present a pitch-based speech segmentation method that detects frequent speaker changes accurately and rapidly. First, utterance segments are detected by pitch. Then pitch distances are computed and compared with a self-adaptive threshold. Finally, speaker changes are decided among the utterance segments. We applied our method and three comparative methods to the HUB4-NE broadcast data, and ran speaker indexing experiments on the output of each algorithm. We also propose two indicators that complement the false alarm rate and missing rate in the evaluation of segmentation. The experimental results show that our algorithm is faster and more accurate, detecting most short-duration speaker changes. The speaker indexing equal error rate of our method is 10.43%, which is lower than the 12.94%, 25.84% and 15.91% of the other methods.
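The three steps named in the abstract (utterance detection by pitch, pitch distance, adaptive threshold) can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not specify the distance measure or the threshold rule, so the absolute difference of per-segment mean pitch, the threshold of `alpha` times the mean distance, and the names `voiced_segments`, `speaker_changes`, `alpha` and `min_frames` are all assumptions. Input is taken to be a per-frame pitch track in Hz, with 0 marking unvoiced frames.

```python
from statistics import mean


def voiced_segments(pitch, min_frames=3):
    """Group consecutive voiced frames (pitch > 0) into (start, end) spans.

    `end` is exclusive. Runs shorter than `min_frames` are dropped.
    The minimum-length rule is an assumption, not taken from the paper.
    """
    segments, start = [], None
    for i, f0 in enumerate(pitch):
        if f0 > 0 and start is None:
            start = i                      # a voiced run begins
        elif f0 <= 0 and start is not None:
            if i - start >= min_frames:    # a voiced run ends
                segments.append((start, i))
            start = None
    if start is not None and len(pitch) - start >= min_frames:
        segments.append((start, len(pitch)))
    return segments


def speaker_changes(pitch, alpha=1.5, min_frames=3):
    """Return frame indices where a speaker change is hypothesized.

    Distance between adjacent segments: absolute difference of mean pitch
    (assumed). Threshold: alpha times the mean of all distances in this
    recording, so it adapts to the data (assumed).
    """
    segs = voiced_segments(pitch, min_frames)
    means = [mean(pitch[s:e]) for s, e in segs]
    dists = [abs(b - a) for a, b in zip(means, means[1:])]
    if not dists:
        return []
    threshold = alpha * mean(dists)
    # A change is placed at the start of the segment following a large jump.
    return [segs[i + 1][0] for i, d in enumerate(dists) if d > threshold]
```

Called on a toy track in which two segments near 120 Hz are followed by two segments near 210 Hz, `speaker_changes` reports a single change at the start of the third segment. A real system would need a pitch tracker in front of this and a more robust distance than the mean alone.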
Citation:
Min Yang, Yingchun Yang, Zhaohui Wu, "A Pitch-Based Rapid Speech Segmentation for Speaker Indexing," Seventh IEEE International Symposium on Multimedia (ISM'05), pp. 571-576, 2005