The audio-based approach to video indexing described here detects music and speech independently, even when they occur simultaneously. The indexed video segments, when presented on the Video Sound Browser, give users random access to the video. The Video in Time system provides several levels of video condensation, based on a video structuring scheme that can link video segments to the director's intentions.
Akihito Akutsu, Kenichi Minami, Hiroshi Hamada, Yoshinobu Tonomura, "Video Handling with Music and Speech Detection", IEEE MultiMedia, vol. 5, no. , pp. 17-25, July-September 1998, doi:10.1109/93.713301
