Recognition of Complex Settings by Aggregating Atomic Scenes
September/October 2008 (vol. 23 no. 5)
pp. 58-65
Waltenegus Dargie, Technical University of Dresden
Tobias Tersch, Sidon Software and Engineering Service-Providing Company
Researchers have employed audio features to capture complex human settings. Most approaches model a complex setting as a monolithic scene; that is, they consider the stochastic properties of the audio signal representing a setting as a whole, not as an aggregation of distinct scenes. So, when some aspects of the training data are missing from or weakly represented in the test signal, recognition schemes trained on the setting often draw erroneous conclusions. Moreover, these approaches make it difficult to declaratively define new settings by combining scenes. The authors propose a conceptual architecture that recognizes complex settings by combining atomic scenes. The architecture and its associated modeling approach help achieve human-like reasoning and improve recognition accuracy. The authors demonstrate the approach by modeling seven everyday settings with 27 atomic scenes.
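
The idea of declaring a setting as a combination of atomic scenes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the scene and setting names, the weights, and the weighted-average aggregation are hypothetical and are not taken from the article, which does not list its 27 scenes or specify its aggregation rule in this abstract.

```python
from dataclasses import dataclass


@dataclass
class Setting:
    """A complex setting declared as weighted atomic scenes (hypothetical model)."""
    name: str
    scenes: dict  # atomic scene name -> weight; names and weights are illustrative


def score(setting, scene_probs):
    """Weighted average of per-scene probabilities.

    A scene absent from scene_probs contributes 0, so a weakly represented
    scene lowers the score but does not veto the setting.
    """
    total_weight = sum(setting.scenes.values())
    weighted = sum(w * scene_probs.get(s, 0.0) for s, w in setting.scenes.items())
    return weighted / total_weight


def recognize(settings, scene_probs):
    """Return the setting with the highest aggregate score."""
    return max(settings, key=lambda st: score(st, scene_probs))


# Hypothetical declarative definitions; a new setting is just a new entry.
SETTINGS = [
    Setting("kitchen", {"running_water": 1.0, "clattering_dishes": 1.0, "speech": 0.5}),
    Setting("street", {"traffic": 1.0, "footsteps": 0.5, "speech": 0.5}),
]

# Assumed output of per-scene audio classifiers for one test signal.
probs = {"running_water": 0.8, "speech": 0.6, "traffic": 0.1}
best = recognize(SETTINGS, probs)
```

In this sketch the "clattering_dishes" scene is missing from the test signal, yet the remaining scenes still support the "kitchen" setting, which is the kind of robustness the abstract attributes to aggregating scenes rather than modeling a setting monolithically.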

Index Terms:
audio-signal processing, context awareness, context recognition, context reasoning, smart devices, smart systems.
Waltenegus Dargie, Tobias Tersch, "Recognition of Complex Settings by Aggregating Atomic Scenes," IEEE Intelligent Systems, vol. 23, no. 5, pp. 58-65, Sept.-Oct. 2008, doi:10.1109/MIS.2008.90