Fast Recognition of Musical Genres Using RBF Networks
April 2005 (vol. 17 no. 4)
pp. 580-584
This paper explores the automatic classification of audio tracks into musical genres. Our goal is to achieve human-level accuracy with fast training and classification. This goal is achieved with radial basis function (RBF) networks by using a combination of unsupervised and supervised initialization methods. These initialization methods yield classifiers that are as accurate as RBF networks trained with gradient descent (which is hundreds of times slower). In addition, feature subset selection further reduces training and classification time while preserving classification accuracy. Combined, our methods succeed in creating an RBF network that matches the musical classification accuracy of humans. The general algorithmic contribution of this paper is to show experimentally that RBF networks initialized with a combination of methods can yield good classification performance without relying on gradient descent. The simplicity and computational efficiency of our initialization methods produce classifiers that are fast to train as well as fast to apply to novel data. We also present an improved method for initializing the k-means clustering algorithm, which is useful for both unsupervised and supervised initialization methods.
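
To make the idea of training an RBF network without gradient descent concrete, the following is a minimal sketch and not the authors' implementation. It assumes Gaussian basis functions with a single shared width, places the basis function centers with plain k-means started from random data points (not the improved initialization the abstract mentions), and fits the output weights with one linear least-squares solve. The function names, the `width` parameter, and the toy two-cluster data are all hypothetical choices made for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; centers start at k distinct random data points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its previous center
                centers[j] = members.mean(axis=0)
    return centers

def rbf_features(X, centers, width):
    """Gaussian activations plus a constant bias column (assumed form)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([phi, np.ones((len(X), 1))])

def fit_rbf(X, y, n_classes, k, width, seed=0):
    """Unsupervised step: k-means centers. Supervised step: least squares."""
    centers = kmeans(X, k, seed=seed)
    Phi = rbf_features(X, centers, width)
    targets = np.eye(n_classes)[y]  # one-hot class targets
    W, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    return centers, W

def predict_rbf(X, centers, W, width):
    """Predict the class whose linear output unit is largest."""
    return (rbf_features(X, centers, width) @ W).argmax(axis=1)

# Toy data standing in for audio feature vectors: two separated 2-D clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)), rng.normal(5.0, 0.5, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

centers, W = fit_rbf(X, y, n_classes=2, k=4, width=1.0)
accuracy = (predict_rbf(X, centers, W, width=1.0) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

Because the only supervised step is a single least-squares solve, training cost in this sketch is dominated by the k-means iterations, which illustrates why initialization-only training can be much cheaper than iterative gradient descent.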


Index Terms:
Radial basis function network, musical genre, initialization method, feature subset selection.
Douglas Turnbull, Charles Elkan, "Fast Recognition of Musical Genres Using RBF Networks," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 580-584, April 2005, doi:10.1109/TKDE.2005.62