A test to determine the multivariate normality of a data set
September 1988 (vol. 10 no. 5)
pp. 757-761
A test is described for multivariate normality that is useful in pattern recognition. The test is based on the Friedman-Rafsky (1979) multivariate extension of the Wald-Wolfowitz runs test. The test data are combined with a multivariate swarm of points following the normal distribution generated with mean vector and covariance matrix estimated from the test data. The minimal spanning tree of this resultant ensemble of points is computed and the count of the interpopulation edges in the minimal spanning tree is used as a test statistic. The simulation studied both the null case of the test and one simple deviation from normality. Two conclusions are made from this study. First, the test can be conservatively applied by using the asymptotic normality of the test statistic, even for small sample sizes. Second, the power of the test appears reasonable, especially in high dimensions. Monte Carlo experiments were performed to determine if the test is reliable in high dimensions with moderate sample size. The method is compared to other such tests available in the literature.
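The construction of the test statistic described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it assumes Euclidean distances, a reference swarm of the same size as the test data, and Prim's algorithm for the minimal spanning tree. The function names (mst_edges, interpopulation_edge_count, null_mean_edges) are hypothetical. The last function gives the mean edge count when labels are assigned to the tree nodes at random, which follows from each of the N-1 tree edges being interpopulation with probability 2mn/(N(N-1)); the variance needed for the normal approximation is not reproduced here.

```python
import numpy as np


def mst_edges(points):
    """Edges of the Euclidean minimal spanning tree, built with Prim's algorithm."""
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best_dist = dist[0].copy()          # cheapest link from each node to the tree
    best_from = np.zeros(n, dtype=int)  # tree node that provides that link
    edges = []
    for _ in range(n - 1):
        # Pick the node outside the tree that is closest to the tree.
        candidates = np.where(in_tree, np.inf, best_dist)
        j = int(np.argmin(candidates))
        edges.append((int(best_from[j]), j))
        in_tree[j] = True
        # The new node may offer cheaper links to the remaining nodes.
        closer = dist[j] < best_dist
        best_dist = np.where(closer, dist[j], best_dist)
        best_from = np.where(closer, j, best_from)
    return edges


def interpopulation_edge_count(data, rng):
    """Count MST edges that join a test point to a simulated normal point."""
    n = data.shape[0]
    mean = data.mean(axis=0)
    cov = np.atleast_2d(np.cov(data, rowvar=False))
    # Assumption: the reference swarm has as many points as the test data.
    reference = rng.multivariate_normal(mean, cov, size=n)
    pooled = np.vstack([data, reference])
    is_reference = np.arange(2 * n) >= n
    return int(sum(is_reference[a] != is_reference[b]
                   for a, b in mst_edges(pooled)))


def null_mean_edges(m, n):
    """Mean interpopulation edge count under random labelling of the nodes."""
    return 2.0 * m * n / (m + n)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.standard_normal((60, 4))  # synthetic data, illustration only
    print(interpopulation_edge_count(sample, rng), null_mean_edges(60, 60))
```

A count well below the null mean indicates that test points and simulated normal points rarely neighbor each other in the tree, which is evidence against normality.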

[1] D. R. Cox and N. J. H. Small, "Testing multivariate normality," Biometrika, vol. 65, pp. 263-272, 1978.
[2] G. R. Cross, N. C. Wyse, and A. K. Jain, "Multivariate normality in pattern recognition and clustering," in Proc. 6th Int. Conf. Pattern Recognition, 1982, pp. 862-864.
[3] R. C. Dubes and A. K. Jain, "Clustering methodologies in exploratory data analysis," Advances in Computers, M. Yovits, Ed. New York: Academic, 1980, pp. 113-228.
[4] J. H. Friedman and L. C. Rafsky, "Multivariate generalization of the Wald-Wolfowitz and Smirnov two-sample tests," Ann. Statist., vol. 7, pp. 697-717, 1979.
[5] K. Fukunaga and T. E. Flick, "A test of the Gaussian-ness of a data set using clustering," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-8, no. 2, pp. 240-247, 1986.
[6] R. Gnanadesikan, Methods for Statistical Data Analysis of Multivariate Observations. New York: Wiley, 1977.
[7] J. A. Koziol, "On assessing multivariate normality," J. Roy. Statist. Soc. B, vol. 45, no. 3, pp. 358-361, 1983.
[8] E. Lesaffre, "Normality tests and transformations," Pattern Recognition Lett., vol. 1, pp. 187-199, 1983.
[9] K. V. Mardia, "Tests of univariate and multivariate normality," in Handbook of Statistics, P. R. Krishnaiah, Ed. Amsterdam, The Netherlands: North-Holland, 1980, pp. 279-320.
[10] R. C. Prim, "Shortest connection networks and some generalizations," Bell Syst. Tech. J., vol. 36, pp. 1389-1401, 1957.
[11] S. P. Smith, "Structure of multidimensional patterns," Ph.D. dissertation, Dep. Comput. Sci., Michigan State Univ., 1982.
[12] S. P. Smith and A. K. Jain, "An experiment on using the Friedman-Rafsky test to determine the multivariate normality of a data set," in Proc. Conf. Comput. Vision and Pattern Recognition, 1985, pp. 423-425.
[13] S. P. Smith and A. K. Jain, "Testing for uniformity in multidimensional data," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-6, pp. 73-81, 1984.

Index Terms:
trees (mathematics), interpolation, Monte Carlo methods, pattern recognition, test statistic, multivariate normality, data set, Wald-Wolfowitz, spanning tree, interpopulation, Testing, Performance evaluation, Joining processes, Statistics, Books, Research and development, Automation, Laboratories, Computer science
"A test to determine the multivariate normality of a data set," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 10, no. 5, pp. 757-761, Sept. 1988, doi:10.1109/34.6789