On Fusers that Perform Better than Best Sensor
August 2001 (vol. 23 no. 8)
pp. 904-909

Abstract—In a multiple sensor system, sensor $S_i$, $i = 1, 2, \ldots, N$, outputs $Y^{(i)} \in [0,1]$ according to an unknown probability distribution $P_{Y^{(i)} | X}$, in response to input $X \in [0,1]$. We choose a fuser, which combines the sensor outputs, from a function class ${\cal F} = \{ f : [0,1]^N \rightarrow [0,1] \}$ by minimizing empirical error based on an iid sample. If ${\cal F}$ satisfies the isolation property, we show that the fuser performs at least as well as the best sensor in a probably approximately correct sense. Several well-known fusers, such as linear combinations, special potential functions, and certain feedforward networks, satisfy the isolation property.
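
The claim can be illustrated with a small numerical sketch (not taken from the paper; the sensor models and noise levels below are hypothetical). The class of affine combinations of sensor outputs satisfies the isolation property, since the weight vector $e_i$ with zero bias reproduces sensor $i$ exactly; hence the member of the class with the smallest empirical error can do no worse on the training sample than the best individual sensor, and with enough iid samples approximately the same holds for the expected error.

# Minimal sketch (assumed setup, not from the paper): N sensors report the
# input X corrupted by different noise levels; an affine fuser is fit by
# empirical squared-error minimization (least squares).
import numpy as np

rng = np.random.default_rng(0)
N, n = 3, 2000                          # number of sensors, sample size
X = rng.uniform(size=n)                 # inputs X in [0, 1]

# Hypothetical sensor models: Y^(i) = clip(X + Gaussian noise with level noise_i).
noise = np.array([0.05, 0.15, 0.30])
Y = np.clip(X[:, None] + noise * rng.normal(size=(n, N)), 0.0, 1.0)

# Fuser class: affine combinations f(y) = w^T y + b, fit by least squares.
# The class contains each projection f(y) = y_i, i.e., the isolation property holds.
A = np.hstack([Y, np.ones((n, 1))])
coef, *_ = np.linalg.lstsq(A, X, rcond=None)
fused = np.clip(A @ coef, 0.0, 1.0)     # clipping to [0, 1] cannot increase the error

# Empirical errors: the fitted fuser is no worse than the best sensor on the sample,
# because each sensor is itself a member of the fuser class.
err_fuser = np.mean((fused - X) ** 2)
err_sensors = np.mean((Y - X[:, None]) ** 2, axis=0)
print("per-sensor empirical error:", err_sensors)
print("fuser empirical error:     ", err_fuser)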

[1] Data Fusion in Robotics and Machine Intelligence, M.A. Abidi and R.C. Gonzalez, eds. New York: Academic Press, 1992.
[2] M.A. Aizerman, E.M. Braverman, and L.I. Rozonoer, “Extrapolative Problems in Automatic Control and Method of Potential Functions,” Am. Math. Soc. Translations, vol. 87, pp. 281-303, 1970.
[3] K.M. Ali and M.J. Pazzani, “Error Reduction through Learning Multiple Descriptions,” Machine Learning, vol. 24, no. 3, pp. 173-202, 1996.
[4] P. Billingsley, Probability and Measure, second ed. New York: John Wiley & Sons, 1986.
[5] L. Breiman, “Bagging Predictors,” Machine Learning, vol. 24, pp. 123-140, 1996.
[6] L. Breiman, “Stacked Regressions,” Machine Learning, vol. 24, pp. 49-64, 1996.
[7] L. Breiman, “Arcing Classifiers,” Annals of Statistics, vol. 26, no. 3, pp. 801-849, 1998.
[8] N. Cesa-Bianchi, Y. Freund, D.P. Helmbold, D. Haussler, R.E. Schapire, and M.K. Warmuth, “How to Use Expert Advice,” J. ACM, pp. 382-391, 1995.
[9] L. Devroye, L. Gyorfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition. New York: Springer-Verlag, 1996.
[10] E.F. Gad, A.F. Atiya, S. Shaheen, and A. El-Dessouki, “A New Algorithm for Learning in Piecewise-Linear Neural Networks,” Neural Networks, vol. 13, pp. 485-505, 2000.
[11] S. Hashem, “Optimal Linear Combination of Neural Networks,” Neural Networks, vol. 19, pp. 599-614, 1997.
[13] T.K. Ho, J.J. Hull, and S.N. Srihari, “Decision Combination in Multiple Classifiers Systems,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 1, pp. 66-75, Jan. 1994.
[14] M.I. Jordan and R.A. Jacobs, “Hierarchical Mixtures of Experts and the EM Algorithm,” Neural Computation, vol. 6, pp. 181-214, 1994.
[15] J. Kittler, M. Hatef, R. Duin, and J. Matas, “On Combining Classifiers,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 3, pp. 226-239, Mar. 1998.
[16] A. Krzyzak, T. Linder, and G. Lugosi, “Nonparametric Estimation and Classification Using Radial Basis Function Nets and Empirical Risk Minimization,” IEEE Trans. Neural Networks, vol. 7, no. 2, pp. 475-487, 1996.
[17] M. LeBlanc and R. Tibshirani, “Combining Estimates in Regression and Classification,” J. Am. Statistical Assoc., vol. 91, no. 436, pp. 1641-1650, 1996.
[18] G. Lugosi and K. Zeger, “Nonparametric Estimation via Empirical Risk Minimization,” IEEE Trans. Information Theory, vol. 41, no. 3, pp. 677-687, 1995.
[19] R.N. Madan and N.S.V. Rao, “Guest Editorial on Information/Decision Fusion with Engineering Applications,” J. Franklin Inst., vol. 336B, no. 2, pp. 199-204, 1999.
[20] W. Maass, “Agnostic PAC Learning of Functions on Analog Neural Nets,” Neural Computation, vol. 7, pp. 1054-1078, 1995.
[21] C.J. Merz and M.J. Pazzani, “A Principal Component Approach to Combining Regression Estimators,” Machine Learning, vol. 36, pp. 9-32, 1997.
[22] M. Mojirsheibani, “A Consistent Combined Classification Rule,” Statistics and Probability Letters, vol. 36, pp. 43-47, 1997.
[23] M. Perrone and L.N. Cooper, “When Networks Disagree: Ensemble Methods for Hybrid Neural Networks,” Neural Networks for Speech and Image Processing, R.J. Mammone, ed., Chapman & Hall, 1993.
[24] D. Pollard, Convergence of Stochastic Processes. New York: Springer-Verlag, 1984.
[25] N.S.V. Rao, “To Fuse or Not to Fuse: Fuser versus Best Classifier,” Proc. SPIE Conf. Sensor Fusion: Architectures, Algorithms, and Applications II, pp. 25-34, 1998.
[26] N.S.V. Rao, “Multiple Sensor Fusion under Unknown Distributions,” J. Franklin Inst., vol. 336, no. 2, pp. 285-299, 1999.
[27] N.S.V. Rao, “Finite Sample Performance Guarantees of Fusers for Function Estimators,” Information Fusion, vol. 1, no. 1, pp. 35-44, 2000.
[28] N.S.V. Rao, “Multisensor Fusion under Unknown Distributions: Finite Sample Performance Guarantees,” Multisensor Fusion, A.K. Hyder, ed., Kluwer Academic, 2000.
[29] N.S.V. Rao, E.M. Oblow, C.W. Glover, and G.E. Liepins, “N-Learners Problem: Fusion of Concepts,” IEEE Trans. Systems, Man, and Cybernetics, vol. 24, no. 2, pp. 319-327, 1994.
[30] N.S.V. Rao and V. Protopopescu, “Function Estimation by Feedforward Sigmoidal Networks with Bounded Weights,” Neural Processing Letters, vol. 7, pp. 125-131, 1998.
[31] N.S.V. Rao, V. Protopopescu, R.C. Mann, E.M. Oblow, and S.S. Iyengar, “Learning Algorithms for Feedforward Networks Based on Finite Samples,” IEEE Trans. Neural Networks, vol. 7, no. 4, pp. 926-940, 1996.
[32] Theoretical Advances in Neural Computation and Learning, V. Roychowdhury, K. Siu, and A. Orlitsky, eds. Boston: Kluwer Academic, 1994.
[33] C. Schaffer, “A Conservation Law for Generalization Performance,” Proc. 11th Int'l Conf. Machine Learning, pp. 259-265, 1994.
[34] R.E. Schapire, “The Strength of Weak Learnability,” Machine Learning, vol. 5, no. 2, pp. 197-227, 1990.
[35] R.E. Schapire, Y. Freund, P. Bartlett, and W.S. Lee, “Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods,” Proc. 14th Int'l Conf. Machine Learning, 1997.
[36] A.J.C. Sharkey, “On Combining Artificial Neural Nets,” Connection Science, vol. 8, no. 3, pp. 299-314, 1996.
[37] A.J.C. Sharkey, “Modularity, Combining and Artificial Neural Nets,” Connection Science, vol. 9, no. 1, pp. 3-10, 1997.
[38] M. Taniguchi and V. Tresp, “Averaging Regularized Estimators,” Neural Computation, vol. 9, pp. 1163-1178, 1997.
[39] K. Tumer and J. Ghosh, “Error Correlation and Error Reduction in Ensemble Classifiers,” Connection Science, vol. 8, no. 3, pp. 385-404, 1996.
[40] N. Ueda, “Optimal Linear Combination of Neural Networks for Improving Classification Performance,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 2, pp. 207-214, Feb. 2000.
[41] L.G. Valiant, “A Theory of the Learnable,” Comm. ACM, vol. 27, no. 11, pp. 1134-1142, Nov. 1984.
[42] V. Vapnik, Estimation of Dependences Based on Empirical Data. New York: Springer-Verlag, 1982.
[43] P.K. Varshney, Distributed Detection and Data Fusion. Springer-Verlag, 1997.
[44] D. Wolpert, “Stacked Generalization,” Neural Networks, vol. 5, pp. 241-259, 1992.
[45] D.H. Wolpert, “The Lack of A Priori Distinctions between Learning Algorithms,” Neural Computation, vol. 8, pp. 1341-1390, 1996.
[46] K. Woods, W.P. Kegelmeyer, and K.W. Bowyer, “Combination of Multiple Classifiers Using Local Accuracy Estimates,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 4, pp. 405-410, Apr. 1997.

Index Terms:
Sensor fusion, multiple sensor system, information fusion, fusion rule estimation.
Citation:
Nageswara S.V. Rao, "On Fusers that Perform Better than Best Sensor," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 8, pp. 904-909, Aug. 2001, doi:10.1109/34.946993