On Fusers that Perform Better than Best Sensor
August 2001 (vol. 23 no. 8)
pp. 904-909

Abstract—In a multiple sensor system, sensor $S_i$, $i=1, 2, \ldots, N$, outputs $Y^{(i)}\in [0,1]$, according to an unknown probability distribution $P_{Y^{(i)} | X }$, in response to input $X \in [0,1]$. We choose a fuser—that combines the outputs of sensors—from a function class ${\cal{F}} = \{ f : [0,1]^N \mapsto [0,1] \}$ by minimizing empirical error based on an i.i.d. sample. If $\cal{F}$ satisfies the isolation property, we show that the fuser performs at least as well as the best sensor in a probably approximately correct sense. Several well-known fusers, such as linear combinations, special potential functions, and certain feedforward networks, satisfy the isolation property.
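The isolation property can be made concrete with the class of linear-combination fusers: since each projection $f(y) = y_i$ (i.e., "use sensor $i$ alone") lies in the class, the empirical-error minimizer over the class can do no worse on the sample than the best single sensor. The following sketch (not from the paper; the noisy-sensor data model and all variable names are illustrative assumptions) demonstrates this with a least-squares fit:

```python
import numpy as np

# Illustrative sketch (not the paper's construction): minimize empirical
# squared error over the class of linear-combination fusers f(y) = w . y,
# then compare against the best individual sensor. The data model below
# (sensors observing x with different noise levels) is a made-up example.
rng = np.random.default_rng(0)
n, N = 500, 3                              # sample size, number of sensors
x = rng.uniform(0, 1, n)                   # input X in [0, 1]
noise = rng.normal(0.0, [0.05, 0.1, 0.2], (n, N))
y = np.clip(x[:, None] + noise, 0, 1)      # sensor outputs Y^(i) in [0, 1]

# Empirical risk minimization over linear fusers via least squares.
w, *_ = np.linalg.lstsq(y, x, rcond=None)

fuser_err = np.mean((y @ w - x) ** 2)
best_sensor_err = min(np.mean((y[:, i] - x) ** 2) for i in range(N))

# Each single sensor corresponds to a weight vector e_i inside the class,
# so the empirical minimizer's error cannot exceed the best sensor's.
print(fuser_err <= best_sensor_err)
```

Because the class contains every coordinate projection, the printed comparison holds on the training sample by construction; the paper's contribution is showing that, under the isolation property, this advantage also holds on new data in a probably approximately correct sense.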

[1] Data Fusion in Robotics and Machine Intelligence, M.A. Abidi and R.C. Gonzalez, eds. New York: Academic Press, 1992.
[2] M.A. Aizerman, E.M. Braverman, and L.I. Rozonoer, “Extrapolative Problems in Automatic Control and Method of Potential Functions,” Am. Math. Soc. Translations, vol. 87, pp. 281-303, 1970.
[3] K.M. Ali and M.J. Pazzani, “Error Reduction through Learning Multiple Descriptions,” Machine Learning, vol. 24, no. 3, pp. 173-202, 1996.
[4] P. Billingsley, Probability and Measure, second ed. New York: John Wiley & Sons, 1986.
[5] L. Breiman, “Bagging Predictors,” Machine Learning, vol. 24, pp. 123-140, 1996.
[6] L. Breiman, “Stacked Regressions,” Machine Learning, vol. 24, pp. 49-64, 1996.
[7] L. Breiman, “Arcing Classifiers,” Annals of Statistics, vol. 26, no. 3, pp. 801-849, 1998.
[8] N. Cesa-Bianchi, Y. Freund, D.P. Helmbold, D. Haussler, R.E. Schapire, and M.K. Warmuth, “How to Use Expert Advice,” J. ACM, pp. 382-391, 1995.
[9] L. Devroye, L. Gyorfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition. New York: Springer-Verlag, 1996.
[10] E.F. Gad, A.F. Atiya, S. Shaheen, and A. El-Dessouki, “A New Algorithm for Learning in Piecewise-Linear Neural Networks,” Neural Networks, vol. 13, pp. 485-505, 2000.
[11] S. Hashem, “Optimal Linear Combinations of Neural Networks,” Neural Networks, vol. 10, pp. 599-614, 1997.
[12] S. Hashem, “Optimal Linear Combinations of Neural Networks,” Neural Networks, vol. 10, pp. 599-614, 1997.
[13] T.K. Ho, J.J. Hull, and S.N. Srihari, “Decision Combination in Multiple Classifiers Systems,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 1, pp. 66-75, Jan. 1994.
[14] M.I. Jordan and R.A. Jacobs, “Hierarchical Mixtures of Experts and the EM Algorithm,” Neural Computation, vol. 6, pp. 181-214, 1994.
[15] J. Kittler, M. Hatef, R. Duin, and J. Matas, “On Combining Classifiers,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 20, no. 3, pp. 226-239, Mar. 1998.
[16] A. Krzyzak, T. Linder, and G. Lugosi, “Nonparametric Estimation and Classification Using Radial Basis Function Nets and Empirical Risk Minimization,” IEEE Trans. Neural Networks, vol. 7, no. 2, pp. 475-487, 1996.
[17] M. LeBlanc and R. Tibshirani, “Combining Estimates in Regression and Classification,” J. Am. Statistical Assoc., vol. 91, no. 436, pp. 1641-1650, 1996.
[18] G. Lugosi and K. Zeger, “Nonparametric Estimation via Empirical Risk Minimization,” IEEE Trans. Information Theory, vol. 41, no. 3, pp. 677-687, 1995.
[19] R.N. Madan and N.S.V. Rao, “Guest Editorial on Information/Decision Fusion with Engineering Applications,” J. Franklin Inst., vol. 336B, no. 2, pp. 199-204, 1999.
[20] W. Maass, “Agnostic PAC Learning of Functions on Analog Neural Nets,” Neural Computation, vol. 7, pp. 1054-1078, 1995.
[21] C.J. Merz and M.J. Pazzani, “A Principal Component Approach to Combining Regression Estimators,” Machine Learning, vol. 36, pp. 9-32, 1999.
[22] M. Mojirsheibani, “A Consistent Combined Classification Rule,” Statistics and Probability Letters, vol. 36, pp. 43-47, 1997.
[23] M. Perrone and L.N. Cooper, “When Networks Disagree: Ensemble Methods for Hybrid Neural Networks,” Neural Networks for Speech and Image Processing, R.J. Mammone, ed., Chapman Hall, 1993.
[24] D. Pollard, Convergence of Stochastic Processes. New York: Springer-Verlag, 1984.
[25] N.S.V. Rao, “To Fuse or Not to Fuse: Fuser versus Best Classifier,” Proc. SPIE Conf. Sensor Fusion: Architectures, Algorithms, and Applications II, pp. 25-34, 1998.
[26] N.S.V. Rao, “Multiple Sensor Fusion under Unknown Distributions,” J. Franklin Inst., vol. 336, no. 2, pp. 285-299, 1999.
[27] N.S.V. Rao, “Finite Sample Performance Guarantees of Fusers for Function Estimators,” Information Fusion, vol. 1, no. 1, pp. 35-44, 2000.
[28] N.S.V. Rao, “Multisensor Fusion under Unknown Distributions: Finite Sample Performance Guarantees,” Multisensor Fusion, A.K. Hyder, ed., Kluwer Academic, 2000.
[29] N.S.V. Rao, E.M. Oblow, C.W. Glover, and G.E. Liepins, “N-Learners Problem: Fusion of Concepts,” IEEE Trans. Systems, Man, and Cybernetics, vol. 24, no. 2, pp. 319-327, 1994.
[30] N.S.V. Rao and V. Protopopescu, “Function Estimation by Feedforward Sigmoidal Networks with Bounded Weights,” Neural Processing Letters, vol. 7, pp. 125-131, 1998.
[31] N.S.V. Rao, V. Protopopescu, R.C. Mann, E.M. Oblow, and S.S. Iyengar, “Learning Algorithms for Feedforward Networks Based on Finite Samples,” IEEE Trans. Neural Networks, vol. 7, no. 4, pp. 926-940, 1996.
[32] Theoretical Advances in Neural Computation and Learning, V. Roychowdhury, K.-Y. Siu, and A. Orlitsky, eds. Boston: Kluwer Academic, 1994.
[33] C. Schaffer, “A Conservation Law for Generalization Performance,” Proc. 11th Int'l Conf. Machine Learning, pp. 259-265, 1994.
[34] R.E. Schapire, “The Strength of Weak Learnability,” Machine Learning, vol. 5, no. 2, pp. 197-227, 1990.
[35] R.E. Schapire, Y. Freund, P. Bartlett, and W.S. Lee, “Boosting the Margin: A New Explanation for the Effectiveness of Voting Methods,” Proc. 14th Int'l Conf. Machine Learning, 1997.
[36] A.J.C. Sharkey, “On Combining Artificial Neural Nets,” Connection Science, vol. 8, no. 3, pp. 299-314, 1996.
[37] A.J.C. Sharkey, “Modularity, Combining and Artificial Neural Nets,” Connection Science, vol. 9, no. 1, pp. 3-10, 1997.
[38] M. Taniguchi and V. Tresp, “Averaging Regularized Estimators,” Neural Computation, vol. 9, pp. 1163-1178, 1997.
[39] K. Tumer and J. Ghosh, “Error Correlation and Error Reduction in Ensemble Classifiers,” Connection Science, vol. 8, no. 3, pp. 385-404, 1996.
[40] N. Ueda, “Optimal Linear Combination of Neural Networks for Improving Classification Performance,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 2, pp. 207-214, Feb. 2000.
[41] L.G. Valiant, “A Theory of the Learnable,” Comm. ACM, vol. 27, no. 11, pp. 1134-1142, Nov. 1984.
[42] V. Vapnik, Estimation of Dependences Based on Empirical Data. New York: Springer-Verlag, 1982.
[43] P.K. Varshney, Distributed Detection and Data Fusion. Springer-Verlag, 1997.
[44] D. Wolpert, “Stacked Generalization,” Neural Networks, vol. 5, pp. 241-259, 1992.
[45] D.H. Wolpert, “The Lack of A Priori Distinctions between Learning Algorithms,” Neural Computation, vol. 8, pp. 1341-1390, 1996.
[46] K. Woods, W.P. Kegelmeyer, and K.W. Bowyer, “Combination of Multiple Classifiers Using Local Accuracy Estimates,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 4, pp. 405-410, Apr. 1997.

Index Terms:
Sensor fusion, multiple sensor system, information fusion, fusion rule estimation.
Nageswara S.V. Rao, "On Fusers that Perform Better than Best Sensor," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 8, pp. 904-909, Aug. 2001, doi:10.1109/34.946993