Issue No. 03 - March (2014 vol. 26)
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2013.20
Dominik Fisch, University of Kassel, Kassel
Edgar Kalkowski, University of Kassel, Kassel
Bernhard Sick, University of Kassel, Kassel
If knowledge such as classification rules is extracted from sample data in a distributed way, it may be necessary to combine or fuse these rules. In a conventional approach, this would typically be done either by combining the classifiers' outputs (e.g., in the form of a classifier ensemble) or by combining the sets of classification rules (e.g., by weighting them individually). In this paper, we introduce a new way of fusing classifiers at the level of the parameters of classification rules. This technique is based on probabilistic generative classifiers that use multinomial distributions for categorical input dimensions and multivariate normal distributions for continuous ones. This means that we have distributions, such as Dirichlet or normal-Wishart distributions, over the parameters of the classifier. We refer to these distributions as hyperdistributions or second-order distributions. We show that fusing two (or more) classifiers can be done by multiplying the hyperdistributions of the parameters, and we derive simple formulas for that task. Properties of this new approach are demonstrated with a few experiments. The main advantage of this fusion approach is that the hyperdistributions are retained throughout the fusion process. Thus, the fused components may, for example, be used in subsequent training steps (online training).
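As a rough illustration of what multiplying hyperdistributions can look like for a categorical input dimension, the sketch below fuses two Dirichlet distributions over the same multinomial parameters. It relies only on the standard fact that the normalized product of two Dirichlet densities with parameters alpha_a and alpha_b is again a Dirichlet density with parameters alpha_a + alpha_b - 1. The function names `fuse_dirichlet` and `dirichlet_mean` and the example values are hypothetical, and the formulas derived in the paper may differ (for instance in how a shared prior is handled).

```python
def fuse_dirichlet(alpha_a, alpha_b):
    """Return the parameters of the normalized product of two Dirichlet densities.

    Assumption (not taken from the paper): fusion is the plain product
    Dir(alpha_a) * Dir(alpha_b), which is proportional to
    Dir(alpha_a + alpha_b - 1).
    """
    if len(alpha_a) != len(alpha_b):
        raise ValueError("both classifiers must use the same categories")
    fused = [a + b - 1.0 for a, b in zip(alpha_a, alpha_b)]
    if any(value <= 0.0 for value in fused):
        raise ValueError("the product is not a proper Dirichlet density")
    return fused


def dirichlet_mean(alpha):
    """Expected multinomial parameters under a Dirichlet distribution."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Hypothetical hyperparameters of two classifiers trained on different data.
alpha_a = [3.0, 2.0, 1.0]
alpha_b = [2.0, 2.0, 4.0]

fused = fuse_dirichlet(alpha_a, alpha_b)
print(fused)                  # fused Dirichlet hyperparameters
print(dirichlet_mean(fused))  # point estimate of the fused multinomial
```

Because the result is again a set of Dirichlet hyperparameters rather than a single point estimate, it can serve as the starting point for further Bayesian updates, which matches the abstract's remark that the hyperdistributions are retained for subsequent training steps.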
Probabilistic logic, Bayesian methods, Covariance matrix, Knowledge engineering, Training, Data mining, Coordinate measuring machines
Dominik Fisch, Edgar Kalkowski, Bernhard Sick, "Knowledge Fusion for Probabilistic Generative Classifiers with Data Mining Applications", IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 3, pp. 652-666, March 2014, doi:10.1109/TKDE.2013.20