Issue No. 01 - January (2010 vol. 32)
Sergio Escalera, Universitat de Barcelona and Universitat Autònoma de Barcelona, Barcelona
Oriol Pujol, Universitat de Barcelona and Universitat Autònoma de Barcelona, Barcelona
Petia Radeva, Universitat de Barcelona and Universitat Autònoma de Barcelona, Barcelona
A common way to model multiclass classification problems is to design a set of binary classifiers and combine them. Error-Correcting Output Codes (ECOC) are a successful framework for dealing with this type of problem. Recent work in the ECOC framework has shown significant performance improvements by means of new problem-dependent designs based on the ternary ECOC framework. The ternary framework contains a larger set of binary problems because it uses a “do not care” symbol (zero) that allows a given classifier to ignore some classes. However, no study has properly analyzed the effect of this new symbol at the decoding step. In this paper, we present a taxonomy that embeds all binary and ternary ECOC decoding strategies into four groups. We show that the zero symbol introduces two kinds of bias that require a redefinition of the decoding design. A new type of decoding measure is proposed, and two novel decoding strategies are defined. We evaluate state-of-the-art coding and decoding strategies on a set of UCI Machine Learning Repository data sets and on a real traffic sign categorization problem. The experimental results show that the new decoding strategies significantly improve the performance of the ECOC design.
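To make the role of the zero symbol at decoding time concrete, the following is a minimal sketch and not the measure or the strategies proposed in the paper. It assumes a small hypothetical ternary coding matrix `M` (rows are class codewords, columns are binary classifiers, entries in {-1, 0, +1}) and a hypothetical prediction vector `x`. It compares a commonly used form of Hamming decoding, in which each zero position adds a constant 1/2 to the distance, with a simple illustrative alternative that counts disagreements only over the nonzero positions of each codeword. The function names `hamming_decode` and `masked_decode` are chosen here for illustration.

```python
import numpy as np

def hamming_decode(x, M):
    """Classical Hamming-style distance between predictions x and each codeword in M.

    Each position contributes (1 - sign(x_j * y_j)) / 2, so a zero in the
    codeword contributes 1/2 regardless of the classifier output.
    """
    return ((1 - np.sign(x * M)) / 2).sum(axis=1)

def masked_decode(x, M):
    """Illustrative alternative (an assumption, not the paper's measure):
    fraction of disagreements over the nonzero positions of each codeword.
    Assumes every codeword has at least one nonzero entry.
    """
    mismatches = ((x * M) < 0).sum(axis=1)   # zeros never count as mismatches
    active = (M != 0).sum(axis=1)            # classifiers that use this class
    return mismatches / active

# Hypothetical ternary coding matrix: 3 classes, 3 binary classifiers.
M = np.array([[ 1,  1,  1],
              [-1,  0,  0],
              [ 0, -1, -1]])

# Hypothetical outputs of the three binary classifiers for one test sample.
x = np.array([-1, 1, 1])

d_hamming = hamming_decode(x, M)
d_masked = masked_decode(x, M)

print("Hamming distances:", d_hamming)
print("Masked distances: ", d_masked)
print("Masked decoding picks class", int(np.argmin(d_masked)))
```

In this toy case the Hamming-style distance ties classes 0 and 1, even though class 1 agrees with the only classifier that considers it and class 0 disagrees with one of its three. The tie comes entirely from the constant contribution of the two zero positions in the codeword of class 1, which is one simple way the zero symbol can distort decoding.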
Error-correcting output codes, decoding, multiclass classification, embedding of dichotomizers.
O. Pujol, P. Radeva and S. Escalera, "On the Decoding Process in Ternary Error-Correcting Output Codes," in IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 32, no. 1, pp. 120-134, 2010.