Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 2
A Low-Cost Parallel K-Means VQ Algorithm Using Cluster Computing
Edinburgh, Scotland
August 03-August 06
ISBN: 0-7695-1960-1
Robert Sabourin, École de Technologie Supérieure; Centre for Pattern Recognition and Machine Intelligence
In this paper we propose a parallel approach for the K-means Vector Quantization (VQ) algorithm used in a two-stage Hidden Markov Model (HMM)-based system for recognizing handwritten numeral strings. With this parallel algorithm, based on the master/slave paradigm, we overcome two drawbacks of the sequential version: a) the time taken to create the codebook; and b) the amount of memory necessary to work with large training databases. Distributing the training samples over the slaves' local disks reduces the overhead associated with the communication process. In addition, models predicting computation and communication time have been developed. These models are useful to predict the optimal number of slaves taking into account the number of training samples and codebook size.
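The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a master/slave K-means iteration is commonly organized, not the authors' implementation. It assumes each slave keeps its own partition of the training samples and returns only per-codeword partial sums and counts, so the messages scale with the codebook size rather than with the number of samples. The slaves are simulated sequentially in one process, and all function names (`slave_step`, `master_update`, `parallel_kmeans`) are hypothetical.

```python
def nearest(codebook, x):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], x)),
    )


def slave_step(local_samples, codebook):
    """Work of one slave: partial sums and counts over its local samples."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for x in local_samples:
        k = nearest(codebook, x)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += x[d]
    return sums, counts


def master_update(codebook, partials):
    """Work of the master: merge the slaves' partial results into a new codebook."""
    dim = len(codebook[0])
    new_codebook = []
    for k, old in enumerate(codebook):
        total = sum(counts[k] for _, counts in partials)
        if total == 0:
            # Assumption: a codeword with no assigned samples is left unchanged.
            new_codebook.append(list(old))
        else:
            new_codebook.append(
                [sum(sums[k][d] for sums, _ in partials) / total for d in range(dim)]
            )
    return new_codebook


def parallel_kmeans(partitions, codebook, iterations=10):
    """Run a fixed number of iterations; one partition per (simulated) slave."""
    for _ in range(iterations):
        # On a real cluster these calls would run concurrently, one per slave.
        partials = [slave_step(p, codebook) for p in partitions]
        codebook = master_update(codebook, partials)
    return codebook


if __name__ == "__main__":
    # Toy data split over two simulated slaves.
    partitions = [
        [[0.0, 0.0], [10.0, 10.0]],
        [[0.0, 2.0], [10.0, 12.0]],
    ]
    initial = [[0.0, 0.0], [10.0, 10.0]]
    print(parallel_kmeans(partitions, initial))  # [[0.0, 1.0], [10.0, 11.0]]
```

In this arrangement the training samples never leave the slaves; each iteration only broadcasts the codebook and collects the partial sums and counts, which is one way to obtain the reduced communication overhead the abstract mentions.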
Citation:
Alceu de S. Britto Jr, Paulo S. L. de Souza, Robert Sabourin, Simone R. S. de Souza, Díbio L. Borges, "A Low-Cost Parallel K-Means VQ Algorithm Using Cluster Computing," icdar, vol. 2, pp. 839, Seventh International Conference on Document Analysis and Recognition (ICDAR'03) - Volume 2, 2003