<p><b>Abstract</b>—We describe an automated script identification system for typeset document images. Templates for each script are created by clustering textual symbols from a training set. Symbols from new images are compared to the templates to find the best script. Our current system processes thirteen scripts with minimal preprocessing and high accuracy.</p>
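The two stages summarized in the abstract, building per-script templates by clustering training symbols and then matching symbols from a new image against them, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: symbols are represented as short fixed-length feature vectors, the clustering is a simple greedy threshold scheme, the match measure is Euclidean distance, and the names `build_templates`, `script_score`, `identify_script`, `script_a`, and `script_b` are hypothetical.

```python
import numpy as np


def build_templates(symbols, threshold):
    """Greedy clustering (illustrative assumption, not the paper's method).

    Each symbol joins the nearest existing cluster if its distance to that
    cluster's centroid is at most `threshold`; otherwise it starts a new
    cluster. The returned templates are the cluster centroids.
    """
    sums, counts = [], []
    for s in symbols:
        s = np.asarray(s, dtype=float)
        if sums:
            centroids = np.array(sums) / np.array(counts)[:, None]
            dists = np.linalg.norm(centroids - s, axis=1)
            j = int(np.argmin(dists))
            if dists[j] <= threshold:
                sums[j] = sums[j] + s
                counts[j] += 1
                continue
        sums.append(s.copy())
        counts.append(1)
    return np.array(sums) / np.array(counts)[:, None]


def script_score(symbols, templates):
    """Mean distance from each symbol to its closest template (lower is better)."""
    symbols = np.asarray(symbols, dtype=float)
    # Pairwise distances, shape (n_symbols, n_templates).
    dists = np.linalg.norm(symbols[:, None, :] - templates[None, :, :], axis=2)
    return float(dists.min(axis=1).mean())


def identify_script(symbols, templates_by_script):
    """Return the script whose templates match the symbols best, plus all scores."""
    scores = {name: script_score(symbols, t) for name, t in templates_by_script.items()}
    return min(scores, key=scores.get), scores


# Toy training data for two hypothetical scripts (4-dimensional "symbols").
training = {
    "script_a": [[1, 1, 0, 0], [1, 1, 0, 0.1], [0, 0, 1, 1]],
    "script_b": [[1, 0, 1, 0], [0.9, 0, 1, 0], [0, 1, 0, 1]],
}
templates = {name: build_templates(syms, threshold=0.5) for name, syms in training.items()}

# Symbols extracted from a new, unlabeled image (toy values).
new_symbols = [[1, 1, 0, 0], [0, 0, 1, 0.9]]
best, scores = identify_script(new_symbols, templates)
print(best, scores)
```

In this toy run the near-duplicate training symbols collapse into a single template, so each script ends up with two templates, and the new symbols sit closer to the templates of `script_a`. A real system would work on normalized symbol images rather than four-element vectors, and the threshold and distance measure would need to be chosen for that representation.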
<p><b>Index Terms</b>: Script identification, document analysis, optical character recognition.</p>
Patrick Kelly, Judith Hochberg, Timothy Thomas, Lila Kerns, "Automatic Script Identification From Document Images Using Cluster-Based Templates", IEEE Transactions on Pattern Analysis & Machine Intelligence, vol. 19, pp. 176-181, February 1997, doi:10.1109/34.574802
