Second International Conference on Document Image Analysis for Libraries (DIAL'06)
What can we learn from the processing of 165,000 forms from the 19th century?
Lyon, France
April 27-April 28
ISBN: 0-7695-2531-8
Bertrand Couasnon, IRISA / INSA, Campus universitaire de Beaulieu, Rennes Cedex, France

This paper presents an assessment of the structure recognition of 165,000 pages of military forms from the 19th century. The recognition was performed with the DMOS method, a generic structure recognition method already applied to various kinds of documents: musical scores, mathematical formulae, recursive table structures and archival documents.

With so many documents, we were confronted with the real difficulties found in old documents. In this paper we present what we learned from processing at this very large scale: in archival documents it is nearly impossible to foresee the difficulties that will have to be dealt with. Even with a large sample that archivists considered representative, the documents we had to process were much more damaged than anticipated. And although strong and precise specifications were given for how the documents should be digitized, these specifications were not followed at all, which introduced new difficulties for the recognition phase. To overcome these unexpected difficulties, the genericity of the DMOS method was particularly important.

Citation:
Bertrand Couasnon, "What can we learn from the processing of 165,000 forms from the 19th century?," Second International Conference on Document Image Analysis for Libraries (DIAL'06), pp. 172-179, 2006