2008 The Eighth IAPR International Workshop on Document Analysis Systems
Keyword Matching in Historical Machine-Printed Documents Using Synthetic Data, Word Portions and Dynamic Time Warping
September 16-September 19
ISBN: 978-0-7695-3337-7
In this paper we propose a novel and efficient technique for finding keywords typed by the user in digitised machine-printed historical documents using the Dynamic Time Warping (DTW) algorithm. The method uses word portions located at the beginning and end of each segmented word of the processed documents and tries to estimate the position of the first and last characters in order to reduce the list of candidate words. Since DTW can become computationally intensive on large datasets, the proposed method significantly prunes the list of candidate words, thus speeding up the entire process. Word length is also used as a means of further reducing the data to be processed. The results show improvements in time and efficiency over those obtained when the list of candidate words is not pruned.
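The prune-then-match idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 1-D feature sequences (for example, a per-column profile of a word image), the function names `dtw_distance` and `rank_candidates`, and the parameters `max_len_ratio`, `portion` and `portion_threshold` are all assumptions made for illustration. Candidates are first filtered by length, then by a cheap DTW on their leading and trailing portions, and only the survivors receive a full DTW comparison.

```python
from typing import List, Sequence, Tuple


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic-programming DTW between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]


def rank_candidates(
    query: Sequence[float],
    words: List[Sequence[float]],
    max_len_ratio: float = 1.3,      # assumed length tolerance
    portion: int = 2,                # assumed size of start/end portions
    portion_threshold: float = 2.0,  # assumed cut-off for portion DTW
) -> List[Tuple[float, int]]:
    """Return (distance, word index) pairs for surviving candidates,
    sorted by increasing full-word DTW distance."""
    if not query:
        return []
    ranked = []
    for idx, word in enumerate(words):
        if not word:
            continue
        # Pruning step 1: discard words whose length differs too much.
        ratio = max(len(word), len(query)) / min(len(word), len(query))
        if ratio > max_len_ratio:
            continue
        # Pruning step 2: compare only the beginning and end portions.
        p = min(portion, len(word), len(query))
        if dtw_distance(query[:p], word[:p]) > portion_threshold:
            continue
        if dtw_distance(query[-p:], word[-p:]) > portion_threshold:
            continue
        # Full DTW only for candidates that survived both pruning steps.
        ranked.append((dtw_distance(query, word), idx))
    return sorted(ranked)
```

In this sketch the expensive full-length DTW, which is quadratic in the sequence lengths, runs only on words that pass the two cheap checks; the thresholds would need tuning on real data.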
Index Terms:
Historical Documents, Indexing, Dynamic Time Warping
Citation:
Thomas Konidaris, B. Gatos, S.J. Perantonis, A. Kesidis, "Keyword Matching in Historical Machine-Printed Documents Using Synthetic Data, Word Portions and Dynamic Time Warping," 2008 The Eighth IAPR International Workshop on Document Analysis Systems (DAS), pp. 539-545, 2008