2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) (2017)
Nov. 9, 2017 to Nov. 15, 2017
Hundreds of millions of figures are available in the biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information and understanding biomedical documents. Unlike images in the open domain, biomedical figures present a variety of unique challenges. For example, biomedical figures typically have complex layouts, small font sizes, short text, specific text, complex symbols and irregular text arrangements. This paper presents the final results of the ICDAR 2017 Competition on Text Extraction from Biomedical Literature Figures (ICDAR2017 DeTEXT Competition), which aims at extracting (detecting and recognizing) text from biomedical literature figures. Similar to text extraction from scene images and web pictures, ICDAR2017 DeTEXT Competition includes three major tasks, i.e., text detection, cropped word recognition and end-to-end text recognition. Here, we describe in detail the data set, tasks, evaluation protocols and participants of this competition, and report the performance of the participating methods.
feature extraction, medical computing, optical character recognition, text detection
C. Yang, X. Yin, H. Yu, D. Karatzas and Y. Cao, "ICDAR2017 Robust Reading Challenge on Text Extraction from Biomedical Literature Figures (DeTEXT)," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 2018, pp. 1444-1447.