2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) (2017)
Kyoto, Japan
Nov. 9, 2017 to Nov. 15, 2017
ISSN: 2379-2140
ISBN: 978-1-5386-3586-5
pp: 1346-1351
This paper presents an approach for identifying reader-specific difficult words while a person reads a textual document. The work is motivated by the need to develop human-document interaction systems in general, and to create person-specific online educational content in particular. Eye gaze information captures person-specific behaviour, whereas the textual content is analyzed to obtain the general linguistic aspects of the document. These two sources of information are fused through machine learning algorithms to identify the set of difficult words for a particular reader reading a particular document. An annotated dataset has been created in which each word in a document is marked with its bounding box and each reader identifies the set of words they found difficult while reading. The dataset consists of sixteen documents, each read by five subjects. The method is evaluated through recall-precision analysis. The precision obtained at high recall indicates that building a practical application based on this research is feasible. The experiments also reveal several interesting facts about human reading behaviour.
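The abstract does not say which features or learning algorithms the authors use, so the following is only an illustrative sketch of the general idea: per-word gaze features and text features are concatenated, scored, and evaluated by precision and recall against the reader's own annotations. Every name here (`WordObservation`, `fuse`, `precision_recall`), the chosen features, the weights, and the sample values are assumptions made for illustration, and the linear score merely stands in for whatever learned model the paper applies.

```python
from dataclasses import dataclass


@dataclass
class WordObservation:
    """One word as seen by one reader (all fields are hypothetical features)."""
    word: str
    fixation_ms: float   # assumed gaze feature: total fixation time inside the word's bounding box
    regressions: int     # assumed gaze feature: number of times the reader returned to the word
    corpus_freq: float   # assumed text feature: relative frequency in a reference corpus, in [0, 1]
    is_difficult: bool   # ground truth: the reader marked this word as difficult


def fuse(obs: WordObservation) -> list[float]:
    """Early fusion: concatenate gaze features and text features into one vector."""
    gaze = [obs.fixation_ms / 1000.0, float(obs.regressions)]
    text = [float(len(obs.word)), 1.0 - obs.corpus_freq]
    return gaze + text


def score(features: list[float], weights: list[float]) -> float:
    """Linear score standing in for a trained classifier's output."""
    return sum(f * w for f, w in zip(features, weights))


def precision_recall(observations, weights, threshold):
    """Precision and recall of 'difficult' predictions at a given score threshold."""
    tp = fp = fn = 0
    for obs in observations:
        predicted = score(fuse(obs), weights) >= threshold
        if predicted and obs.is_difficult:
            tp += 1
        elif predicted:
            fp += 1
        elif obs.is_difficult:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Made-up observations for a single reader; not data from the paper.
sample = [
    WordObservation("the", 120, 0, 0.95, False),
    WordObservation("ubiquitous", 820, 2, 0.05, True),
    WordObservation("reading", 260, 0, 0.60, False),
    WordObservation("perspicacious", 1100, 3, 0.01, True),
    WordObservation("document", 640, 1, 0.50, False),
]
weights = [1.0, 0.5, 0.1, 1.0]  # hand-picked for illustration, not learned

# Sweeping the threshold traces out a recall-precision curve.
for threshold in (2.0, 3.0, 4.0):
    p, r = precision_recall(sample, weights, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

In a real system the weights would be learned per reader from annotated data rather than set by hand, and the threshold sweep is what produces the recall-precision trade-off the abstract refers to.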
gaze tracking, human computer interaction, interactive systems, learning (artificial intelligence), linguistics, text analysis

U. Garain, O. Pandit, O. Augereau, A. Okoso and K. Kise, "Identification of Reader Specific Difficult Words by Analyzing Eye Gaze and Document Content," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 2018, pp. 1346-1351.