Multimodal Interfaces, IEEE International Conference on (2002)
Oct. 14, 2002 to Oct. 16, 2002
Ali Zandifar , University of Maryland at College Park
Ramani Duraiswami , University of Maryland at College Park
Antoine Chahine , University of Maryland at College Park
Larry S. Davis , University of Maryland at College Park
We describe the development of an interface to textual information for the visually impaired that uses video, image processing, optical character recognition (OCR), and text-to-speech (TTS). The video provides a sequence of low-resolution images in which text must be detected, rectified, and converted into high-resolution rectangular blocks that can be analyzed by off-the-shelf OCR. To achieve this, we solved various problems related to feature detection, mosaicing, auto-focus, zoom, and systems integration during development of the system, and we describe them here.
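As an illustration of the rectification step mentioned above, the following is a minimal sketch that assumes the text lies on a plane and that its four corners have already been detected. The abstract does not state which rectification method the system uses; a four-point planar homography is a common choice and is assumed here. The function names (`solve_linear`, `homography`, `apply_h`) and the corner coordinates are hypothetical and chosen only for this example.

```python
# Sketch only: planar homography from four point pairs (an assumed
# technique, not confirmed by the abstract as the authors' method).

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography(src, dst):
    """Return the 3x3 matrix H (with h33 = 1) mapping four src points to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h, point):
    """Apply H to a 2D point using homogeneous coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical corners of a skewed text block in a video frame (pixels),
# listed clockwise from top-left, and the upright rectangle they map to.
quad = [(120, 80), (400, 60), (420, 220), (100, 200)]
rect = [(0, 0), (300, 0), (300, 100), (0, 100)]
H = homography(quad, rect)
```

To resample a rectified block for OCR, one would typically compute the homography in the opposite direction (rectangle to quadrilateral) and look up a source pixel for each output pixel, which avoids holes in the output image.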
R. Duraiswami, A. Chahine, L. S. Davis and A. Zandifar, "A Video Based Interface to Textual Information for the Visually Impaired," Multimodal Interfaces, IEEE International Conference on (ICMI), Pittsburgh, Pennsylvania, 2002, pp. 325.