The Community for Technology Leaders
2017 IEEE International Conference on Computer Vision (ICCV) (2017)
Venice, Italy
Oct. 22, 2017 to Oct. 29, 2017
ISSN: 2380-7504
ISBN: 978-1-5386-1032-9
pp: 3075-3084
ABSTRACT
We propose a novel deep learning approach to solve simultaneous alignment and recognition problems (referred to as “Sequence-to-sequence” learning). We decompose the problem into a series of specialised expert systems referred to as SubUNets. The spatio-temporal relationships between these SubUNets are then modelled to solve the task, while remaining trainable end-to-end. The approach mimics human learning and educational techniques, and has a number of significant advantages. SubUNets allow us to inject domain-specific expert knowledge into the system regarding suitable intermediate representations. They also allow us to implicitly perform transfer learning between different interrelated tasks, which also allows us to exploit a wider range of more varied data sources. In our experiments we demonstrate that each of these properties serves to significantly improve the performance of the overarching recognition system, by better constraining the learning problem. The proposed techniques are demonstrated in the challenging domain of sign language recognition. We demonstrate state-of-the-art performance on hand-shape recognition (outperforming previous techniques by more than 30%). Furthermore, we are able to obtain comparable sign recognition rates to previous research, without the need for an alignment step to segment out the signs for recognition.
INDEX TERMS
expert systems, image recognition, learning (artificial intelligence), shape recognition, sign language recognition
CITATION

N. C. Camgoz, S. Hadfield, O. Koller and R. Bowden, "SubUNets: End-to-End Hand Shape and Continuous Sign Language Recognition," 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2018, pp. 3075-3084.
doi:10.1109/ICCV.2017.332
167 ms
(Ver 3.3 (11022016))