2018 International Conference on 3D Vision (3DV) (2018)
Sep 5, 2018 to Sep 8, 2018
User interaction provides useful information for solving challenging computer vision problems in practice. In this paper, we show that a small number of user clicks can greatly improve monocular depth estimation performance and overcome monocular ambiguities. We formulate this task as a deep structured model in which the structured pixel-wise depth estimation is subject to ordinal constraints introduced by user clicks. We show that inference in the proposed model can be solved efficiently through a feed-forward network. We demonstrate the effectiveness of the proposed model on the NYU Depth V2 and Stanford 2D-3D datasets. On both datasets, we achieve state-of-the-art performance when encoding user interaction into our deep models.
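To make the notion of an ordinal constraint concrete, the sketch below scores how well a predicted depth map agrees with a set of user-supplied orderings. It is an illustration only, not the paper's model or inference procedure. It assumes that each interaction is a pair of clicked pixels labeled "closer" and "farther", and that disagreement is measured with a hinge penalty. The names `ordinal_violation`, `total_violation`, and `margin` are hypothetical.

```python
# Illustrative sketch only: the click format (a closer/farther pixel pair)
# and the hinge penalty are assumptions, not the paper's formulation.


def ordinal_violation(depth, closer, farther, margin=0.0):
    """Hinge penalty for one constraint: depth[closer] + margin <= depth[farther].

    Returns 0.0 when the predicted depths respect the ordering, and the
    size of the violation otherwise.
    """
    (r1, c1), (r2, c2) = closer, farther
    return max(0.0, depth[r1][c1] - depth[r2][c2] + margin)


def total_violation(depth, constraints, margin=0.0):
    """Sum of hinge penalties over all click pairs."""
    return sum(ordinal_violation(depth, a, b, margin) for a, b in constraints)


# Toy 2x2 predicted depth map (arbitrary units).
depth = [
    [1.0, 2.0],
    [3.0, 2.5],
]

# Each entry is (closer_pixel, farther_pixel) as (row, col).
constraints = [
    ((0, 0), (0, 1)),  # respected by the prediction: 1.0 < 2.0
    ((1, 0), (1, 1)),  # violated by the prediction: 3.0 > 2.5
]

penalty = total_violation(depth, constraints)
```

A penalty of zero means the prediction is consistent with every click pair. A positive value indicates how far the prediction would have to move to satisfy the user's orderings.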
computer vision, feedforward neural nets, image resolution, inference mechanisms, user interfaces
D. Ron et al., "Monocular Depth Estimation via Deep Structured Models with Ordinal Constraints," 2018 International Conference on 3D Vision (3DV), Verona, Italy, 2018, pp. 570-577.