2017 IEEE International Conference on Computer Vision Workshop (ICCVW) (2017)
Oct. 22, 2017 to Oct. 29, 2017
In this work, we propose a novel method for shared representation learning that learns a mapping into a common space in which different modalities carry the same information. Our goal is to correctly classify target-modality samples with a classifier trained, in this common representation, on source-modality samples and their labels. We call these representations modality-invariant representations. A major advantage of the proposed method is that it needs no labels for the target samples in order to learn the representations. For example, we obtain modality-invariant representations from pairs of images and texts, and then train a text classifier in the modality-invariant space. Although no explicit relationship between images and labels is given, we can expect images to be classified correctly in that space. Our method draws upon the theory of domain adaptation, and we propose to learn modality-invariant representations through adversarial training. We call our method the Deep Modality Invariant Adversarial Network (DeMIAN). We demonstrate the effectiveness of our method in experiments.
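The idea described in the abstract can be illustrated with a minimal sketch. It is not the DeMIAN implementation: the abstract does not give the network architecture or the exact objective, so everything below is an assumption made for illustration. The embeddings are hand-written stand-ins for encoder outputs, the modality discriminator is a single logistic unit, and a nearest-centroid rule is swapped in for the classifier. All function and variable names are hypothetical.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def modality_prob(h, d_weight, d_bias):
    """Hypothetical discriminator: probability that embedding h is from the source modality."""
    return sigmoid(sum(w * v for w, v in zip(d_weight, h)) + d_bias)

def discriminator_loss(src_embs, tgt_embs, d_weight, d_bias, eps=1e-12):
    """Binary cross-entropy with source embeddings labeled 1 and target embeddings labeled 0."""
    loss = 0.0
    for h in src_embs:
        loss -= math.log(modality_prob(h, d_weight, d_bias) + eps)
    for h in tgt_embs:
        loss -= math.log(1.0 - modality_prob(h, d_weight, d_bias) + eps)
    return loss / (len(src_embs) + len(tgt_embs))

def encoder_adversarial_loss(src_embs, tgt_embs, d_weight, d_bias):
    """Assumed adversarial term: the encoders try to maximize the discriminator's loss."""
    return -discriminator_loss(src_embs, tgt_embs, d_weight, d_bias)

def fit_centroids(embs, labels):
    """Stand-in classifier (assumption): one mean embedding per class."""
    sums, counts = {}, {}
    for h, y in zip(embs, labels):
        acc = sums.setdefault(y, [0.0] * len(h))
        for i, v in enumerate(h):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(h, centroids):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(h, centroids[y])))

# Toy embeddings assumed to be already aligned in the common space.
# Only the source modality (e.g. text) has labels.
src_embs = [[1.0, 0.1], [0.9, -0.1], [-1.0, 0.0], [-1.1, 0.2]]
src_labels = ["cat", "cat", "dog", "dog"]
tgt_embs = [[0.95, 0.0], [-0.9, 0.1]]  # target modality (e.g. images), unlabeled

centroids = fit_centroids(src_embs, src_labels)
print([predict(h, centroids) for h in tgt_embs])

# If both modalities produce identical embeddings, no discriminator can do
# better than chance, so its loss cannot fall below log(2).
print(discriminator_loss(src_embs, src_embs, [0.7, -0.3], 0.2) >= math.log(2) - 1e-9)
```

In this sketch the classifier sees labels only for the source embeddings, yet it can label target embeddings because both modalities are assumed to occupy the same region of the common space. The adversarial term is one way to push the encoders toward that state: when the discriminator cannot tell the modalities apart, its loss stays at or above log 2.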
Training, Generators, Feature extraction, Gaussian distribution, Games, Conferences, Videos
T. Harada, K. Saito, Y. Mukuta and Y. Ushiku, "Deep Modality Invariant Adversarial Network for Shared Representation Learning," 2017 IEEE International Conference on Computer Vision Workshop (ICCVW), Venice, Italy, 2017, pp. 2623-2629.