Issue No. 11 - Nov. (2017 vol. 29)
ISSN: 1041-4347
pp: 2484-2497
Varun Mithal, University of Minnesota, Minneapolis, MN
Guruprasad Nayak, University of Minnesota, Minneapolis, MN
Ankush Khandelwal, University of Minnesota, Minneapolis, MN
Vipin Kumar, University of Minnesota, Minneapolis, MN
Nikunj C. Oza, NASA Ames Center, Mountain View, CA
Ramakrishna Nemani, NASA Ames Center, Mountain View, CA
ABSTRACT
Many real-world problems involve learning models for rare classes in situations where there are no gold-standard labels for training samples but imperfect labels are available for all instances. In this paper, we present RAPT, a three-step predictive modeling framework for classifying a rare class in such problem settings. The first step of the proposed framework learns a classifier that jointly optimizes precision and recall using only imperfectly labeled training samples. We also show that, under certain assumptions on the imperfect labels, the quality of this classifier is almost as good as that of one constructed using perfect labels. The second and third steps of the framework exploit the fact that imperfect labels are available for all instances to further improve the precision and recall of the rare class. We evaluate the RAPT framework on two real-world applications: mapping forest fires and mapping urban extent from Earth-observing satellite data. The experimental results indicate that RAPT can identify forest fires and urban areas with high precision and recall using imperfect labels, even though obtaining expert-annotated samples on a global scale is infeasible in these applications.
INDEX TERMS
Training, Satellites, Predictive models, Earth, Buildings, Training data, Urban areas
CITATION

V. Mithal, G. Nayak, A. Khandelwal, V. Kumar, N. C. Oza and R. Nemani, "RAPT: Rare Class Prediction in Absence of True Labels," in IEEE Transactions on Knowledge & Data Engineering, vol. 29, no. 11, pp. 2484-2497, 2017.
doi:10.1109/TKDE.2017.2739739