On Classification with Incomplete Data
March 2007 (vol. 29 no. 3)
pp. 427-436
We address the incomplete-data problem in which feature vectors to be classified are missing data (features). A (supervised) logistic regression algorithm for the classification of incomplete data is developed. Single or multiple imputation for the missing data is avoided by performing analytic integration with an estimated conditional density function (conditioned on the observed data). Conditional density functions are estimated using a Gaussian mixture model (GMM), with parameter estimation performed using both Expectation-Maximization (EM) and Variational Bayesian EM (VB-EM). The proposed supervised algorithm is then extended to the semisupervised case by incorporating graph-based regularization. The semisupervised algorithm utilizes all available data—both incomplete and complete, as well as labeled and unlabeled. Experimental results of the proposed classification algorithms are shown.
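As a rough illustration of the idea of integrating over missing features rather than imputing them, the sketch below conditions a given GMM on the observed features and averages a logistic classifier over the resulting conditional density. Everything here is an assumption made for illustration, not the authors' implementation: the function names (`conditional_gmm`, `expected_sigmoid`), the NumPy setting, and in particular the probit-style approximation used to evaluate the Gaussian integral of the sigmoid, since the abstract does not say how that integral is computed. The GMM parameters and classifier weights are taken as already fitted (for example by EM), and at least one feature is assumed to be observed.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian at x."""
    diff = x - mean
    quad = diff @ np.linalg.solve(cov, diff)
    logdet = np.linalg.slogdet(cov)[1]
    return np.exp(-0.5 * (quad + logdet + x.shape[0] * np.log(2.0 * np.pi)))

def conditional_gmm(x, observed, weights, means, covs):
    """Parameters of p(x_missing | x_observed) under a GMM (itself a GMM)."""
    o = np.flatnonzero(observed)
    m = np.flatnonzero(~observed)
    cond_w, cond_mu, cond_cov = [], [], []
    for pi_k, mu, S in zip(weights, means, covs):
        S_oo = S[np.ix_(o, o)]
        S_mo = S[np.ix_(m, o)]
        S_mm = S[np.ix_(m, m)]
        gain = S_mo @ np.linalg.inv(S_oo)
        # Component responsibility given only the observed features.
        cond_w.append(pi_k * gaussian_pdf(x[o], mu[o], S_oo))
        # Standard Gaussian conditioning formulas.
        cond_mu.append(mu[m] + gain @ (x[o] - mu[o]))
        cond_cov.append(S_mm - gain @ S_mo.T)
    cond_w = np.array(cond_w)
    return cond_w / cond_w.sum(), cond_mu, cond_cov

def expected_sigmoid(x, observed, w, b, weights, means, covs):
    """Approximate E[sigmoid(w.x + b)] over the missing features of x."""
    o = np.flatnonzero(observed)
    m = np.flatnonzero(~observed)
    if m.size == 0:
        return sigmoid(w @ x + b)  # complete data: ordinary logistic regression
    cond_w, cond_mu, cond_cov = conditional_gmm(x, observed, weights, means, covs)
    p = 0.0
    for r, mu_c, S_c in zip(cond_w, cond_mu, cond_cov):
        # Under one component, w.x + b is Gaussian with this mean and variance.
        a_mean = w[o] @ x[o] + w[m] @ mu_c + b
        a_var = w[m] @ S_c @ w[m]
        # Assumed approximation: sigmoid averaged over a Gaussian is roughly
        # sigmoid(mean / sqrt(1 + pi * var / 8)).
        p += r * sigmoid(a_mean / np.sqrt(1.0 + np.pi * a_var / 8.0))
    return p

# Hypothetical usage: two features, the second one missing (NaN).
gmm_weights = np.array([0.6, 0.4])
gmm_means = [np.array([0.0, 0.0]), np.array([2.0, 1.0])]
gmm_covs = [np.eye(2), np.array([[1.0, 0.5], [0.5, 1.0]])]
x = np.array([1.0, np.nan])
prob = expected_sigmoid(x, ~np.isnan(x), np.array([0.8, -0.3]), 0.1,
                        gmm_weights, gmm_means, gmm_covs)
```

Missing entries are marked with NaN and never read directly; only the observed entries enter the conditioning step, so no imputed value is ever substituted for them.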
Index Terms:
Classification, incomplete data, missing data, supervised learning, semisupervised learning, imperfect labeling.
Citation:
David Williams, Xuejun Liao, Ya Xue, Lawrence Carin, Balaji Krishnapuram, "On Classification with Incomplete Data," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 3, pp. 427-436, March 2007, doi:10.1109/TPAMI.2007.52