Fast Bundle Algorithm for Multiple-Instance Learning
June 2012 (vol. 34 no. 6)
pp. 1068-1079
J. Zaretzki, Dept. of Chem. & Chem. Biol., Rensselaer Polytech. Inst., Troy, NY, USA
G. Moore, Dept. of Math. Sci., Rensselaer Polytech. Inst., Troy, NY, USA
C. Bergeron, Depts. of Math. Sci. & Electr., Syst., & Comput. Eng., Rensselaer Polytech. Inst., Troy, NY, USA
C. M. Breneman, Dept. of Chem. & Chem. Biol., Rensselaer Polytech. Inst., Troy, NY, USA
K. P. Bennett, Depts. of Math. Sci. & Comput. Sci., Rensselaer Polytech. Inst., Troy, NY, USA
We present a bundle algorithm for multiple-instance classification and ranking. These frameworks yield improved models on many problems possessing special structure. Multiple-instance loss functions are typically nonsmooth and nonconvex, and current algorithms convert these to smooth nonconvex optimization problems that are solved iteratively. Inspired by the latest linear-time subgradient-based methods for support vector machines, we optimize the objective directly using a nonconvex bundle method. Computational results show this method is linearly scalable, while not sacrificing generalization accuracy, permitting modeling on new and larger data sets in computational chemistry and other applications. This new implementation facilitates modeling with kernels.
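To make the "nonsmooth and nonconvex" remark concrete, here is a minimal sketch of one common multiple-instance hinge loss for a linear model, in which a bag is scored by its highest-scoring instance. This max-based formulation, along with the names `bag_score` and `mil_hinge`, is an assumption chosen for illustration; the abstract does not state the exact objective used in the paper. The inner max creates kinks (nonsmoothness), and for positive bags the term `1 - max(...)` is not convex in the weights. The function also returns one subgradient-style direction, the kind of first-order information a bundle method collects at each iterate.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bag_score(w: Vector, b: float, bag: Sequence[Vector]) -> Tuple[float, int]:
    """Score a bag by its highest-scoring instance (illustrative assumption).

    Returns the score and the index of the instance that attains it.
    """
    scores = [dot(w, x) + b for x in bag]
    best = max(range(len(bag)), key=scores.__getitem__)
    return scores[best], best


def mil_hinge(
    w: Vector, b: float, bag: Sequence[Vector], label: int
) -> Tuple[float, List[float], float]:
    """Hinge loss max(0, 1 - label * max_i(w.x_i + b)) for one bag.

    label is +1 or -1. Also returns one generalized-gradient direction
    with respect to (w, b), taken at the instance attaining the max.
    """
    score, best = bag_score(w, b, bag)
    margin = 1.0 - label * score
    if margin <= 0.0:
        # Margin satisfied: zero loss and a zero direction.
        return 0.0, [0.0] * len(w), 0.0
    x = bag[best]
    return margin, [-label * xi for xi in x], float(-label)
```

For example, with `bag = [[1.0, 0.0], [0.0, 2.0]]`, `w = [0.5, 0.1]`, and `b = 0.0`, the first instance attains the max score of 0.5, so a positive bag incurs a loss of 0.5 and a negative bag a loss of 1.5. Each evaluation costs one pass over the instances in the bag, which is consistent with the linear scaling the abstract reports, though the paper's actual solver is not reproduced here.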

Index Terms:
pattern classification, convex programming, gradient methods, learning (artificial intelligence), computational chemistry, bundle algorithm, multiple-instance learning, multiple-instance classification, multiple-instance ranking, multiple-instance loss functions, smooth nonconvex optimization problems, linear-time subgradient-based methods, support vector machines, nonconvex bundle method, Kernel, Compounds, Microwave integrated circuits, Drugs, Computational modeling, Support vector machines, Optimization, medicine and science, Artificial intelligence, machine learning, nonsmooth optimization, bundle methods, multiple-instance learning, ranking
J. Zaretzki, G. Moore, C. Bergeron, C. M. Breneman, K. P. Bennett, "Fast Bundle Algorithm for Multiple-Instance Learning," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 6, pp. 1068-1079, June 2012, doi:10.1109/TPAMI.2011.194