Issue No.09 - Sept. (2012 vol.24)
pp: 1686-1698
Fabricio Breve, University of São Paulo, São Carlos
Liang Zhao, University of São Paulo, São Carlos
Marcos Quiles, Federal University of São Paulo (Unifesp), São José dos Campos
Witold Pedrycz, University of Alberta, Edmonton
Jiming Liu, Hong Kong Baptist University, Hong Kong
ABSTRACT
Semi-supervised learning is an important topic in machine learning, concerned with pattern classification where only a small subset of the data is labeled. In this paper, a new network-based (or graph-based) semi-supervised classification model is proposed. It employs a combined random-greedy walk of particles, with competition and cooperation mechanisms, to propagate class labels to the whole network. Due to the competition mechanism, the proposed model spreads labels locally, i.e., each particle visits only a portion of the nodes potentially belonging to it and is not allowed to visit nodes definitely occupied by particles of other classes. In this way, a “divide-and-conquer” effect is naturally embedded in the model. As a result, the proposed model can achieve a good classification rate while exhibiting a low order of computational complexity in comparison to other network-based semi-supervised algorithms. Computer simulations carried out on synthetic and real-world data sets provide a numeric quantification of the performance of the method.
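The abstract does not give the update rules, so the following Python sketch is only a simplified illustration of the general idea, not the authors' algorithm. The function name, the parameters (`p_greedy`, `delta`, `steps`), the per-node domination vectors, and the exact update and repulsion rules are all assumptions made for this example. Each labeled node spawns one particle; a particle moves either at random or greedily toward neighbors its class already dominates, raises its class's domination level at the node it targets, and is turned back from nodes led by another class. Particles of the same class cooperate by sharing the same domination levels.

```python
import random

def particle_label_propagation(adj, labels, n_classes, steps=2000,
                               p_greedy=0.6, delta=0.1, seed=0):
    """Illustrative sketch only; names and update rules are assumptions.

    adj: list of neighbor lists, node ids 0..n-1.
    labels: dict mapping the few labeled nodes to their class.
    Assumes n_classes >= 2 and that every labeled node has a neighbor.
    """
    rng = random.Random(seed)
    n = len(adj)
    # Domination level of every class at every node (uniform if unlabeled).
    dom = [[1.0 / n_classes] * n_classes for _ in range(n)]
    for node, c in labels.items():
        dom[node] = [1.0 if k == c else 0.0 for k in range(n_classes)]
    # One particle per labeled node, stored as [position, class].
    particles = [[node, c] for node, c in labels.items()]
    for _ in range(steps):
        for p in particles:
            pos, c = p
            nbrs = adj[pos]
            if rng.random() < p_greedy:
                # Greedy move: favor neighbors already dominated by own class.
                weights = [dom[v][c] + 1e-9 for v in nbrs]
                target = rng.choices(nbrs, weights=weights)[0]
            else:
                # Random move: any neighbor with equal probability.
                target = rng.choice(nbrs)
            if target not in labels:
                # Competition: take domination away from rival classes.
                for k in range(n_classes):
                    if k != c:
                        loss = min(dom[target][k], delta / (n_classes - 1))
                        dom[target][k] -= loss
                        dom[target][c] += loss
            # Enter the node only if own class leads there; otherwise stay put.
            if dom[target][c] >= max(dom[target]):
                p[0] = target
    # Each node takes the class with the highest domination level.
    return [max(range(n_classes), key=lambda k: dom[v][k]) for v in range(n)]

# Usage: two 4-node cliques joined by one edge (3-4), one labeled node each.
bridged = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2, 4],
           [3, 5, 6, 7], [4, 6, 7], [4, 5, 7], [4, 5, 6]]
print(particle_label_propagation(bridged, {0: 0, 7: 1}, n_classes=2))
```

Because a particle is turned back from nodes led by another class, each particle tends to stay in its own region of the graph, which is the local, "divide-and-conquer" behavior the abstract describes. The published model may differ in every detail from this sketch.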
INDEX TERMS
Supervised learning, Electronic mail, Computational modeling, Unsupervised learning, Machine learning, Labeling, Computational complexity, label propagation, Semi-supervised learning, particles competition and cooperation, network-based methods
CITATION
Fabricio Breve, Liang Zhao, Marcos Quiles, Witold Pedrycz, Jiming Liu, "Particle Competition and Cooperation in Networks for Semi-Supervised Learning", IEEE Transactions on Knowledge & Data Engineering, vol. 24, no. 9, pp. 1686-1698, Sept. 2012, doi:10.1109/TKDE.2011.119