Issue No.01 - Jan. (2013 vol.24)
pp: 104-117
Jun Zhang, Deakin University, Melbourne
Yang Xiang, Deakin University, Melbourne
Yu Wang, Deakin University, Melbourne
Wanlei Zhou, Deakin University, Melbourne
Yong Xiang, Deakin University, Melbourne
Yong Guan, Iowa State University, Ames
Traffic classification has wide applications in network management, from security monitoring to quality of service measurements. Recent research tends to apply machine learning techniques to classification methods based on flow statistical features. The nearest neighbor (NN)-based method has exhibited superior classification performance. It also has several important advantages: it requires no training procedure, carries no risk of overfitting parameters, and naturally handles a huge number of classes. However, the performance of the NN classifier can be severely affected if the training data set is small. In this paper, we propose a novel nonparametric approach for traffic classification, which can improve the classification performance effectively by incorporating correlated information into the classification process. We analyze the new classification approach and its performance benefit from both theoretical and empirical perspectives. A large number of experiments are carried out on two real-world traffic data sets to validate the proposed approach. The results show that traffic classification performance can be improved significantly even under the extremely difficult circumstance of very few training samples.
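The general idea of combining NN classification with correlated information can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say how correlation is defined or how it is combined with the NN decision, so the grouping key (`group_key`), the majority vote, the two-dimensional feature vectors, and the class names below are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict
from math import dist


def nearest_neighbor_label(x, training):
    """Return the label of the training sample closest to x (Euclidean)."""
    _, label = min(training, key=lambda sample: dist(x, sample[0]))
    return label


def classify_with_correlation(flows, training):
    """Classify each flow with 1-NN, then let correlated flows vote together.

    flows: list of (group_key, feature_vector) pairs. group_key is a
    hypothetical marker for flows assumed to come from the same application;
    how such groups are found is outside this sketch.
    """
    votes = defaultdict(Counter)
    for key, x in flows:
        votes[key][nearest_neighbor_label(x, training)] += 1
    # Every flow in a group receives that group's majority label.
    return [votes[key].most_common(1)[0][0] for key, _ in flows]


# Deliberately tiny training set: one labeled sample per class (made-up data).
training = [((1.0, 1.0), "web"), ((9.0, 9.0), "p2p")]
flows = [
    ("server-a", (1.2, 0.8)),
    ("server-a", (2.0, 1.5)),
    ("server-a", (6.0, 6.0)),  # on its own, this flow is nearest to "p2p"
    ("server-b", (8.5, 9.5)),
]

independent = [nearest_neighbor_label(x, training) for _, x in flows]
grouped = classify_with_correlation(flows, training)
print(independent)
print(grouped)
```

In this toy data the third flow is labeled "p2p" when classified alone but "web" once the other flows in its group are taken into account, which is the kind of correction that correlated information can provide when training samples are scarce.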
Correlation, Training data, Artificial neural networks, Training, Support vector machines, Accuracy, Robustness, security, Traffic classification, network operations
Jun Zhang, Yang Xiang, Yu Wang, Wanlei Zhou, Yong Xiang, Yong Guan, "Network Traffic Classification Using Correlation Information," IEEE Transactions on Parallel & Distributed Systems, vol. 24, no. 1, pp. 104-117, Jan. 2013, doi:10.1109/TPDS.2012.98