Issue No.05 - May (2012 vol.24)
pp: 854-867
Jose Antonio Iglesias , Carlos III University of Madrid, Madrid
Plamen Angelov , Lancaster University, Lancaster
Agapito Ledezma , Carlos III University of Madrid, Madrid
Araceli Sanchis , Carlos III University of Madrid, Madrid
ABSTRACT
Knowledge about computer users is valuable for assisting them, predicting their future actions, or detecting masqueraders. This paper presents a new approach for automatically creating and recognizing the behavior profile of a computer user. Here, a user's behavior is represented as the sequence of commands she/he types during her/his work. This sequence is transformed into a distribution of relevant subsequences of commands in order to obtain a profile that characterizes the user's behavior. Because a user profile is not necessarily fixed but evolves over time, we also propose an evolving method that keeps the created profiles up to date using an Evolving Systems approach. We combine the evolving classifier with trie-based user profiling to obtain a powerful self-learning online scheme. We also further develop the recursive formula for the potential of a data point to become a cluster center using cosine distance; the derivation is provided in the Appendix. The proposed approach is applicable to any problem of dynamic/evolving user behavior modeling in which the behavior can be represented as a sequence of actions or events. It has been evaluated on several real data streams.
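The profiling idea in the abstract (a command sequence turned into a distribution of subsequences stored in a trie, with profiles compared by cosine distance) can be illustrated with a minimal sketch. The abstract does not specify the details, so everything below is an assumption made for illustration: the sliding window, the window length, the insertion of suffixes, the per-length relative-frequency normalization, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


class TrieNode:
    """One node per command; count is how often the path to this node was seen."""

    def __init__(self):
        self.children = {}
        self.count = 0


def insert_with_suffixes(root, subsequence):
    """Insert a subsequence and each of its suffixes (an assumed design choice)."""
    for start in range(len(subsequence)):
        node = root
        for command in subsequence[start:]:
            node = node.children.setdefault(command, TrieNode())
            node.count += 1


def build_trie(commands, length=3):
    """Slide a window of an assumed fixed length over the command stream."""
    root = TrieNode()
    for i in range(len(commands) - length + 1):
        insert_with_suffixes(root, commands[i:i + length])
    return root


def profile(root):
    """Map each stored subsequence to its relative frequency among
    stored subsequences of the same length."""
    counts = {}
    stack = [((), root)]
    while stack:
        path, node = stack.pop()
        for command, child in node.children.items():
            child_path = path + (command,)
            counts[child_path] = child.count
            stack.append((child_path, child))
    totals = defaultdict(int)
    for path, count in counts.items():
        totals[len(path)] += count
    return {path: count / totals[len(path)] for path, count in counts.items()}


def cosine_distance(p, q):
    """1 - cosine similarity between two sparse profiles (dicts)."""
    dot = sum(value * q.get(key, 0.0) for key, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 1.0  # treat an empty profile as maximally distant
    return 1.0 - dot / (norm_p * norm_q)


if __name__ == "__main__":
    user_a = profile(build_trie(["ls", "cd", "ls", "cd"], length=2))
    user_b = profile(build_trie(["vi", "make", "vi", "make"], length=2))
    print(cosine_distance(user_a, user_a))
    print(cosine_distance(user_a, user_b))
```

In this sketch a profile is a sparse vector indexed by subsequence, so two users who share no subsequences are at the maximum distance of 1. The online (recursive) update of cluster potentials described in the abstract is not shown here.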
INDEX TERMS
Evolving fuzzy systems, fuzzy-rule-based (FRB) classifiers, user modeling.
CITATION
Jose Antonio Iglesias, Plamen Angelov, Agapito Ledezma, Araceli Sanchis, "Creating Evolving User Behavior Profiles Automatically", IEEE Transactions on Knowledge & Data Engineering, vol.24, no. 5, pp. 854-867, May 2012, doi:10.1109/TKDE.2011.17