Issue No.03 - March (2012 vol.24)
pp: 533-546
Clifton Phua, Institute of Infocomm Research, Singapore
Kate Smith-Miles, Monash University, Melbourne
Vincent Cheng-Siong Lee, Monash University, Melbourne
Ross Gayler, Veda Advantage, Melbourne
Identity crime is well known, prevalent, and costly, and credit application fraud is a specific case of identity crime. The existing non-data-mining detection systems, which rely on business rules, scorecards, and known-fraud matching, have limitations. To address these limitations and combat identity crime in real time, this paper proposes a new multilayered detection system complemented with two additional layers: communal detection (CD) and spike detection (SD). CD finds real social relationships to reduce the suspicion score, and is tamper-resistant to synthetic social relationships. It is a whitelist-oriented approach on a fixed set of attributes. SD finds spikes in duplicates to increase the suspicion score, and is probe-resistant for attributes. It is an attribute-oriented approach on a variable-size set of attributes. Together, CD and SD can detect more types of attacks, better account for changing legal behavior, and remove redundant attributes. Experiments were carried out on CD and SD with several million real credit applications. Results on the data support the hypothesis that successful credit application fraud patterns are sudden and exhibit sharp spikes in duplicates. Although this research is specific to credit application fraud detection, the concept of resilience, together with the adaptivity and quality data discussed in the paper, is general to the design, implementation, and evaluation of all detection systems.
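To make the idea of "spikes in duplicates" concrete, here is a minimal Python sketch. It is not the paper's SD algorithm: the function name `spike_scores`, the smoothing factor `alpha`, the per-time-step grouping, and the use of an exponentially weighted moving average as the baseline are all assumptions chosen for illustration. It counts duplicate values of a single attribute within each time step and reports how far that count rises above a smoothed baseline of earlier steps.

```python
from collections import Counter


def spike_scores(steps, alpha=0.3):
    """Score each time step by how far its duplicate count exceeds a baseline.

    steps: list of lists; each inner list holds the values of one attribute
           (for example, a phone number) seen in one time step.
    alpha: hypothetical smoothing factor for the moving-average baseline.
    Illustrative only; this is not the method described in the paper.
    """
    baseline = None
    scores = []
    for values in steps:
        counts = Counter(values)
        # Every repeat of a value beyond its first occurrence is a duplicate.
        duplicates = sum(c - 1 for c in counts.values() if c > 1)
        if baseline is None:
            # First step: nothing to compare against yet.
            score = 0.0
            baseline = float(duplicates)
        else:
            # Only increases above the baseline raise the score.
            score = max(0.0, duplicates - baseline)
            baseline = alpha * duplicates + (1 - alpha) * baseline
        scores.append(score)
    return scores


if __name__ == "__main__":
    steps = [
        ["a", "b", "c"],
        ["a", "d", "e"],
        ["f", "f", "g"],
        ["h", "h", "h", "h", "h"],
    ]
    print(spike_scores(steps))
```

In this sketch a sudden burst of identical values in the last step yields the largest score, while steps with no repeated values score zero. A real system would also need to decide which attributes to monitor and how to combine the result with other layers, which this example does not attempt.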
Data mining-based fraud detection, security, data stream mining, anomaly detection.
Clifton Phua, Kate Smith-Miles, Vincent Cheng-Siong Lee, Ross Gayler, "Resilient Identity Crime Detection", IEEE Transactions on Knowledge & Data Engineering, vol.24, no. 3, pp. 533-546, March 2012, doi:10.1109/TKDE.2010.262