Computer Science and Information Engineering, World Congress on (2009)
Los Angeles, California USA
Mar. 31, 2009 to Apr. 2, 2009
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CSIE.2009.217
Outlier detection is the process of detecting the data objects which are grossly different from or inconsistent with the remaining set of data. Some of the important applications in the field of data mining are fraud detection, customer behavior analysis, and intrusion detection. There are number of good research algorithms for detecting outliers if the entire data is available and algorithms can operate in more than single passes to achieve the required results. Among the existing methods, LOF (Local outlier Factor) a density based method is very efficient in detecting all forms of outliers. LOF algorithm can not be directly applied to the datastream as the large number of nearest neighbor searches, LOF computation and lrd (local reachability distances) can make it highly inefficient for datastream. In this paper we propose a cluster based partitioning algorithm which can divide the stream in safe region and candidate regions. In Second phase apply LOF algorithm over these partitions separately with some slight enhancement for LOF computation over candidate region to achieve accurate results for finding most outstanding outliers. Several experiments on different dataset confirm that our technique can find better outliers with low computational cost than the direct LOF or compared to the other enhancements proposed for LOF.
outlier, datastream, mining, LOF
K. Li, W. Nisar, H. Wang, X. Lv and M. Elahi, "Detection of Local Outlier over Dynamic Data Streams Using Efficient Partitioning Method," 2009 WRI World Congress on Computer Science and Information Engineering, CSIE(CSIE), Los Angeles, CA, 2009, pp. 76-81.