Issue No.08 - August (2008 vol.20)
pp: 1053-1066
ABSTRACT
Impact-targeted activities are rare but can have a significant impact on society; e.g., isolated terrorism activities may lead to a disastrous event threatening national security. Similar issues arise in many other areas. It is therefore important to identify such activities before they cause significant impact. However, mining impact-targeted activity patterns is challenging because of the imbalanced structure of the data. This paper develops techniques for discovering such activity patterns. First, the complexities of mining imbalanced impact-targeted activities are analyzed. We then discuss strategies for constructing impact-targeted activity sequences. Algorithms are developed to mine frequent positive-impact (P → T) and negative-impact (P → $\bar{T}$) oriented activity patterns, sequential impact-contrasted activity patterns (P is frequently associated with both P → T and P → $\bar{T}$ in separated data sets), and sequential impact-reversed activity patterns (both P → T and PQ → $\bar{T}$ are frequent). Activity impact modelling is also studied to quantify pattern impact on business outcomes. Social security debt-related activity data is used to test the proposed approaches. The outcomes show that they are promising for ISI applications to identify impact-targeted activity patterns in imbalanced data.
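The pattern types named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract gives no formal definitions, so the subsequence-based notion of "contains", the per-data-set support measure, the `min_sup` threshold, the toy data, and all function names are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed toy representation: (activity sequence, True if the target impact T occurred).
Labeled = Tuple[Sequence[str], bool]

def contains(seq: Sequence[str], pattern: Sequence[str]) -> bool:
    """True if pattern occurs in seq in order (not necessarily contiguously)."""
    it = iter(seq)
    return all(activity in it for activity in pattern)

def support_in(data: List[Labeled], pattern: Sequence[str], impact: bool) -> float:
    """Fraction of sequences with the given impact label that contain pattern.

    Assumption: support is measured inside each separated data set
    (target T vs. non-target T-bar), not over the whole data.
    """
    subset = [seq for seq, label in data if label == impact]
    if not subset:
        return 0.0
    return sum(contains(seq, pattern) for seq in subset) / len(subset)

def is_impact_contrasted(data, p, min_sup: float) -> bool:
    """P is frequent with both P -> T and P -> T-bar in the separated data sets."""
    return (support_in(data, p, True) >= min_sup
            and support_in(data, p, False) >= min_sup)

def is_impact_reversed(data, p, q, min_sup: float) -> bool:
    """Both P -> T and PQ -> T-bar are frequent (appending Q flips the impact)."""
    pq = list(p) + list(q)
    return (support_in(data, p, True) >= min_sup
            and support_in(data, pq, False) >= min_sup)

# Hypothetical activity sequences, invented for this sketch.
data: List[Labeled] = [
    (["a", "b", "c"], True),
    (["a", "b"], True),
    (["a", "b", "d"], False),
    (["a", "b", "d"], False),
    (["c", "d"], False),
]

contrasted = is_impact_contrasted(data, ["a", "b"], min_sup=0.5)
reversed_ = is_impact_reversed(data, ["a", "b"], ["d"], min_sup=0.5)
```

In this toy data, P = (a, b) appears in both target sequences and in two of the three non-target sequences, while PQ = (a, b, d) appears only in non-target sequences, so P qualifies under both assumed definitions at `min_sup=0.5`.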
INDEX TERMS
Clustering, classification, and association rules, data mining
CITATION
Longbing Cao, Yanchang Zhao, Chengqi Zhang, "Mining Impact-Targeted Activity Patterns in Imbalanced Data", IEEE Transactions on Knowledge & Data Engineering, vol. 20, no. 8, pp. 1053-1066, August 2008, doi:10.1109/TKDE.2007.190635
REFERENCES