2015 IEEE 31st International Conference on Data Engineering (ICDE) (2015)
Seoul, South Korea
April 13, 2015 to April 17, 2015
ISBN: 978-1-4799-7964-6
pp: 579-590
Yuhong Li, Department of Computer and Information Science, University of Macau, Av. Padre Tomás Pereira Taipa, Macau
Leong Hou U, Department of Computer and Information Science, University of Macau, Av. Padre Tomás Pereira Taipa, Macau
Man Lung Yiu, Department of Computing, Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong
Zhiguo Gong, Department of Computer and Information Science, University of Macau, Av. Padre Tomás Pereira Taipa, Macau
ABSTRACT
Discovering motifs in sequence databases has received considerable attention from both the database and data mining communities, where the motif is the most correlated pair of subsequences in a sequence object. Motif discovery is expensive for emerging applications that may have very long sequences (e.g., a million observations per sequence) or queries that arrive rapidly (e.g., every 10 seconds). Prior work cannot offer fast correlation computation and prune subsequence pairs at the same time, because these two techniques require different orders of examining subsequence pairs. In this work, we propose a novel framework named Quick-Motif, which adopts a two-level approach that enables batch pruning at the outer level and fast correlation calculation at the inner level. We further propose two optimization techniques, one for the outer level and one for the inner level. In our experimental study, our method is up to 3 orders of magnitude faster than the state-of-the-art methods.
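To make the definition of a motif concrete, the following is a minimal brute-force sketch in Python. It is not the Quick-Motif framework and does not reproduce its two-level pruning; it is only the naive quadratic baseline implied by the definition "the most correlated pair of subsequences". The function names, the use of Pearson correlation on z-normalized windows, and the rule that the two subsequences must not overlap are all assumptions made for illustration.

```python
import math


def znorm(window):
    """Z-normalize a window; return None for a constant window (assumed to be skipped)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return None
    return [(x - mean) / std for x in window]


def brute_force_motif(seq, m):
    """Return (correlation, i, j) for the most correlated pair of
    non-overlapping length-m subsequences starting at positions i < j.

    Hypothetical baseline: examines every pair, with no pruning.
    """
    windows = [znorm(seq[i:i + m]) for i in range(len(seq) - m + 1)]
    best = (-2.0, None, None)  # Pearson correlation is always >= -1
    for i, wi in enumerate(windows):
        if wi is None:
            continue
        # Start at i + m so the two subsequences do not overlap (assumption).
        for j in range(i + m, len(windows)):
            wj = windows[j]
            if wj is None:
                continue
            # For z-normalized windows, Pearson correlation is the mean product.
            corr = sum(a * b for a, b in zip(wi, wj)) / m
            if corr > best[0]:
                best = (corr, i, j)
    return best


if __name__ == "__main__":
    data = [0, 1, 3, 1, 0, 5, 5, 2, 9, 7, 0, 1, 3, 1, 0]
    print(brute_force_motif(data, 5))
```

Every pair costs O(m) to evaluate and there are O(n^2) pairs, which is why this baseline becomes impractical for the sequence lengths mentioned in the abstract.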
INDEX TERMS
Force, Silicon
CITATION

Y. Li, L. H. U, M. L. Yiu and Z. Gong, "Quick-motif: An efficient and scalable framework for exact motif discovery," 2015 IEEE 31st International Conference on Data Engineering (ICDE), Seoul, South Korea, 2015, pp. 579-590.
doi:10.1109/ICDE.2015.7113316