18th International Conference on Data Engineering (ICDE'02)
OSSM: A Segmentation Approach to Optimize Frequency Counting
San Jose, California
February 26-March 01
ISBN: 0-7695-1531-2
Carson Kai-Sang Leung, Raymond T. Ng, Heikki Mannila
Page: 583
Publisher: IEEE Computer Society, Los Alamitos, CA, USA
DOI: 10.1109/ICDE.2002.994776
Computing the frequency of a pattern is one of the key operations in data mining algorithms. We describe a simple yet powerful way of speeding up any form of frequency counting that satisfies the monotonicity condition. Our method, the optimized segment support map (OSSM), is a lightweight structure that partitions the collection of transactions into m segments so as to reduce the number of candidate patterns that require frequency counting. We study two problems: (1) what is the optimal number of segments to use, and (2) given a user-determined m, what is the best segmentation (composition) of the m segments? For Problem 1, we provide a thorough analysis and a theorem establishing the minimum value of m for which no accuracy is lost in using the OSSM. For Problem 2, we develop various segmentation algorithms and heuristics that efficiently generate compact and effective OSSMs.
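The abstract does not spell out how a segment map prunes candidates, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the map stores, for each segment, the count of every single item, and that a pattern's frequency is bounded above by summing, over segments, the smallest per-segment count among its items. The function names (`build_segment_map`, `support_upper_bound`, `exact_support`) and the equal-size contiguous split are illustrative assumptions; the paper's actual segmentation heuristics are not reproduced here.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Set


def build_segment_map(transactions: Sequence[Set[str]], m: int) -> List[Counter]:
    """Split transactions into m contiguous segments (an assumed, naive
    segmentation) and count how often each item occurs in each segment."""
    if m < 1:
        raise ValueError("m must be at least 1")
    n = len(transactions)
    segment_counts = []
    for k in range(m):
        start, end = k * n // m, (k + 1) * n // m
        counts: Counter = Counter()
        for t in transactions[start:end]:
            counts.update(t)  # each item in the transaction counts once
        segment_counts.append(counts)
    return segment_counts


def support_upper_bound(segment_counts: List[Counter], pattern: Iterable[str]) -> int:
    """Upper bound on the pattern's frequency: within a segment, the pattern
    cannot occur more often than its rarest item does."""
    items = list(pattern)
    return sum(min(counts[item] for item in items) for counts in segment_counts)


def exact_support(transactions: Sequence[Set[str]], pattern: Iterable[str]) -> int:
    """Exact frequency by scanning every transaction (the costly step)."""
    items = set(pattern)
    return sum(1 for t in transactions if items <= t)


# Toy data: "a" and "b" never occur together.
transactions = [{"a"}, {"a"}, {"b"}, {"b"}]
bound_one_segment = support_upper_bound(build_segment_map(transactions, 1), {"a", "b"})
bound_two_segments = support_upper_bound(build_segment_map(transactions, 2), {"a", "b"})
```

On this toy data a single segment gives a bound of 2 for the pattern {a, b}, while two segments give a bound of 0, which matches the exact frequency. Under these assumptions, a candidate whose bound falls below the minimum frequency threshold could be discarded without scanning the transactions, which is the kind of saving the abstract describes.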
Index Terms:
Data mining, frequent patterns, support counting, data structure, performance analysis
Citation:
Carson Kai-Sang Leung, Raymond T. Ng, Heikki Mannila, "OSSM: A Segmentation Approach to Optimize Frequency Counting," 18th International Conference on Data Engineering (ICDE'02), p. 583, 2002.
