Halite: Fast and Scalable Multiresolution Local-Correlation Clustering
Feb. 2013 (vol. 25 no. 2)
pp. 387-401
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2011.176
This paper proposes Halite, a novel, fast, and scalable clustering method that looks for clusters in subspaces of multidimensional data. Existing methods are typically superlinear in space or execution time. Halite's strengths are that it is fast and scalable while still giving highly accurate results. Specifically, the main contributions of Halite are: 1) Scalability: it is linear or quasi-linear in time and space with respect to the data size and dimensionality, and to the dimensionality of the clusters' subspaces; 2) Usability: it is deterministic, robust to noise, does not take the number of clusters as an input parameter, and detects clusters in subspaces generated by the original axes or by their linear combinations, including space rotation; 3) Effectiveness: it is accurate, providing results of equal or better quality compared to top related works; and 4) Generality: it includes a soft clustering approach. Experiments were performed on synthetic data ranging from five to 30 axes and up to 1 million points. Halite was on average at least 12 times faster than seven representative works, and always presented highly accurate results. On real data, Halite was at least 11 times faster than the others, improving on their accuracy by up to 35 percent. Finally, we report experiments in a real scenario where soft clustering is desirable.
Index Terms:
Shape, Correlation, Laplace equations, Convolution, Proposals, Accuracy, Complexity theory, data mining, Local-correlation clustering, moderate-to-high dimensional data
Citation:
Robson L.F. Cordeiro, Agma J.M. Traina, Christos Faloutsos, Caetano Traina Jr., "Halite: Fast and Scalable Multiresolution Local-Correlation Clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 2, pp. 387-401, Feb. 2013, doi:10.1109/TKDE.2011.176

