2003 IEEE/WIC International Conference on Web Intelligence (WI'03)
Incremental Document Clustering Using Cluster Similarity Histograms
Halifax, Canada
October 13-October 17
ISBN: 0-7695-1932-6
Clustering of large collections of text documents is a key process in providing a higher level of knowledge about the underlying inherent classification of the documents. Web documents, in particular, are of great interest since managing, accessing, searching, and browsing large repositories of web content requires efficient organization. Incremental clustering algorithms are always preferred to traditional clustering techniques, since they can be applied in a dynamic environment such as the Web. An incremental document clustering algorithm is introduced in this paper, which relies only on pair-wise document similarity information. Clusters are represented using a Cluster Similarity Histogram, a concise statistical representation of the distribution of similarities within each cluster, which provides a measure of cohesiveness. The measure guides the incremental clustering process. Complexity analysis and experimental results are discussed and show that the algorithm requires less computational time than standard methods while achieving a comparable or better clustering quality.
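The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact cohesiveness measure or the assignment rule, so everything concrete below is an assumption made for illustration: cohesiveness is taken to be the fraction of pairwise similarities in a cluster at or above `threshold`, a document joins a cluster only if that fraction stays at or above `min_ratio` and does not fall by more than `max_drop`, and cosine similarity over term-weight dictionaries stands in for whatever pairwise similarity the paper uses. All function, class, and parameter names are hypothetical.

```python
def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sum(w * w for w in a.values()) ** 0.5
    nb = sum(w * w for w in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0


def histogram(sims, bins=10):
    """Bin similarities in [0, 1] into equal-width bins."""
    counts = [0] * bins
    for s in sims:
        index = min(max(int(s * bins), 0), bins - 1)  # clamp to a valid bin
        counts[index] += 1
    return counts


def cohesiveness(sims, threshold=0.3):
    """Assumed measure: share of pairwise similarities at or above threshold."""
    if not sims:
        return 1.0  # a singleton cluster is treated as fully cohesive
    return sum(1 for s in sims if s >= threshold) / len(sims)


class Cluster:
    """Holds member documents and all pairwise similarities among them."""

    def __init__(self):
        self.docs = []
        self.sims = []

    def sims_to(self, doc):
        return [cosine(doc, member) for member in self.docs]

    def add(self, doc, new_sims):
        self.docs.append(doc)
        self.sims.extend(new_sims)


def cluster_incrementally(docs, threshold=0.3, min_ratio=0.5, max_drop=0.1):
    """Assign each arriving document using only pairwise similarities."""
    clusters = []
    for doc in docs:
        best = None  # (cluster, cohesiveness after adding, new similarities)
        for cluster in clusters:
            new_sims = cluster.sims_to(doc)
            before = cohesiveness(cluster.sims, threshold)
            after = cohesiveness(cluster.sims + new_sims, threshold)
            keeps_cohesion = after >= min_ratio and after >= before - max_drop
            if keeps_cohesion and (best is None or after > best[1]):
                best = (cluster, after, new_sims)
        if best is None:
            cluster = Cluster()  # no existing cluster accepts the document
            cluster.add(doc, [])
            clusters.append(cluster)
        else:
            best[0].add(doc, best[2])
    return clusters
```

Each document is processed once as it arrives, which is what makes the scheme incremental. Calling `histogram(cluster.sims)` on any resulting cluster gives the binned distribution of its internal similarities.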
Citation:
Khaled M. Hammouda, Mohamed S. Kamel, "Incremental Document Clustering Using Cluster Similarity Histograms," 2003 IEEE/WIC International Conference on Web Intelligence (WI'03), p. 597, IEEE Computer Society, 2003. DOI: 10.1109/WI.2003.1241276
