DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2011.120
Text clustering is an established technique for improving quality in information retrieval, in both centralized and distributed environments. However, traditional text clustering algorithms fail to scale in highly distributed environments, such as peer-to-peer networks. Our algorithm for peer-to-peer clustering achieves high scalability by using a probabilistic approach for assigning documents to clusters. It enables a peer to compare each of its documents with only a few selected clusters, without significant loss of clustering quality. The algorithm offers probabilistic guarantees for the correctness of each document assignment to a cluster. An extensive experimental evaluation with up to 1 million peers and 1 million documents demonstrates the scalability and effectiveness of the algorithm.
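The idea of comparing a document with only a few selected clusters can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the abstract does not say how clusters are selected, so the sketch assumes a simple index from each cluster's highest-weighted terms to cluster identifiers, and it does not compute the probabilistic correctness guarantees. All names (`build_term_index`, `assign`, `top_k`) and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_term_index(centroids, top_k=2):
    """Assumption: each cluster is indexed under its top_k heaviest terms."""
    index = defaultdict(set)
    for cluster_id, centroid in centroids.items():
        top_terms = sorted(centroid, key=centroid.get, reverse=True)[:top_k]
        for term in top_terms:
            index[term].add(cluster_id)
    return index


def assign(doc, centroids, index, top_k=2):
    """Compare doc only with clusters that share one of its top_k terms.

    Returns (best_cluster_id, number_of_clusters_compared), or (None, 0)
    when no candidate cluster is found.
    """
    top_terms = sorted(doc, key=doc.get, reverse=True)[:top_k]
    candidates = set()
    for term in top_terms:
        candidates |= index.get(term, set())
    if not candidates:
        return None, 0
    best = max(candidates, key=lambda cid: cosine(doc, centroids[cid]))
    return best, len(candidates)


# Toy data, invented for illustration only.
centroids = {
    "sports": {"football": 5, "goal": 3, "team": 2},
    "finance": {"stock": 6, "market": 4, "bank": 1},
    "cooking": {"recipe": 5, "oven": 3, "salt": 1},
}
index = build_term_index(centroids)
doc = Counter("stock market stock bank stock market goal".split())
print(assign(doc, centroids, index))
```

In this toy run the document is compared with one of the three clusters instead of all of them. The saving comes at the risk of missing the best cluster when it shares none of the document's top terms, which is the kind of error the paper's probabilistic guarantees are meant to bound.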
Index Terms:
Clustering algorithms, Peer to peer computing, Probabilistic logic, Frequency estimation, Indexing, Computational modeling, Text clustering, Distributed clustering
Citation:
Odysseas Papapetrou, Wolf Siberski, Norbert Fuhr, "Decentralized Probabilistic Text Clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 10, pp. 1848-1861, Oct. 2012, doi:10.1109/TKDE.2011.120