DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2011.243
We study the problem of clustering probabilistic graphs. Similar to the problem of clustering standard graphs, probabilistic graph clustering has numerous applications, such as finding complexes in probabilistic protein-protein interaction (PPI) networks and discovering groups of users in affiliation networks. We extend the edit-distance-based definition of graph clustering to probabilistic graphs. We establish a connection between our objective function and correlation clustering to propose practical approximation algorithms for our problem. A benefit of our approach is that our objective function is parameter-free. Therefore, the number of clusters is part of the output. We also develop methods for testing the statistical significance of the output clustering and study the case of noisy clusterings. Using a real protein-protein interaction network and ground-truth data, we show that our methods discover the correct number of clusters and identify established protein relationships. Finally, we show the practicality of our techniques using a large social network of Yahoo! users consisting of one billion edges.
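As an illustration of the edit-distance objective and its link to correlation clustering, the following is a minimal sketch rather than the authors' implementation. It assumes the objective is the expected number of edge edits needed to turn the probabilistic graph into a cluster graph: a pair placed in the same cluster costs 1 - p (the edge may be missing), and a pair split across clusters costs p (the edge may be present). The pivot heuristic shown is a generic correlation-clustering-style procedure that groups a randomly chosen pivot with every remaining node whose edge probability to it exceeds 0.5; the function names, the 0.5 threshold, and the toy graph are assumptions made for this example.

```python
import random
from itertools import combinations


def expected_edit_distance(nodes, prob, clusters):
    """Expected number of edge edits between a probabilistic graph and a clustering.

    prob maps frozenset({u, v}) to the edge probability; missing pairs count as 0.
    """
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    cost = 0.0
    for u, v in combinations(nodes, 2):
        p = prob.get(frozenset((u, v)), 0.0)
        # Same cluster: pay for the chance the edge is absent.
        # Different clusters: pay for the chance the edge is present.
        cost += (1.0 - p) if label[u] == label[v] else p
    return cost


def pivot_clustering(nodes, prob, rng=None):
    """Pivot-style heuristic (assumed for illustration): no cluster count is given."""
    rng = rng or random.Random(0)
    order = list(nodes)
    rng.shuffle(order)
    remaining = set(order)
    clusters = []
    for pivot in order:
        if pivot not in remaining:
            continue  # already absorbed by an earlier pivot
        cluster = {pivot} | {
            v for v in remaining
            if v != pivot and prob.get(frozenset((pivot, v)), 0.0) > 0.5
        }
        clusters.append(cluster)
        remaining -= cluster
    return clusters


# Toy probabilistic graph (hypothetical data).
nodes = ["a", "b", "c", "d", "e"]
prob = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.8,
    frozenset(("b", "c")): 0.9,
    frozenset(("d", "e")): 0.7,
    frozenset(("c", "d")): 0.1,
}

clusters = pivot_clustering(nodes, prob)
cost = expected_edit_distance(nodes, prob, clusters)
print(sorted(sorted(c) for c in clusters), round(cost, 3))
```

Because the objective only trades off the two kinds of expected edits, the number of clusters falls out of the procedure instead of being supplied as a parameter, which mirrors the parameter-free property claimed in the abstract.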
Index Terms:
Probabilistic logic, Clustering algorithms, Approximation algorithms, Partitioning algorithms, Proteins, Data mining, Approximation methods, probabilistic databases, Uncertain data, probabilistic graphs, clustering algorithms
Citation:
George Kollios, Michalis Potamias, Evimaria Terzi, "Clustering Large Probabilistic Graphs," IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 2, pp. 325-336, Feb. 2013, doi:10.1109/TKDE.2011.243

