2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) (2018)
Washington, DC, USA
May 1, 2018 to May 4, 2018
Graphlet counting is a methodology for detecting local structural properties of large graphs that has been in use for over a decade. Despite tremendous effort in optimizing its performance, even 3- and 4-node graphlet counting routines may run for hours or days on highly optimized systems. In this paper, we describe how a synergistic combination of approximate computing with parallel computing can result in multiplicative performance improvements in graphlet counting runtimes with minimal and controllable loss of accuracy. Specifically, we describe two novel techniques, multi-phased sampling for statistical accuracy guarantees and cost-aware sampling to further improve performance on multi-machine runs, which reduce the query time on large graphs from tens of hours to several minutes or seconds with less than 1% relative error.
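To make the idea of sampling in phases until an accuracy target is met more concrete, the following is a minimal sketch for the simplest case, triangle (3-node graphlet) counting. It is not the ApproxG algorithm: it assumes a standard uniform edge-sampling estimator and a normal-approximation confidence interval as the stopping rule, and every name in it (`build_adjacency`, `exact_triangles`, `approx_triangles`, `rel_err`, `batch`, `max_phases`) is hypothetical and chosen for illustration only.

```python
import math
import random


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def exact_triangles(edges, adj):
    """Exact count: each triangle is seen once from each of its 3 edges."""
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def approx_triangles(edges, adj, rel_err=0.05, z=1.96,
                     batch=200, max_phases=50, seed=0):
    """Estimate the triangle count by sampling edges in phases.

    Assumption (not taken from the paper): after each phase of `batch`
    samples, stop once the half-width of the normal-approximation
    confidence interval is at most `rel_err` times the running mean.
    Returns (estimate, samples_used, phases_used).
    """
    rng = random.Random(seed)
    m = len(edges)
    n = 0
    total = 0.0
    total_sq = 0.0
    mean = 0.0
    phase = 0
    for phase in range(1, max_phases + 1):
        for _ in range(batch):
            u, v = edges[rng.randrange(m)]   # uniform edge, with replacement
            x = len(adj[u] & adj[v])         # triangles through this edge
            n += 1
            total += x
            total_sq += x * x
        mean = total / n
        # Unbiased sample variance, clamped against rounding below zero.
        var = max(total_sq / n - mean * mean, 0.0) * n / (n - 1)
        half_width = z * math.sqrt(var / n)
        if mean > 0 and half_width <= rel_err * mean:
            break
    # The sum over all edges of per-edge triangle counts equals 3 * T.
    return m * mean / 3, n, phase
```

On a complete graph every edge lies in the same number of triangles, so the sample variance is zero and the sketch stops after its first phase with the exact count. On less regular graphs the number of phases grows with the variance of the per-edge counts, which is the kind of accuracy-versus-cost trade-off the abstract refers to. The cost-aware sampling for multi-machine runs is not covered by this sketch.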
graph theory, network theory (graphs), optimisation
D. Mawhirter, B. Wu, D. Mehta and C. Ai, "ApproxG: Fast Approximate Parallel Graphlet Counting Through Accuracy Control," 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), Washington, DC, USA, 2018, pp. 533-542.