This Article 
   
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Graph Twiddling in a MapReduce World
July/August 2009 (vol. 11 no. 4)
pp. 29-41
Jonathan Cohen, US National Security Agency
As the size of graphs for analysis continues to grow, methods of graph processing that scale well have become increasingly important. One way to handle large datasets is to disperse them across an array of networked computers, each of which implements simple sorting and accumulating, or MapReduce, operations. This cloud computing approach offers many attractive features. If decomposing useful graph operations in terms of MapReduce cycles is possible, it provides incentive for seriously considering cloud computing. Moreover, it offers a way to handle a large graph on a single machine that can't hold the entire graph as well as enables streaming graph processing. This article examines this possibility.

1. J. Dean, and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," Comm. ACM, vol. 51, no. 1, 2008, pp. 107–112.
2. GoogleDevelopers, "Lecture 5: Parallel Graph Algorithms with MapReduce,"28 Aug. 2007; http://youtube.comwatch?v=BT-piFBP4fE.
3. J.L. Gross and J. Yellen, Handbook of Graph Theory, CRC Press, 2004.
4. J.D. Cohen, "Trusses: Cohesive Subgraphs for Social Network Analysis," 2008; http://www2.computer.org/cms/Computer.org/ dl/mags/cs/2009/04/extrasmsp2009040029s1.pdf .
5. J.D. Cohen, "Barycentric Graph Clustering," 2008; http://www2.computer.org/cms/Computer.org/ dl/mags/cs/2009/04/extrasmsp2009040029s2.pdf .

Index Terms:
graphs, networks, cloud computing, MapReduce, Hadoop, social network analysis, trusses, cycles, clustering
Citation:
Jonathan Cohen, "Graph Twiddling in a MapReduce World," Computing in Science and Engineering, vol. 11, no. 4, pp. 29-41, July-Aug. 2009, doi:10.1109/MCSE.2009.120
Usage of this product signifies your acceptance of the Terms of Use.