2011 IEEE 11th International Conference on Data Mining Workshops (2011)

Vancouver, Canada

Dec. 11, 2011 to Dec. 11, 2011

ISBN: 978-0-7695-4409-0

pp: 180-187

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2011.84

ABSTRACT

Understanding how nodes interconnect in large graphs is an important problem in many fields. We wish to find connecting nodes between two nodes or two groups of source nodes. To find these connecting nodes in huge graphs, we have devised a highly parallelized variant of a k-shortest path algorithm that leverages the power of the Hadoop distributed computing system and the HBase distributed key/value store. We show how our system enables previously unobtainable graph analysis by finding these connecting nodes in graphs of one billion nodes or more on modest commodity hardware in a time frame of just minutes.

INDEX TERMS

hadoop, distributed computing, shortest paths, bfs, algorithm

CITATION

A. Rahman, A. Levine, C. McCubbin and B. Perozzi, "Finding the 'Needle': Locating Interesting Nodes Using the K-shortest Paths Algorithm in MapReduce," *2011 IEEE 11th International Conference on Data Mining Workshops (ICDMW)*, Vancouver, Canada, 2011, pp. 180-187. doi:10.1109/ICDMW.2011.84
