2016 IEEE 32nd International Conference on Data Engineering (ICDE) (2016)
May 16, 2016 to May 20, 2016
Shangfu Peng, University of Maryland, United States
Jagan Sankaranarayanan, NEC Labs America, United States
Hanan Samet, University of Maryland, United States
Over the past decades, shortest-distance methods for road networks have focused on reducing the latency of a single source-target pair distance query. Large analytical applications on road networks, including simulations (e.g., evacuation planning), logistics, and transportation planning, require methods that provide high throughput (i.e., distance computations per second) and the ability to "scale out" by using large distributed computing clusters. A framework called SPDO is presented, which implements an extremely fast distributed algorithm for computing road network distance queries on Apache Spark. The approach extends our previous work on the ε-distance oracle, which has now been adapted to use Spark's resilient distributed dataset (RDD). Compared with state-of-the-art methods that focus on reducing latency, the proposed framework improves throughput by at least an order of magnitude, which makes the approach suitable for applications that need to compute thousands to millions of network distances per second.
S. Peng, J. Sankaranarayanan and H. Samet, "SPDO: High-throughput road distance computations on Spark using Distance Oracles," 2016 IEEE 32nd International Conference on Data Engineering (ICDE), Helsinki, Finland, 2016, pp. 1239-1250.