2007 IEEE 13th International Symposium on High Performance Computer Architecture
Evaluating MapReduce for Multi-core and Multiprocessor Systems
Scottsdale, AZ, USA
February 10-14, 2007
ISBN: 1-4244-0804-0
Colby Ranger, Computer Systems Laboratory, Stanford University. Email: cranger@stanford.edu
Ramanan Raghuraman, Computer Systems Laboratory, Stanford University. Email: ramananr@stanford.edu
Arun Penmetsa, Computer Systems Laboratory, Stanford University. Email: penmetsa@stanford.edu
Gary Bradski, Computer Systems Laboratory, Stanford University. Email: garybradski@gmail.com
Christos Kozyrakis, Computer Systems Laboratory, Stanford University. Email: christos@ee.stanford.edu
This paper evaluates the suitability of the MapReduce model for multi-core and multiprocessor systems. MapReduce was created by Google for application development on data centers with thousands of servers. It allows programmers to write functional-style code that is automatically parallelized and scheduled in a distributed system. We describe Phoenix, an implementation of MapReduce for shared-memory systems that includes a programming API and an efficient runtime system. The Phoenix runtime automatically manages thread creation, dynamic task scheduling, data partitioning, and fault tolerance across processor nodes. We study Phoenix with multi-core and symmetric multiprocessor systems and evaluate its performance potential and error recovery features. We also compare MapReduce code to code written in lower-level APIs such as Pthreads. Overall, we establish that, given a careful implementation, MapReduce is a promising model for scalable performance on shared-memory systems with simple parallel code.
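As a rough illustration of the functional style the abstract refers to, the following minimal sketch runs a word count as map and reduce tasks on a thread pool in a single shared-memory process. It is not the Phoenix API: the names `map_words`, `reduce_counts`, and `map_reduce` are hypothetical, the input chunks are made up, and Python threads are used here only to show the structure of the model, not its performance.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_counts(item):
    """Reduce: sum all values emitted for a single key."""
    key, values = item
    return key, sum(values)


def map_reduce(chunks, map_fn, reduce_fn, workers=4):
    """Hypothetical driver (an assumption, not Phoenix): the caller supplies
    only map_fn and reduce_fn; task distribution is handled here."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map phase: one task per input chunk, scheduled by the pool.
        mapped = list(pool.map(map_fn, chunks))

        # Group intermediate pairs by key in shared memory.
        groups = defaultdict(list)
        for pairs in mapped:
            for key, value in pairs:
                groups[key].append(value)

        # Reduce phase: one task per distinct key.
        return dict(pool.map(reduce_fn, groups.items()))


# Made-up input, already partitioned into chunks.
chunks = ["the quick brown fox", "the lazy dog", "The end"]
counts = map_reduce(chunks, map_words, reduce_counts)
```

The user-written functions contain no threading or synchronization code. That separation is the property the abstract attributes to the model, with the runtime taking responsibility for thread management and scheduling.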
Citation:
Colby Ranger, Ramanan Raghuraman, Arun Penmetsa, Gary Bradski, Christos Kozyrakis, "Evaluating MapReduce for Multi-core and Multiprocessor Systems," 2007 IEEE 13th International Symposium on High Performance Computer Architecture (HPCA), pp. 13-24, 2007.