2013 International Conference on Computing, Networking and Communications (ICNC) (2012)
Okinawa, Japan
Dec. 5, 2012 to Dec. 7, 2012
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICNC.2012.53
Hadoop, which consists of Hadoop MapReduce and the Hadoop Distributed File System (HDFS), is a platform for large-scale data storage and processing. Distributed processing has become common as the volume of data grows rapidly worldwide and the scale of processing increases, and Hadoop has therefore attracted many cloud computing enterprises and technology enthusiasts. As a result, the Hadoop user base is expanding. Our research aims to speed up the execution of Hadoop jobs. In this paper, we propose dynamic processing slot scheduling for I/O-intensive Hadoop MapReduce jobs, focusing on the I/O wait that occurs during job execution. When a high rate of CPU I/O wait is detected on an active TaskTracker node, free slots are added to that node and more tasks are assigned to them, which improves CPU utilization. We implemented our method on Hadoop 1.0.3 and observed an improvement of up to about 23% in execution time.
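The slot-adjustment idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the thresholds, the slot limits, the sampling loop, and the names `SlotPolicy`, `next_slot_count`, and `simulate` are all hypothetical, and the step that removes added slots when I/O wait falls is an assumption the abstract does not state.

```python
from dataclasses import dataclass


@dataclass
class SlotPolicy:
    """Hypothetical tuning knobs; none of these values come from the paper."""
    base_slots: int = 2        # statically configured slots per node
    max_extra_slots: int = 2   # upper bound on dynamically added slots
    high_iowait: float = 0.30  # add a slot at or above this I/O-wait fraction
    low_iowait: float = 0.10   # drop an added slot at or below this fraction


def next_slot_count(current_slots: int, iowait: float, policy: SlotPolicy) -> int:
    """Return the slot count a node should use after one sampling period."""
    if not 0.0 <= iowait <= 1.0:
        raise ValueError("iowait must be a fraction between 0 and 1")
    ceiling = policy.base_slots + policy.max_extra_slots
    # High I/O wait: the CPU is idle waiting on disk, so allow one more task.
    if iowait >= policy.high_iowait and current_slots < ceiling:
        return current_slots + 1
    # Low I/O wait (assumed behavior): give back a previously added slot.
    if iowait <= policy.low_iowait and current_slots > policy.base_slots:
        return current_slots - 1
    return current_slots


def simulate(samples, policy=None):
    """Apply the policy to a sequence of I/O-wait samples for one node."""
    policy = policy or SlotPolicy()
    slots = policy.base_slots
    history = []
    for iowait in samples:
        slots = next_slot_count(slots, iowait, policy)
        history.append(slots)
    return history
```

With the default policy, `simulate([0.45, 0.50, 0.40, 0.20, 0.05, 0.05, 0.05])` grows the node from two slots to the four-slot ceiling while I/O wait stays high, then shrinks it back to two once I/O wait drops. Using two thresholds rather than one is a design choice in this sketch to avoid oscillating when the measurement hovers near a single cutoff.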
Slots Scheduling, Hadoop, MapReduce, Scheduling algorithm
Shiori Kurazumi, Tomoaki Tsumura, Shoichi Saito, Hiroshi Matsuo, "Dynamic Processing Slots Scheduling for I/O Intensive Jobs of Hadoop MapReduce", 2013 International Conference on Computing, Networking and Communications (ICNC), pp. 288-292, 2012, doi:10.1109/ICNC.2012.53