Issue No. 4, July/August 2010 (vol. 30)
pp. 55-64
Peng Wang, Institute of Computing Technology, Chinese Academy of Sciences
Dan Meng, Institute of Computing Technology, Chinese Academy of Sciences
Jizhong Han, Institute of Computing Technology, Chinese Academy of Sciences
Jianfeng Zhan, Institute of Computing Technology, Chinese Academy of Sciences
Bibo Tu, Institute of Computing Technology, Chinese Academy of Sciences
Xiaofeng Shi, Tencent Corporation
Le Wan, Tencent Corporation
ABSTRACT
Cloud computing drives the design and development of diverse programming models for massive data processing. The Transformer programming framework aims to facilitate the building of diverse data-parallel programming models. Transformer has two layers: a common runtime system and a model-specific system. Using Transformer, the authors show how to implement three programming models: Dryad-like data flow, MapReduce, and All-Pairs.
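The two-layer split described in the abstract can be illustrated with a small, purely hypothetical sketch. It is not Transformer's actual API, which the abstract does not show. It assumes the common layer is a message-passing runtime (in the spirit of the actor model listed in the index terms) and builds a toy, single-process MapReduce on top of it. The names `Runtime`, `map_reduce`, and `word_count` are all invented for illustration.

```python
from collections import defaultdict, deque


class Runtime:
    """Hypothetical common layer: delivers queued messages to named handlers."""

    def __init__(self):
        self.handlers = {}
        self.mailbox = deque()

    def register(self, name, handler):
        self.handlers[name] = handler

    def send(self, target, payload):
        self.mailbox.append((target, payload))

    def run(self):
        # Deliver messages in FIFO order until none remain.
        while self.mailbox:
            target, payload = self.mailbox.popleft()
            self.handlers[target](payload)


def map_reduce(runtime, inputs, mapper, reducer):
    """Hypothetical model-specific layer: MapReduce expressed as message handlers."""
    groups = defaultdict(list)

    def on_map(record):
        # Each mapped pair is forwarded to the shuffle handler as a message.
        for key, value in mapper(record):
            runtime.send("shuffle", (key, value))

    def on_shuffle(pair):
        key, value = pair
        groups[key].append(value)

    runtime.register("map", on_map)
    runtime.register("shuffle", on_shuffle)
    for record in inputs:
        runtime.send("map", record)
    runtime.run()
    # Reduce runs once all messages have been delivered.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count(lines):
    return map_reduce(
        Runtime(),
        lines,
        mapper=lambda line: [(word, 1) for word in line.split()],
        reducer=lambda key, values: sum(values),
    )
```

Calling `word_count(["a b a", "b c"])` counts words across both lines. In this sketch, a different model such as All-Pairs would reuse `Runtime` unchanged and supply only its own handlers, which is the separation the abstract attributes to the two layers.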
INDEX TERMS
cloud computing, data intensive computing, programming model, data flow, MapReduce, All-Pairs, actor model
CITATION
Peng Wang, Dan Meng, Jizhong Han, Jianfeng Zhan, Bibo Tu, Xiaofeng Shi, Le Wan, "Transformer: A New Paradigm for Building Data-Parallel Programming Models", IEEE Micro, vol.30, no. 4, pp. 55-64, July/August 2010, doi:10.1109/MM.2010.75
REFERENCES
1. J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," Proc. 6th Conf. Symp. Operating Systems Design & Implementation, Usenix Assoc. Press, vol. 6, 2004, pp. 137-150.
2. M. Isard et al., "Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks," ACM SIGOPS Operating Systems Rev., vol. 41, no. 3, 2007, pp. 59-72.
3. B. Hindman et al., "A Common Substrate for Cluster Computing," Proc. HotCloud Workshop Hot Topics in Cloud Computing, Usenix Assoc. Press, 2009, pp. 91-95.
4. C. Moretti et al., "All-Pairs: An Abstraction for Data-Intensive Cloud Computing," Proc. Int'l Parallel and Distributed Processing Symp., IEEE Press, 2008, pp. 1-11.
5. G. Malewicz et al., "Pregel: A System for Large-Scale Graph Processing," Proc. 28th ACM Symp. Principles of Distributed Computing, ACM Press, 2009, p. 6.
6. R. Pike et al., "Interpreting the Data: Parallel Analysis with Sawzall," Scientific Programming, vol. 13, no. 4, 2005, pp. 277-298.
7. C. Olston et al., "Pig Latin: A Not-So-Foreign Language for Data Processing," Proc. 2008 ACM SIGMOD Int'l Conf. Management of Data, ACM Press, 2008, pp. 1099-1110.
8. A. Thusoo et al., "Hive—Warehousing Solution over a Map-Reduce Framework," Proc. Int'l Conf. Very Large Data Bases (VLDB), vol. 2, no. 2, VLDB Endowment, 2009, pp. 1626-1629.
9. R. Chaiken et al., "Scope: Easy and Efficient Parallel Processing of Massive Datasets," Proc. Int'l Conf. Very Large Data Bases (VLDB), vol. 1, no. 2, VLDB Endowment, 2008, pp. 1265-1276.
10. Y. Yu et al., "DryadLINQ: A System for General-Purpose Distributed Data-Parallel Computing Using a High-Level Language," Proc. 8th Symp. Operating Systems Design and Implementation, Usenix Assoc. Press, 2008, pp. 1-14.
11. D. A. Patterson, "Technical Perspective: The Data Center is the Computer," Comm. ACM, vol. 51, no. 1, 2008, p. 105.
12. L.A. Barroso and U. Hölzle, The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines, Morgan & Claypool Publishers, 2009.
13. W.M. Johnston, J.R.P. Hanna, and R.J. Millar, "Advances in Dataflow Programming Languages," ACM Computing Surveys, vol. 36, no. 1, 2004, pp. 1-34.
14. G. Agha, Actors: A Model of Concurrent Computation in Distributed Systems, MIT Press, 1986.