Javier Diaz, Camelia Muñoz-Caro, Alfonso Niño, "A Survey of Parallel Programming Models and Tools in the Multi and Many-Core Era," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 8, pp. 1369-1386, Aug. 2012.

BibTeX:

@article{10.1109/TPDS.2011.308,
  author = {Javier Diaz and Camelia Muñoz-Caro and Alfonso Niño},
  title = {A Survey of Parallel Programming Models and Tools in the Multi and Many-Core Era},
  journal = {IEEE Transactions on Parallel and Distributed Systems},
  volume = {23},
  number = {8},
  issn = {1045-9219},
  year = {2012},
  pages = {1369-1386},
  doi = {http://doi.ieeecomputersociety.org/10.1109/TPDS.2011.308},
  publisher = {IEEE Computer Society},
  address = {Los Alamitos, CA, USA},
}

RefWorks Procite/RefMan/Endnote:

TY - JOUR
JO - IEEE Transactions on Parallel and Distributed Systems
TI - A Survey of Parallel Programming Models and Tools in the Multi and Many-Core Era
IS - 8
SN - 1045-9219
SP - 1369
EP - 1386
EPD - 1369-1386
A1 - Javier Diaz
A1 - Camelia Muñoz-Caro
A1 - Alfonso Niño
PY - 2012
KW - Parallelism and concurrency
KW - distributed programming
KW - heterogeneous (hybrid) systems.
VL - 23
JA - IEEE Transactions on Parallel and Distributed Systems
ER -

[1] D. Kirk and W. Hwu, Programming Massively Parallel Processors: A Hands-on Approach. Morgan Kaufmann, 2010.
[2] W. Hwu, K. Keutzer, and T.G. Mattson, "The Concurrency Challenge," IEEE Design and Test of Computers, vol. 25, no. 4, pp. 312-320, July 2008.
[3] H. Sutter and J. Larus, "Software and the Concurrency Revolution," ACM Queue, vol. 3, no. 7, pp. 54-62, 2005.
[4] W.-C. Feng and P. Balaji, "Tools and Environments for Multicore and Many-Core Architectures," Computer, vol. 42, no. 12, pp. 26-27, Dec. 2009.
[5] R.R. Loka, W.-C. Feng, and P. Balaji, "Serial Computing Is Not Dead," Computer, vol. 43, no. 9, pp. 6-9, Mar. 2010.
[6] J. Dongarra, I. Foster, G. Fox, W. Gropp, K. Kennedy, L. Torczon, and A. White, The Sourcebook of Parallel Computing. Morgan Kaufmann Publishers, 2003.
[7] H. Kasim, V. March, R. Zhang, and S. See, "Survey on Parallel Programming Model," Proc. IFIP Int'l Conf. Network and Parallel Computing, vol. 5245, pp. 266-275, Oct. 2008.
[8] M.J. Sottile, T.G. Mattson, and C.E. Rasmussen, Introduction to Concurrency in Programming Languages. CRC Press, 2010.
[9] G.R. Andrews, Foundations of Multithreaded, Parallel, and Distributed Programming. Addison-Wesley, 1999.
[10] T.G. Mattson, B.A. Sanders, and B. Massingill, Patterns for Parallel Programming. Addison-Wesley Professional, 2005.
[11] V. Kumar, A. Grama, A. Gupta, and G. Karypis, Introduction to Parallel Computing: Design and Analysis of Algorithms. Benjamin/Cummings Publishing Company, 1994.
[12] G. Fox, M. Johnson, G. Lyzenga, S. Otto, J. Salmon, and D. Walker, Solving Problems on Concurrent Processors, vol. 1. Prentice Hall, 1988.
[13] M.J. Quinn, Parallel Computing: Theory and Practice. McGraw-Hill, 1994.
[14] P.B. Hansen, Studies in Computational Science: Parallel Programming Paradigms. Prentice-Hall, 1995.
[15] K.M. Chandy and J. Misra, Parallel Program Design: A Foundation. Addison-Wesley, 1988.
[16] OpenMP, "API Specification for Parallel Programming," http://openmp.org/wp/openmp-specifications, Oct. 2011.
[17] W. Gropp, S. Huss-Lederman, A. Lumsdaine, E. Lusk, B. Nitzberg, W. Saphir, and M. Snir, MPI: The Complete Reference, the MPI-2 Extensions, vol. 2. The MIT Press, Sept. 1998.
[18] K. Kedia, "Hybrid Programming with OpenMP and MPI," Technical Report 18.337J, Massachusetts Inst. of Technology, May 2009.
[19] D.A. Jacobsen, J.C. Thibault, and I. Senocak, "An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters," Proc. 48th AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition, Jan. 2010.
[20] C.-T. Yang, C.-L. Huang, and C.-F. Li, "Hybrid CUDA, OpenMP, and MPI Parallel Programming on Multicore GPU Clusters," Computer Physics Comm., vol. 182, no. 1, 2011.
[21] POSIX 1003.1 FAQ, http://www.opengroup.org/austin/papers/posix_faq.html, Oct. 2011.
[22] D.R. Butenhof, Programming with POSIX Threads. Addison-Wesley, 1997.
[23] IEEE, "IEEE P1003.1c/D10: Draft Standard for Information Technology - Portable Operating Systems Interface (POSIX)," Sept. 1994.
[24] A. Grama, G. Karypis, V. Kumar, and A. Gupta, Introduction to Parallel Computing, second ed. Addison-Wesley, 2003.
[25] B. Chapman, G. Jost, and R. van der Pas, Using OpenMP: Portable Shared Memory Parallel Programming. MIT Press, 2007.
[26] OpenMP 3.0 Specification, http://www.openmp.org/mp-documents/spec30.pdf, Oct. 2011.
[27] P.S. Pacheco, Parallel Programming with MPI. Morgan Kaufmann, 1996.
[28] W. Gropp, E. Lusk, and A. Skjellum, Using MPI: Portable Parallel Programming with the Message-Passing Interface, second ed. MIT Press, 1999.
[29] W. Gropp, E. Lusk, and R. Thakur, Using MPI-2: Advanced Features of the Message-Passing Interface. MIT Press, 1999.
[30] Globus, http://www.globus.org, Oct. 2011.
[31] Message Passing Interface Forum, "MPI-2: Extensions to the Message-Passing Interface," July 1997.
[32] W. Gropp and R. Thakur, "Thread Safety in an MPI Implementation: Requirements and Analysis," Parallel Computing, vol. 33, no. 9, pp. 595-604, Sept. 2007.
[33] M. Valiev, E.J. Bylaska, N. Govind, K. Kowalski, T.P. Straatsma, H.J.J. van Dam, D. Wang, J. Nieplocha, E. Apra, T.L. Windus, and W.A. de Jong, "NWChem: A Comprehensive and Scalable Open-Source Solution for Large Scale Molecular Simulations," Computer Physics Comm., vol. 181, pp. 1477-1589, http://www.nwchem-sw.org, Oct. 2011.
[34] M.S. Gordon and M.W. Schmidt, "Advances in Electronic Structure Theory: GAMESS a Decade Later," Theory and Applications of Computational Chemistry, the First Forty Years, C.E. Dykstra, G. Frenking, K.S. Kim, and G.E. Scuseria, eds., Chapter 41, pp. 1167-1189, Elsevier, http://www.msg.chem.iastate.edu/GAMESS/GAMESS.html, Oct. 2011.
[35] H. Lin, X. Ma, W. Feng, and N. Samatova, "Coordinating Computation and I/O in Massively Parallel Sequence Search," IEEE Trans. Parallel and Distributed Systems, vol. 22, no. 4, pp. 529-543, http://www.mpiblast.org, Oct. 2011.
[36] M. Macedonia, "The GPU Enters Computing's Mainstream," Computer, vol. 36, no. 10, pp. 106-108, Oct. 2003.
[37] AMD Fusion, http://sites.amd.com/us/fusion/apu/Pages/fusion.aspx, Oct. 2011.
[38] Sandy Bridge, http://software.intel.com/en-us/articles/sandy-bridge/, Oct. 2011.
[39] I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Hanrahan, "Brook for GPUs: Stream Computing on Graphics Hardware," Proc. SIGGRAPH, 2004.
[40] W.R. Mark, R.S. Glanville, K. Akeley, and M.J. Kilgard, "Cg: A System for Programming Graphics Hardware in a C-Like Language," Proc. SIGGRAPH, 2003.
[41] CUDA Zone, http://www.nvidia.com/object/cuda_home_new.html, Oct. 2011.
[42] Khronos Group, http://www.khronos.org/opencl, Oct. 2011.
[43] Microsoft DirectX Developer Center, http://msdn.microsoft.com/en-us/directx/default, Oct. 2011.
[44] Sophisticated Library for Vector Parallelism: Intel Array Building Blocks, Intel; http://software.intel.com/en-us/articles/intel-array-building-blocks, 2010.
[45] Nvidia Developer Zone, http://developer.nvidia.com, Oct. 2011.
[46] Nvidia Company. Nvidia CUDA Programming Guide, v3.0, 2010.
[47] Nvidia Company. Nvidia CUDA C Programming Best Practices Guide, Version 3.0, 2010.
[48] M. Wolfe, "Compilers and More: Knights Ferry Versus Fermi," HPCwire, Aug. 2010.
[49] K. Skaugen, "Petascale to Exascale. Extending Intel's HPC Commitment," Proc. Int'l Supercomputing Conf. (ISC '10), 2010.
[50] OpenCL 1.1 Specification, http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf, Oct. 2011.
[51] Introduction to OpenCL, http://www.amd.com/us/products/technologies/stream-technology/opencl/pages/opencl-intro.aspx, Oct. 2011.
[52] W.D. Hillis and G.L. Steele, "Data Parallel Algorithms," Comm. ACM, vol. 29, pp. 1170-1183, 1986.
[53] M. Quinn, Parallel Programming in C with MPI and OpenMP. McGraw-Hill, 2004.
[54] OpenCL 1.1 C++ Bindings Specification, http://www.khronos.org/registry/cl/specs/opencl-cplusplus-1.1.pdf, Oct. 2011.
[55] Shader Model 5 (Microsoft MSDN), http://msdn.microsoft.com/en-us/library/ff471356(v=vs.85).aspx, Oct. 2011.
[56] A. Ghuloum et al., "Future-Proof Data Parallel Algorithms and Software on Intel Multi-Core Architecture," Intel Technology J., vol. 11, no. 4, pp. 333-347, 2007.
[57] W. Kim and M. Voss, "Multicore Desktop Programming with Intel Threading Building Blocks," IEEE Software, vol. 28, no. 1, pp. 23-31, Jan./Feb. 2011.
[58] Intel Threading Building Blocks, http://www.threadingbuildingblocks.org, Oct. 2011.
[59] J. Krüger and R. Westermann, "Linear Algebra Operators for GPU Implementation of Numerical Algorithms," ACM Trans. Graphics, vol. 22, pp. 908-916, 2003.
[60] D.C. Rapaport, "Enhanced Molecular Dynamics Performance with a Programmable Graphics Processor," Computer Physics Comm., vol. 182, pp. 926-934, 2011.
[61] F. Xu and K. Mueller, "Accelerating Popular Tomographic Reconstruction Algorithms on Commodity PC Graphics Hardware," IEEE Trans. Nuclear Science, vol. 52, pp. 654-663, June 2005.
[62] S.A. Manavski and G. Valle, "CUDA Compatible GPU Cards as Efficient Hardware Accelerators for Smith-Waterman Sequence Alignment," BMC Bioinformatics, vol. 9, no. 2, pp. 1-9, Mar. 2008.
[63] J.D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Krüger, A.E. Lefohn, and T.J. Purcell, "A Survey of General-Purpose Computation on Graphics Hardware," Computer Graphics Forum, vol. 26, no. 1, pp. 80-113, 2007.
[64] J.D. Owens, M. Houston, D. Luebke, S. Green, J.E. Stone, and J.C. Phillips, "GPU Computing," Proc. IEEE, vol. 96, no. 5, pp. 879-899, May 2008.
[65] J. Protic, M. Tomasevic, and V. Milutinovic, "A Survey of Distributed Shared Memory Systems," Proc. 28th Hawaii Int'l Conf. System Sciences (HICSS '95), pp. 74-84, 1995.
[66] C. Coarfa, Y. Dotsenko, J. Mellor-Crummey, F. Cantonnet, T. El-Ghazawi, A. Mohanty, and Y. Yao, "An Evaluation of Global Address Space Languages: Co-Array Fortran and Unified Parallel C," Proc. 10th ACM SIGPLAN Symp. Principles and Practice of Parallel Programming, pp. 36-47, 2005.
[67] V. Saraswat, G. Almasi, G. Bikshandi, C. Cascaval, D. Grove, D. Cunningham, O. Tardieu, I. Peshansky, and S. Kodali, "The Asynchronous Partitioned Global Address Space Model," Proc. First Workshop Advances in Message Passing, 2010.
[68] DARPA's, http://www.darpa.mil/Our_Work/MTO/Programs/High_Productivity_Computing_Systems_(HPCS).aspx, Oct. 2011.
[69] D. Bonachea and J. Jeong, "GASNet: A Portable High-Performance Communication Layer for Global Address-Space Languages," CS258 Parallel Computer Architecture Project, 2002.
[70] GASNet, http://gasnet.cs.berkeley.edu/, Oct. 2011.
[71] A. Mainwaring and D. Culler, "Active Messages: Organization and Applications Programming Interface," technical report, UC Berkeley, 1995.
[72] J. Nieplocha and B. Carpenter, "ARMCI: A Portable Remote Memory Copy Library for Distributed Array Libraries and Compiler Run-Time Systems," Proc. Third Workshop Runtime Systems for Parallel Programming (RTSPP) of IPPS/SPDP '99, 1999.
[73] J. Nieplocha, V. Tipparaju, M. Krishnan, and D. Panda, "High Performance Remote Memory Access Communications: The ARMCI Approach," Int'l J. High Performance Computing and Applications, vol. 20, pp. 233-253, 2006.
[74] Aggregate Remote Memory Copy Interface, http://www.emsl.pnl.gov/docs/parsoft/armci, Oct. 2011.
[75] The KeLP Programming System, http://cseweb.ucsd.edu/groups/hpcl/scg/kelp, Oct. 2011.
[76] S.J. Fink, S.R. Kohn, and S.B. Baden, "Efficient Run-Time Support for Irregular Block-Structured Applications," J. Parallel and Distributed Computing, vol. 50, pp. 61-82, 1998.
[77] W. Carlson, J. Draper, D. Culler, K. Yelick, E. Brooks, and K. Warren, "Introduction to UPC and Language Specification," Technical Report CCS-TR-99-157, IDA Center for Computing Sciences, 1999.
[78] T. El-Ghazawi, W. Carlson, T. Sterling, and K. Yelick, UPC: Distributed Shared Memory Programming. John Wiley and Sons, 2005.
[79] Unified Parallel C, http://upc.gwu.edu, Oct. 2011.
[80] R.W. Numrich and J.K. Reid, "Co-Arrays in the Next Fortran Standard," ACM SIGPLAN Fortran Forum, vol. 24, pp. 4-17, 2005.
[81] Co-Array Fortran, http://www.co-array.org, Apr. 2011.
[82] http://www.nag.co.uk/sc22wg5/, Oct. 2011.
[83] J. Reid, "Coarrays in the Next Fortran Standard," ACM SIGPLAN Fortran Forum, vol. 29, no. 2, pp. 10-27, 2010.
[84] Titanium, http://titanium.cs.berkeley.edu, Oct. 2011.
[85] K.A. Yelick, L. Semenzato, G. Pike, C. Miyamoto, B. Liblit, A. Krishnamurthy, P.N. Hilfinger, S.L. Graham, D. Gay, P. Colella, and A. Aiken, "Titanium: A High-Performance Java Dialect," Proc. ACM Workshop Java for High-Performance Network Computing, 1998.
[86] X10 Language, http://x10.codehaus.org, Oct. 2011.
[87] J. Muttersbach, T. Villiger, and W. Fichtner, "Practical Design of Globally-Asynchronous Locally-Synchronous Systems," Proc. Sixth Int'l Symp. Advanced Research in Asynchronous Circuits and Systems (ASYNC '00), pp. 52-59, 2000.
[88] M. Weiland, "Chapel, Fortress and X10: Novel Languages for HPC," technical report from the HPCx Consortium, 2007.
[89] Chapel Language, http://chapel.cray.com, Oct. 2011.
[90] D. Callahan, B.L. Chamberlain, and H.P. Zima, "The Cascade High Productivity Language," Proc. Ninth Int'l Workshop High-Level Parallel Programming Models and Supportive Environments (HIPS), pp. 52-60, 2004.
[91] Project Fortress, http://projectfortress.java.net, Oct. 2011.
[92] G. Steele, "Fortress: A New Programming Language for Scientific Computing," Sun Labs Open House, 2005.
[93] T. Sterling, P. Messina, and P.H. Smith, Enabling Technologies for Petaflops Computing. MIT Press, 1995.
[94] C. Wright, "Hybrid Programming Fun: Making Bzip2 Parallel with MPICH2 & pthreads on the Cray XD1," Proc. CUG, 2006.
[95] P. Johnson, "Pthread Performance in an MPI Model for Prime Number Generation," CSCI 4576 - High-Performance Scientific Computing, Univ. of Colorado, 2007.
[96] W. Pfeiffer and A. Stamatakis, "Hybrid MPI/Pthreads Parallelization of the RAxML Phylogenetics Code," Proc. Ninth IEEE Int'l Workshop High Performance Computational Biology, Apr. 2010.
[97] L. Smith and M. Bull, "Development of Mixed Mode MPI/OpenMP Applications," Proc. Workshop OpenMP Applications and Tools (WOMPAT '00), July 2000.
[98] R. Rabenseifner, "Hybrid Parallel Programming on HPC Platforms," Proc. European Workshop OpenMP (EWOMP '03), 2003.
[99] B. Estrade, "Hybrid Programming with MPI and OpenMP," Proc. High Performance Computing Workshop, 2009.
[100] S. Bova, C. Breshears, R. Eigenmann, H. Gabb, G. Gaertner, B. Kuhn, B. Magro, S. Salvini, and V. Vatsa, "Combining Message-Passing and Directives in Parallel Applications," SIAM News, vol. 32, no. 9, pp. 10-14, 1999.
[101] I.J. Bush, C.J. Noble, and R.J. Allan, "Mixed OpenMP and MPI for Parallel Fortran Applications," Proc. Second European Workshop OpenMP, 2000.
[102] P. Luong, C.P. Breshears, and L.N. Ly, "Coastal Ocean Modeling of the U.S. West Coast with Multiblock Grid and Dual-Level Parallelism," Proc. ACM/IEEE Conf. Supercomputing'01, 2001.
[103] R.D. Loft, S.J. Thomas, and J.M. Dennis, "Terascale Spectral Element Dynamical Core for Atmospheric General Circulation Models," Proc. ACM/IEEE Conf. Supercomputing'01, 2001.
[104] K. Nakajima, "Parallel Iterative Solvers for Finite-Element Methods Using an OpenMP/MPI Hybrid Programming Model on the Earth Simulator," Parallel Computing, vol. 31, pp. 1048-1065, 2005.
[105] R. Aversa, B. Di Martino, M. Rak, S. Venticinque, and U. Villano, "Performance Prediction through Simulation of a Hybrid MPI/OpenMP Application," Parallel Computing, vol. 31, pp. 1013-1033, 2005.
[106] F. Cappello and D. Etiemble, "MPI Versus MPI+OpenMP on the IBM SP for the NAS Benchmarks," Proc. Conf. High Performance Networking and Computing, 2000.
[107] J. Duthie, M. Bull, A. Trew, and L. Smith, "Mixed Mode Applications on HPCx," Technical Report HPCxTR0403, HPCx Consortium, 2004.
[108] L. Smith, "Mixed Mode MPI/OpenMP Programming," Technical Report Technology Watch 1, UK High-End Computing, EPCC, United Kingdom, 2000.
[109] D.S. Henty, "Performance of Hybrid Message-Passing and Shared-Memory Parallelism for Discrete Element Modeling," Proc. ACM/IEEE Conf. Supercomputing'00, 2000.
[110] E. Chow and D. Hysom, "Assessing Performance of Hybrid MPI/OpenMP Programs on SMP Clusters," Technical Report UCRL-JC-143957, Lawrence Livermore Nat'l Laboratory, 2001.
[111] J.C. Thibault and I. Senocak, "CUDA Implementation of a Navier-Stokes Solver on Multi-GPU Desktop Platforms for Incompressible Flows," Proc. 47th AIAA Aerospace Sciences Meeting, 2010.
[112] S. Jun Park and D. Shires, "Central Processing Unit/Graphics Processing Unit (CPU/GPU) Hybrid Computing of Synthetic Aperture Radar Algorithm," Technical Report ARL-TR-5074, US Army Research Laboratory, 2010.
[113] H. Jang, A. Park, and K. Jung, "Neural Network Implementation Using CUDA and OpenMP," Proc. Digital Image Computing: Techniques and Applications, pp. 155-161, 2008.
[114] G. Sims, "Parallel Cloth Simulation Using OpenMP and CUDA," thesis dissertation, Graduate Faculty of the Louisiana State Univ. and Agricultural and Mechanical College, 2009.
[115] Y. Wang, Z. Feng, H. Guo, C. He, and Y. Yang, "Scene Recognition Acceleration Using CUDA and OpenMP," Proc. First Int'l Conf. Information Science and Eng. (ICISE '09), 2009.
[116] Q. Chen and J. Zhang, "A Stream Processor Cluster Architecture Model with the Hybrid Technology of MPI and CUDA," Proc. First Int'l Conf. Information Science and Eng. (ICISE '09), 2009.
[117] J.C. Phillips, J.E. Stone, and K. Schulten, "Adapting a Message-Driven Parallel Application to GPU-Accelerated Clusters," Proc. ACM/IEEE Conf. Supercomputing, 2008.
[118] H. Schive, C. Chien, S. Wong, Y. Tsai, and T. Chiueh, "Graphic-Card Cluster for Astrophysics (GraCCA) - Performance Tests," New Astronomy, vol. 13, no. 6, pp. 418-435, 2008.
[119] D.A. Jacobsen, J.C. Thibault, and I. Senocak, "An MPI-CUDA Implementation for Massively Parallel Incompressible Flow Computations on Multi-GPU Clusters," Proc. 48th AIAA Aerospace Sciences Meeting, 2010.
[120] N.P. Karunadasa and D.N. Ranasinghe, "On the Comparative Performance of Parallel Algorithms on Small GPU/CUDA Clusters," Proc. Int'l Conf. High Performance Computing, 2009.
[121] V. Strassen, "Gaussian Elimination Is Not Optimal," Numerische Mathematik, vol. 13, pp. 354-356, 1969.
[122] M.R. Hestenes and E. Stiefel, "Methods of Conjugate Gradients for Solving Linear Systems," J. Research of the Nat'l Bureau of Standards, vol. 49, no. 6, pp. 409-436, 1952.
[123] A.E. Walsh, J. Couch, and D.H. Steinberg, Java 2 Bible. Wiley Publishing, 2000.
[124] B. Amedro, V. Bodnartchouk, D. Caromel, C. Delbé, F. Huet, and G.L. Taboada, "Current State of Java for HPC," Technical Report RT-0353, INRIA, 2008.
[125] NAS Parallel Benchmarks, http://www.nas.nasa.gov/Resources/Software/npb.html, Oct. 2011.
[126] R.V. Nieuwpoort, J. Maassen, G. Wrzesinska, R. Hofman, C. Jacobs, T. Kielmann, and H.E. Bal, "Ibis: A Flexible and Efficient Java Based Grid Programming Environment," Concurrency and Computation: Practice and Experience, vol. 17, pp. 1079-1107, 2005.
[127] G.L. Taboada, J. Touriño, and R. Doallo, "Java for High Performance Computing: Assessment of Current Research and Practice," Proc. Seventh Int'l Conf. Principles and Practice of Programming in Java (PPPJ '09), pp. 30-39, 2009.
[128] A. Shafi, B. Carpenter, M. Baker, and A. Hussain, "A Comparative Study of Java and C Performance in Two Large-Scale Parallel Applications," Concurrency and Computation: Practice & Experience, vol. 15, no. 21, pp. 1882-1906, 2010.
[129] B. Blount and S. Chatterjee, "An Evaluation of Java for Numerical Computing," Scientific Programming, vol. 7, no. 2, pp. 97-110, 1999.
[130] Java Grande Forum, http://www.javagrande.org/pastglory/index.html, Oct. 2011.
[131] M. Baker, B. Carpenter, S.H. Ko, and X. Li, "mpiJava: A Java Interface to MPI," Proc. First UK Workshop Java for High Performance Network Computing, 1998.
[132] A. Shafi, B. Carpenter, and M. Baker, "Nested Parallelism for Multi-Core HPC Systems Using Java," J. Parallel and Distributed Computing, vol. 69, pp. 532-545, 2009.
[133] G.L. Taboada, S. Ramos, J. Touriño, and R. Doallo, "Design of Efficient Java Message-Passing Collectives on Multi-Core Clusters," J. Supercomputing, vol. 55, pp. 126-154, 2011.
[134] High Performance Fortran, http://hpff.rice.edu/index.htm, Oct. 2011.
[135] H. Richardson, "High Performance Fortran: History, Overview and Current Developments," Technical Report TMC-261, Thinking Machines Corporation, 1996.
[136] C.H.Q. Ding, "High Performance Fortran for Practical Scientific Algorithms: An Up-to-Date Evaluation," Future Generation Computer Systems, vol. 15, pp. 343-352, 1999.
[137] R.D. Blumofe, C.F. Joerg, B.C. Kuszmaul, C.E. Leiserson, K.H. Randall, and Y. Zhou, "Cilk: An Efficient Multithreaded Runtime System," J. Parallel and Distributed Computing, vol. 37, pp. 55-69, 1996.
[138] Cilk Project, http://supertech.csail.mit.edu/cilk, Oct. 2011.
[139] Intel Cilk Plus, http://software.intel.com/en-us/articles/intel-cilk-plus, Oct. 2011.
[140] B.L. Chamberlain, S.-E. Choi, E.C. Lewis, C. Lin, L. Snyder, and W.D. Weathersby, "ZPL: A Machine Independent Programming Language for Parallel Computers," IEEE Trans. Software Eng., vol. 26, no. 3, pp. 197-211, Mar. 2000.
[141] L. Snyder, "The Design and Development of ZPL," Proc. Third ACM SIGPLAN History of Programming Languages Conf., June 2007.
[142] ZPL Web, http://www.cs.washington.edu/research/zpl/home/index.html, Oct. 2011.
[143] H. Wu, G. Turkiyyah, and W. Keirouz, "ZPLCLAW: A Parallel Portable Toolkit for Wave Propagation Problems," Proc. Am. Soc. of Civil Eng. (ASCE) Structures Congress, 2000.
[144] Erlang, http://www.erlang.org, Oct. 2011.
[145] S. Vinoski, "Reliability with Erlang," IEEE Internet Computing, vol. 11, no. 6, pp. 79-81, Nov./Dec. 2007.
[146] P.W. Trinder, K. Hammond, H.-W. Loidl, and S.L. Jones, "Algorithm+Strategy=Parallelism," J. Functional Programming, vol. 8, no. 1, pp. 23-60, 1998.
[147] S. Marlow, S.P. Jones, and S. Singh, "Runtime Support for Multicore Haskell," ACM SIGPLAN Notices - ICFP '09, vol. 44, no. 9, pp. 65-78, 2009.
[148] A.S. Tanenbaum and M. van Steen, Distributed Systems: Principles and Paradigms, second ed. Prentice Hall, 2007.
[149] I. Foster and C. Kesselman, The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 1998.
[150] B. Wilkinson, Grid Computing. Chapman & Hall/CRC, 2010.
[151] gLite, http://glite.cern.ch, Oct. 2011.
[152] EGEE, http://www.eu-egee.org, Oct. 2011.
[153] S. Reyes, C. Muñoz-Caro, A. Niño, R.M. Badia, and J.M. Cela, "Performance of Computationally Intensive Parameter Sweep Applications on Internet-Based Grids of Computers: the Mapping of Molecular Potential Energy Hypersurfaces," Concurrency and Computation: Practice and Experience, vol. 19, pp. 463-481, 2007.
[154] C. Sun, B. Kim, G. Yi, and H. Park, "A Model of Problem Solving Environment for Integrated Bioinformatics Solution on Grid by Using Condor," Proc. Int'l Conf. Grid and Cooperative Computing (GCC), pp. 935-938, 2004.
[155] Large Hadron Collider (LHC) Computing Grid Project for High Energy Physics Data Analysis, http://lcg.web.cern.ch/LCG, Oct. 2011.
[156] OMG, http://www.omg.org/, Oct. 2011.
[157] A. Birrell and B. Nelson, "Implementing Remote Procedure Calls," ACM Trans. Computer Systems, vol. 2, no. 1, pp. 39-59, 1984.
[158] S. Vinoski, "CORBA: Integrating Diverse Applications within Distributed Heterogeneous Environments," IEEE Comm. Magazine, vol. 35, no. 2, pp. 46-55, Feb. 1997.
[159] M. Henning, "The Rise and Fall of CORBA," ACM Queue, vol. 4, pp. 28-34, June 2006.
[160] Y. Gong, "CORBA Application in Real-Time Distributed Embedded Systems," Survey Report, ECE 8990 Real-Time Systems Design, 2003.
[161] CORBA/e, http://www.corba.org/corba-e/index.htm, Oct. 2011.
[162] COM, http://www.microsoft.com/com/default.mspx, Oct. 2011.
[163] ComSource, http://www.opengroup.org/comsource, Oct. 2011.
[164] P. Emerald, C. Yennun, H.S. Yajnik, D. Liang, J.C. Shih, C.Y. Wang, and Y.M. Wang, "DCOM and CORBA Side by Side, Step by Step, and Layer by Layer," C++ Report, vol. 10, no. 1, pp. 18-29, 1998.
[165] G. Alonso, F. Casati, H. Kuno, and V. Machiraju, Web Services: Concepts, Architectures and Applications. Springer-Verlag, 2004.
[166] A. Gokhale, B. Kumar, and A. Sahuguet, "Reinventing the Wheel? CORBA vs. Web Services," Proc. Conf. World Wide Web (WWW '02), 2002.
[167] SOAP, http://www.w3.org/standards/techs/soap#w3c_all, Apr. 2011.
[168] WSDL, http://www.w3.org/TR/wsdl20/, Oct. 2011.
[169] E. Cerami, Web Services Essentials: Distributed Applications with XML-RPC, SOAP, UDDI & WSDL. O'Reilly, 2002.
[170] http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=uddi-spec, Oct. 2011.
[171] http://glassfish.java.net, Oct. 2011.
[172] http://www.jboss.org, Oct. 2011.
[173] http://geronimo.apache.org, Oct. 2011.
[174] http://tomcat.apache.org, Oct. 2011.
[175] W.W. Eckerson, "Three Tier Client/Server Architecture: Achieving Scalability, Performance, and Efficiency in Client Server Applications," Open Information Systems, vol. 10, no. 1, 1995.
[176] www.oracle.com/technetwork/java/javaee/ejb/index.html, Oct. 2011.
[177] Workflows for e-Science, I.J. Taylor, E. Deelman, D.B. Gannon, and M. Shields, eds. Springer-Verlag, 2007.
[178] EMBRACE Service Registry: www.embraceregistry.net, Oct. 2011.
[179] A. Sahai, S. Graupner, and W. Kim, "The Unfolding of the Web Services Paradigm," Technical Report HPL-2002-130, Hewlett-Packard, 2002.
[180] T. Erl, Service-Oriented Architecture: Concepts, Technology, and Design. Prentice-Hall, 2005.
[181] S. Mulik, S. Ajgaonkar, and K. Sharma, "Where Do You Want to Go in Your SOA Adoption Journey?," IT Professional, vol. 10, no. 3, pp. 36-39, May/June 2008.
[182] J. McGovern, S. Tyagi, M. Stevens, and S. Mathew, "Service Oriented Architecture," Java Web Services Architecture, Chapter 2, Morgan Kaufmann, 2003.
[183] R.T. Fielding, "Architectural Styles and the Design of Network-Based Software Architectures," PhD dissertation, Univ. of California, Irvine, 2000.
[184] R.T. Fielding and R.N. Taylor, "Principled Design of the Modern Web Architecture," ACM Trans. Internet Technology, vol. 2, no. 2, pp. 115-150, May 2002.
[185] S. Vinoski, "REST Eye for the SOA Guy," IEEE Internet Computing, vol. 11, no. 1, pp. 82-84, Jan./Feb. 2007.
[186] ZeroC Ice, www.zeroc.com/ice.html, Oct. 2011.
[187] M. Henning and M. Spruiell, Distributed Programming with Ice. ZeroC, 2003, www.zeroc.com/Ice-Manual.pdf, Oct. 2011.
[188] M. Henning, "A New Approach to Object-Oriented Middleware," IEEE Internet Computing, vol. 8, no. 1, pp. 66-75, Jan./Feb. 2004.
[189] Scopus, http://www.scopus.com/home.url, Oct. 2011.
[190] "A Call to Arms for Parallel Programming Standards," HPCWire, SC10 Features, Nov. 2010.

