Issue No.05 - September/October (2010 vol.30)
pp: 16-29
Alex Ramirez, Barcelona Supercomputing Center
Felipe Cabarcas, Barcelona Supercomputing Center
Ben Juurlink, Technische Universität Berlin
Mauricio Alvarez Mesa, Universitat Politècnica de Catalunya
Friman Sanchez, Universitat Politècnica de Catalunya
Arnaldo Azevedo, Delft University of Technology
Cor Meenderinck, Delft University of Technology
Catalin Ciobanu, Delft University of Technology
Sebastian Isaza, Delft University of Technology
Georgi Gaydadjiev, Delft University of Technology
ABSTRACT
The SARC architecture is composed of multiple processor types and a set of user-managed direct memory access (DMA) engines that let the runtime scheduler overlap data transfer and computation. The runtime system automatically allocates tasks on the heterogeneous cores and schedules the data transfers through the DMA engines. SARC's programming model supports various highly parallel applications, with matching support from specialized accelerator processors.
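The overlap of data transfer and computation described in the abstract can be illustrated with a generic double-buffering sketch. This is not the SARC runtime or its API. The names `dma_fetch`, `compute`, and `run_double_buffered` are hypothetical, and a single worker thread stands in for one DMA engine. The only point shown is that the transfer of chunk i+1 is issued before the computation on chunk i begins.

```python
from concurrent.futures import ThreadPoolExecutor


def dma_fetch(chunk):
    """Hypothetical stand-in for a DMA transfer: copy a chunk into 'local' memory."""
    return list(chunk)


def compute(local_chunk):
    """Hypothetical stand-in for a task running on an accelerator core."""
    return sum(x * x for x in local_chunk)


def run_double_buffered(chunks):
    """Process chunks in order, prefetching chunk i+1 while chunk i is computed."""
    results = []
    if not chunks:
        return results
    # Assumption: one worker thread plays the role of a single DMA engine.
    with ThreadPoolExecutor(max_workers=1) as dma:
        pending = dma.submit(dma_fetch, chunks[0])
        for i in range(len(chunks)):
            local = pending.result()  # wait for the transfer of chunk i
            if i + 1 < len(chunks):
                # Issue the next transfer before computing on the current chunk.
                pending = dma.submit(dma_fetch, chunks[i + 1])
            results.append(compute(local))
    return results


totals = run_double_buffered([[1, 2], [3, 4], [5]])
```

In a real system the scheduler, not the application, would decide when each transfer is issued; the loop above only makes the ordering explicit.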
INDEX TERMS
multicore, heterogeneous architecture, accelerator, programming model
CITATION
Alex Ramirez, Felipe Cabarcas, Ben Juurlink, Mauricio Alvarez Mesa, Friman Sanchez, Arnaldo Azevedo, Cor Meenderinck, Catalin Ciobanu, Sebastian Isaza, Georgi Gaydadjiev, "The SARC Architecture", IEEE Micro, vol.30, no. 5, pp. 16-29, September/October 2010, doi:10.1109/MM.2010.79