Costas Kyriacou, Paraskevas Evripidou, Pedro Trancoso, "Data-Driven Multithreading Using Conventional Microprocessors," IEEE Transactions on Parallel and Distributed Systems, vol. 17, no. 10, pp. 1176-1188, October 2006.
Keywords: Dataflow, multithreading, nonblocking threads, cache prefetching, multiprocessors, network of workstations, high performance computing.
DOI: http://doi.ieeecomputersociety.org/10.1109/TPDS.2006.136
ISSN: 1045-9219. Publisher: IEEE Computer Society, Los Alamitos, CA, USA.
Abstract: This paper describes the Data-Driven Multithreading (DDM) model and how it may be implemented using off-the-shelf microprocessors. Data-Driven Multithreading is a nonblocking multithreading execution model that tolerates internode latency by scheduling threads for execution based on data availability. Scheduling based on data availability can also be used to exploit cache management policies that significantly reduce cache misses. Such policies include firing a thread for execution only if its data has already been placed in the cache. We call this cache management policy the CacheFlow policy. The core of the DDM implementation presented is a memory-mapped hardware module that is attached directly to the processor's bus. This module, known as the Thread Synchronization Unit (TSU), is responsible for thread scheduling. The evaluation of DDM was performed using simulation of the Data-Driven Network of Workstations (
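The scheduling idea in the abstract can be illustrated with a small software sketch. It is not the paper's TSU, which is a hardware module; it is a toy model written under stated assumptions. The names `DThread`, `ToyTSU`, `ready_count`, `consumers`, and `data_blocks` are hypothetical, the cache is modeled as a plain set of block names, and the prefetch step is an assumed way of guaranteeing that a thread's data is cached before it fires.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class DThread:
    """A nonblocking thread: it runs to completion once its inputs exist."""
    name: str
    body: Callable[[], None]
    ready_count: int                                    # producers still pending
    consumers: List[str] = field(default_factory=list)  # threads fed by this one
    data_blocks: Set[str] = field(default_factory=set)  # blocks the body reads


class ToyTSU:
    """Hypothetical software stand-in for a thread scheduling unit."""

    def __init__(self, threads: List[DThread]) -> None:
        self.threads: Dict[str, DThread] = {t.name: t for t in threads}
        self.cache: Set[str] = set()   # toy cache: names of resident blocks
        self.prefetches = 0
        # Threads with no pending producers are ready from the start.
        self.ready = deque(t.name for t in threads if t.ready_count == 0)

    def _prefetch(self, thread: DThread) -> None:
        """Assumed policy step: load missing blocks before the thread fires."""
        missing = thread.data_blocks - self.cache
        self.prefetches += len(missing)
        self.cache |= missing

    def run(self) -> List[str]:
        fired: List[str] = []
        while self.ready:
            thread = self.threads[self.ready.popleft()]
            self._prefetch(thread)
            # Fire only when every block the thread reads is in the cache.
            assert thread.data_blocks <= self.cache
            thread.body()
            fired.append(thread.name)
            # Completion makes data available: update each consumer's count.
            for name in thread.consumers:
                consumer = self.threads[name]
                consumer.ready_count -= 1
                if consumer.ready_count == 0:
                    self.ready.append(name)
        return fired


def demo() -> Tuple[List[str], Dict[str, int], int]:
    values: Dict[str, int] = {}
    threads = [
        DThread("sum", lambda: values.update(s=values["a"] + values["b"]),
                2, [], {"a", "b"}),
        DThread("load_a", lambda: values.update(a=2), 0, ["sum"]),
        DThread("load_b", lambda: values.update(b=3), 0, ["sum"]),
    ]
    tsu = ToyTSU(threads)
    order = tsu.run()
    return order, values, tsu.prefetches
```

In `demo()`, the `sum` thread is listed first but cannot fire until both producers have completed, so execution order follows data availability rather than program order. No thread ever blocks waiting for an operand, which is the property the abstract describes as nonblocking.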

