The Community for Technology Leaders
RSS Icon
Issue No.01 - January-March (2009 vol.6)
pp: 18-31
Albert Meixner , Duke University, Durham
Daniel J. Sorin , Duke University, Durham
Multithreaded servers with cache-coherent shared memory are the dominant type of machines used to run critical network services and database management systems. To achieve the high availability required for these tasks, it is necessary to incorporate mechanisms for error detection and recovery. Correct operation of the memory system is defined by the memory consistency model. Errors can therefore be detected by checking if the observed memory system behavior deviates from the specified consistency model. Based on recent work, we design a framework for dynamic verification of memory consistency (DVMC). The framework consists of mechanisms to verify three invariants that are proven to guarantee that a specified memory consistency model is obeyed. We describe an implementation of the framework for the SPARCv9 architecture, and we experimentally evaluate its performance using full-system simulation of commercial workloads.
Error-checking, Multi-core/single-chip multiprocessors
Albert Meixner, Daniel J. Sorin, "Dynamic Verification of Memory Consistency in Cache-Coherent Multithreaded Computer Architectures", IEEE Transactions on Dependable and Secure Computing, vol.6, no. 1, pp. 18-31, January-March 2009, doi:10.1109/TDSC.2007.70243
[1] S.V. Adve and K. Gharachorloo, “Shared Memory Consistency Models: A Tutorial,” Computer, vol. 29, no. 12, pp. 66-76, Dec. 1996.
[2] A.R. Alameldeen, M.M. Martin, C.J. Mauer, K.E. Moore, M. Xu, M.D. Hill, D.A. Wood, and D.J. Sorin, “Simulating a $2M Commercial Server on a $2K PC,” Computer, vol. 36, no. 2, pp.50-57, Feb. 2003.
[3] T.M. Austin, “DIVA: A Reliable Substrate for Deep Submicron Microarchitecture Design,” Proc. 32nd Ann. IEEE/ACM Int'l Symp. Microarchitecture (MICRO '99), pp. 196-207, Nov. 1999.
[4] P. Barford and M. Crovella, “Generating Representative Web Workloads for Network and Server Performance Evaluation,” Proc. ACM Sigmetrics '98, pp. 151-160, June 1998.
[5] H.W. Cain and M.H. Lipasti, “Verifying Sequential Consistency Using Vector Clocks,” Proc. 14th ACM Symp. Parallel Algorithms and Architectures (SPAA '02), Aug. 2002.
[6] H.W. Cain and M.H. Lipasti, “Memory Ordering: A Value-Based Approach,” Proc. 31st Ann. Int'l Symp. Computer Architecture (ISCA'04), June 2004.
[7] J.F. Cantin, M.H. Lipasti, and J.E. Smith, “Dynamic Verification of Cache Coherence Protocols,” Proc. Second Workshop Memory Performance Issues (WPI '01), June 2001.
[8] S. Chatterjee, C. Weaver, and T. Austin, “Efficient Checker Processor Design,” Proc. 33rd Ann. IEEE/ACM Int'l Symp. Microarchitecture (MICRO '00), pp. 87-97, Dec. 2000.
[9] K. Gharachorloo, A. Gupta, and J. Hennessy, “Two Techniques to Enhance the Performance of Memory Consistency Models,” Proc. Int'l Conf. Parallel Processing (ICPP '91), vol. I, pp. 355-364, Aug. 1991.
[10] K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J. Hennessy, “Memory Consistency and Event Ordering in Scalable Shared-Memory,” Proc. 17th Ann. Int'l Symp. Computer Architecture (ISCA '90), pp. 15-26, May 1990.
[11] P.B. Gibbons and E. Korach, “Testing Shared Memories,” SIAM J. Computing, vol. 26, no. 4, pp. 1208-1244, Aug. 1997.
[12] M.D. Hill, A.E. Condon, M. Plakal, and D.J. Sorin, “A System-Level Specification Framework for I/O Architectures,” Proc. 11thACM Symp. Parallel Algorithms and Architectures (SPAA '99), pp. 138-147, June 1999.
[13] International Technology Roadmap for Semiconductors, 2003.
[14] L. Lamport, “Time, Clocks and the Ordering of Events in a Distributed System,” Comm. ACM, vol. 21, no. 7, pp. 558-565, July 1978.
[15] L. Lamport, “How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs,” IEEE Trans. Computers, vol. 28, no. 9, pp. 690-691, Sept. 1979.
[16] P.S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hållberg, J. Högberg, F. Larsson, A. Moestedt, and B. Werner, “Simics: A Full System Simulation Platform,” Computer, vol. 35, no. 2, pp. 50-58, Feb. 2002.
[17] M.M. Martin, D.J. Sorin, B.M. Beckmann, M.R. Marty, M. Xu, A.R. Alameldeen, K.E. Moore, M.D. Hill, and D.A. Wood, “Multifacet's General Execution-Driven Multiprocessor Simulator (GEMS) Toolset,” Computer Architecture News, vol. 33, no. 4, pp.92-99, Sept. 2005.
[18] C.J. Mauer, M.D. Hill, and D.A. Wood, “Full System Timing-First Simulation,” Proc. ACM Sigmetrics '02, pp. 108-116, June 2002.
[19] A. Meixner and D.J. Sorin, “Dynamic Verification of Sequential Consistency,” Proc. 32nd Ann. Int'l Symp. Computer Architecture (ISCA '05), pp. 482-493, June 2005.
[20] S.S. Mukherjee, M. Kontz, and S.K. Reinhardt, “Detailed Design and Implementation of Redundant Multithreading Alternatives,” Proc. 29th Ann. Int'l Symp. Computer Architecture (ISCA '02), pp. 99-110, May 2002.
[21] N. Oh, P.P. Shirvani, and E.J. McCluskey, “Error Detection by Duplicated Instructions in Super-Scalar Processors,” IEEE Trans. Reliability, vol. 51, no. 1, pp. 63-74, Mar. 2002.
[22] M. Plakal, D.J. Sorin, A.E. Condon, and M.D. Hill, “Lamport Clocks: Verifying a Directory Cache-Coherence Protocol,” Proc. 10th ACM Symp. Parallel Algorithms and Architectures (SPAA '98), pp. 67-76, June 1998.
[23] F. Pong and M. Dubois, “Verification Techniques for Cache Coherence Protocols,” ACM Computing Surveys, vol. 29, no. 1, pp.82-126, Mar. 1997.
[24] F. Pong and M. Dubois, “Formal Automatic Verification of CacheCoherence in Multiprocessors with Relaxed Memory Models,” IEEE Trans. Parallel and Distributed Systems, vol. 11, no. 9, pp. 989-1006, Sept. 2000.
[25] M. Prvulovic, Z. Zhang, and J. Torrellas, “ReVive: Cost-Effective Architectural Support for Rollback Recovery in Shared-Memory Multiprocessors,” Proc. 29th Ann. Int'l Symp. Computer Architecture (ISCA '02), pp. 111-122, May 2002.
[26] R. Rajwar, A. Kägi, and J.R. Goodman, “Improving the Throughputof Synchronization by Insertion of Delays,” Proc. Sixth IEEE Symp. High-Performance Computer Architecture (HPCA'00), pp. 168-179, Jan. 2000.
[27] S.K. Reinhardt and S.S. Mukherjee, “Transient Fault Detection viaSimultaneous Multithreading,” Proc. 27th Ann. Int'l Symp. Computer Architecture (ISCA '00), pp. 25-36, June 2000.
[28] E. Rotenberg, “AR-SMT: A Microarchitectural Approach to FaultTolerance in Microprocessors,” Proc. 29th IEEE Int'l Symp. Fault-Tolerant Computing (FTCS '99), pp. 84-91, June 1999.
[29] J.H. Saltzer, D.P. Reed, and D.D. Clark, “End-to-End Arguments in Systems Design,” ACM Trans. Computer Systems, vol. 2, no. 4, pp. 277-288, Nov. 1984.
[30] P. Shivakumar, M. Kistler, S.W. Keckler, D. Burger, and L. Alvisi, “Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic,” Proc. Int'l Conf. Dependable Systems and Networks (DSN '02), June 2002.
[31] D.J. Sorin, M.D. Hill, and D.A. Wood, “Dynamic Verification of End-to-End Multiprocessor Invariants,” Proc. Int'l Conf. Dependable Systems and Networks (DSN '03), pp. 281-290, June 2003.
[32] D.J. Sorin, M.M. Martin, M.D. Hill, and D.A. Wood, “SafetyNet: Improving the Availability of Shared Memory Multiprocessors with Global Checkpoint/Recovery,” Proc. 29th Ann. Int'l Symp. Computer Architecture (ISCA '02), pp. 123-134, May 2002.
[33] J. Srinivasan, S.V. Adve, P. Bose, and J.A. Rivers, “The Impact of Technology Scaling on Lifetime Reliability,” Proc. Int'l Conf. Dependable Systems and Networks (DSN '04), June 2004.
[34] D.M. Tullsen, S.J. Eggers, J.S. Emer, H.M. Levy, J.L. Lo, and R.L. Stamm, “Exploiting Choice: Instruction Fetch and Issue on an Implementable Simultaneous Multithreading Processor,” Proc. 23rd Ann. Int'l Symp. Computer Architecture (ISCA '96), pp. 191-202, May 1996.
[35] SPARC Architecture Manual (Version 9), D.L. Weaver and T. Germond, eds. PTR Prentice Hall, 1994.
[36] S.C. Woo, M. Ohara, E. Torrie, J.P. Singh, and A. Gupta, “TheSPLASH-2 Programs: Characterization and Methodological Considerations,” Proc. 22nd Ann. Int'l Symp. Computer Architecture (ISCA '95), pp. 24-37, June 1995.
15 ms
(Ver 2.0)

Marketing Automation Platform Marketing Automation Tool