TokenTM: Efficient Execution of Large Transactions with Hardware Transactional Memory
Found in: Computer Architecture, International Symposium on
By Jayaram Bobba, Neelam Goyal, Mark D. Hill, Michael M. Swift, David A. Wood
Issue Date:June 2008
pp. 127-138
Current hardware transactional memory systems seek to simplify parallel programming, but assume that large transactions are rare, so it is acceptable to penalize their performance or concurrency. However, future programmers may wish to use large transactio...
 
Multicast Snooping: A New Coherence Method Using a Multicast Address Network
Found in: Computer Architecture, International Symposium on
By E. Ender Bilir, Ross M. Dickson, Ying Hu, Manoj Plakal, Daniel J. Sorin, Mark D. Hill, David A. Wood
Issue Date:May 1999
pp. 0294
This paper proposes a new coherence method called
 
Coherence Ordering for Ring-based Chip Multiprocessors
Found in: Microarchitecture, IEEE/ACM International Symposium on
By Michael R. Marty, Mark D. Hill
Issue Date:December 2006
pp. 309-320
Ring interconnects may be an attractive solution for future chip multiprocessors because they can enable faster links than buses and simpler switches than arbitrary switched interconnects. Moreover, a ring naturally orders requests sufficiently to...
 
Improving Multiple-CMP Systems Using Token Coherence
Found in: High-Performance Computer Architecture, International Symposium on
By Michael R. Marty, Jesse D. Bingham, Mark D. Hill, Alan J. Hu, Milo M. K. Martin, David A. Wood
Issue Date:February 2005
pp. 328-339
Improvements in semiconductor technology now enable Chip Multiprocessors (CMPs). As many future computer systems will use one or more CMPs and support shared memory, such systems will have caches that must be kept coherent. Coherence...
 
A
Found in: Computer Architecture, International Symposium on
By Min Xu, Rastislav Bodik, Mark D. Hill
Issue Date:June 2003
pp. 122
Debuggers have been proven indispensable in improving software reliability. Unfortunately, on most real-life software, debuggers fail to deliver their most essential feature - a faithful replay of the execution. The reason is non-determinism caused by mult...
 
Using Destination-Set Prediction to Improve the Latency/Bandwidth Tradeoff in Shared-Memory Multiprocessors
Found in: Computer Architecture, International Symposium on
By Milo M. K. Martin, Pacia J. Harper, Daniel J. Sorin, Mark D. Hill, David A. Wood
Issue Date:June 2003
pp. 206
Destination-set prediction can improve the latency/bandwidth tradeoff in shared-memory multiprocessors. The destination set is the collection of processors that receive a particular coherence request. Snooping protocols send requests to the maximal destina...
 
Token Coherence: Decoupling Performance and Correctness
Found in: Computer Architecture, International Symposium on
By Milo M. K. Martin, Mark D. Hill, David A. Wood
Issue Date:June 2003
pp. 182
Many future shared-memory multiprocessor servers will both target commercial workloads and use highly-integrated
 
Cache Performance of the SPEC92 Benchmark Suite
Found in: IEEE Micro
By Jeffrey D. Gee, Mark D. Hill, Dionisios N. Pnevmatikatos, Alan Jay Smith
Issue Date:July 1993
pp. 17-27
The authors consider whether SPECmarks, the figures of merit obtained from running the SPEC benchmarks under certain specified conditions, accurately indicate the performance to be expected from real, live work loads. Miss ratios for the entire se...
 
LogTM-SE: Decoupling Hardware Transactional Memory from Caches
Found in: High-Performance Computer Architecture, International Symposium on
By Luke Yen, Jayaram Bobba, Michael R. Marty, Kevin E. Moore, Haris Volos, Mark D. Hill, Michael M. Swift, David A. Wood
Issue Date:February 2007
pp. 261-272
This paper proposes a hardware transactional memory (HTM) system called LogTM Signature Edition (LogTM-SE). LogTM-SE uses signatures to summarize a transaction's read- and write-sets and detects conflicts on coherence requests (eager conflict detection). Tr...
 
Notary: Hardware techniques to enhance signatures
Found in: Microarchitecture, IEEE/ACM International Symposium on
By Luke Yen, Stark C. Draper, Mark D. Hill
Issue Date:November 2008
pp. 234-245
Hardware signatures have been recently proposed as an efficient mechanism to detect conflicts amongst concurrently running transactions in transactional memory systems (e.g., Bulk, LogTM-SE, and SigTM). Signatures use fixed hardware to represent an unbound...
 
Dynamic Verification of End-to-End Multiprocessor Invariants
Found in: Dependable Systems and Networks, International Conference on
By Daniel J. Sorin, Mark D. Hill, David A. Wood
Issue Date:June 2003
pp. 281
As implementations of shared memory multiprocessors become more complicated, hardware faults will increasingly cause errors that are difficult or impossible to detect with low-level, localized mechanisms. In this paper, we argue for dynamic verification (i...
 
Rerun: Exploiting Episodes for Lightweight Memory Race Recording
Found in: Computer Architecture, International Symposium on
By Derek R. Hower, Mark D. Hill
Issue Date:June 2008
pp. 265-276
Multiprocessor deterministic replay has many potential uses in the era of multicore computing, including enhanced debugging, fault tolerance, and intrusion detection. While sources of nondeterminism in a uniprocessor can be recorded efficiently in software...
 
Single-Threaded vs. Multithreaded: Where Should We Focus?
Found in: IEEE Micro
By Joel Emer, Mark D. Hill, Yale N. Patt, Joshua J. Yi, Derek Chiou, Resit Sendag
Issue Date:November 2007
pp. 14-24
To continue to offer improvements in application performance, should computer architecture researchers and chip manufacturers focus on improving single-threaded or multithreaded performance? This panel, from the 2007 Workshop on Computer Architecture Resea...
 
Using Interaction Costs for Microarchitectural Bottleneck Analysis
Found in: Microarchitecture, IEEE/ACM International Symposium on
By Brian A. Fields, Rastislav Bodík, Mark D. Hill, Chris J. Newburn
Issue Date:December 2003
pp. 228
Attacking bottlenecks in modern processors is difficult because many microarchitectural events overlap with each other. This parallelism makes it difficult to both (a) assign a cost to an event (e.g., to one of two overlapping cache misses) and (b) assign ...
 
Token Coherence: A New Framework for Shared-Memory Multiprocessors
Found in: IEEE Micro
By Milo M. K. Martin, Mark D. Hill, David A. Wood
Issue Date:November 2003
pp. 108-116
Commercial workload and technology trends are pushing existing shared-memory multiprocessor coherence protocols in divergent directions. Token Coherence provides a framework for new coherence protocols that can reconcile these opposing trends.
 
A Hardware Memory Race Recorder for Deterministic Replay
Found in: IEEE Micro
By Min Xu, Rastislav Bodík, Mark D. Hill
Issue Date:January 2007
pp. 48-55
The Flight Data Recorder continually logs memory races in a multithreaded execution, enabling the deterministic replay invaluable for debugging concurrency errors, yet adds only modest hardware to a multicore chip. In experiments, recording incurred less t...
 
SafetyNet: Improving the Availability of Shared Memory Multiprocessors with Global Checkpoint/Recovery
Found in: Computer Architecture, International Symposium on
By Daniel J. Sorin, Milo M. K. Martin, Mark D. Hill, David A. Wood
Issue Date:May 2002
pp. 0123
We develop an availability solution, called SafetyNet, that uses a unified, lightweight checkpoint/recovery mechanism to support multiple long-latency fault detection schemes. At an abstract level, SafetyNet logically maintains multiple, globally consisten...
 
Implementing Signatures for Transactional Memory
Found in: Microarchitecture, IEEE/ACM International Symposium on
By Daniel Sanchez, Luke Yen, Mark D. Hill, Karthikeyan Sankaralingam
Issue Date:December 2007
pp. 123-133
Transactional Memory (TM) systems must track the read and write sets--items read and written during a transaction--to detect conflicts among concurrent transactions. Several TMs use signatures, which summarize unbounded read/write sets in bounded hardwar...
 
Safe and efficient supervised memory systems
Found in: High-Performance Computer Architecture, International Symposium on
By Jayaram Bobba, Marc Lupon, Mark D. Hill, David A. Wood
Issue Date:February 2011
pp. 369-380
Supervised Memory systems use out-of-band metabits to control and monitor accesses to normal data memory for such purposes as transactional memory and memory typestate trackers. Previous proposals demonstrate the value of supervised memory systems, but hav...
 
Interaction Cost: For When Event Counts Just Don't Add Up
Found in: IEEE Micro
By Brian A. Fields, Mark D. Hill, Chris J. Newburn
Issue Date:November 2004
pp. 57-61
Interaction cost helps improve processor performance and decrease power consumption by identifying when designers can choose among a set of optimizations and when it's necessary to perform them all.
 
Guest Editors' Introduction: Design Challenges for High-Performance Network Interfaces
Found in: Computer
By Andrew A. Chien, Mark D. Hill, Shubhendu S. Mukherjee
Issue Date:November 1998
pp. 42-44
A network interface is a device that allows a computer to communicate with a network. Network interface design has a crucial impact on communication efficiency. It determines the cost of communication actions, moving data, and providing applicatio...
 
Cost-Effective Parallel Computing
Found in: Computer
By David A. Wood, Mark D. Hill
Issue Date:February 1995
pp. 69-72
Large memories can make parallel computing cost-effective even with modest speedups.
 
Using Speculation to Simplify Multiprocessor Design
Found in: Parallel and Distributed Processing Symposium, International
By Daniel J. Sorin, Milo M. K. Martin, Mark D. Hill, David A. Wood
Issue Date:April 2004
pp. 75a
Modern multiprocessors are complex systems that often require years to design and verify. A significant factor is that engineers must allocate a disproportionate share of their effort to ensure that rare corner-case events behave correctly. This p...
 
Slack: Maximizing Performance Under Technological Constraints
Found in: Computer Architecture, International Symposium on
By Brian Fields, Rastislav Bodik, Mark D. Hill
Issue Date:May 2002
pp. 0047
Many emerging processor microarchitectures seek to manage technological constraints (e.g., wire delay, power, and circuit complexity) by resorting to non-uniform designs that provide resources at multiple quality levels (e.g., fast/slow bypass paths, multi...
 
Making Network Interfaces Less Peripheral
Found in: Computer
By Shubhendu S. Mukherjee, Mark D. Hill
Issue Date:October 1998
pp. 70-76
A barrier to delivering improvements in network bandwidth and latency to users is the network interface (NI), which connects a network to the host computer that runs the network software. An NI includes hardware that exposes an internal interface-...
 
Optimistic Simulation of Parallel Architectures Using Program Executables
Found in: Parallel and Distributed Simulation, Workshop on
By Sashikanth Chandrasekaran, Mark D. Hill
Issue Date:May 1996
pp. 0143
A key tool of computer architects is computer simulation at the level of detail that can execute program executables. The time and memory requirements of such simulations can be enormous, especially when the machine under design, the target, is a parallel ma...
 
Amdahl's Law in the Multicore Era
Found in: Computer
By Mark D. Hill, Michael R. Marty
Issue Date:July 2008
pp. 33-38
Augmenting Amdahl's law with a corollary for multicore hardware makes it relevant to future generations of chips with multiple processor cores. Obtaining optimal multicore performance will require further research in both extracting more parallelism and ma...
 
Virtual Hierarchies
Found in: IEEE Micro
By Michael R. Marty, Mark D. Hill
Issue Date:January 2008
pp. 99-109
Abundant cores per chip will encourage a greater use of space sharing, where work stays on a group of cores for long time intervals. Virtual hierarchies can improve performance and performance isolation of space-shared workloads, while still supporting glo...
 
Supporting Very Large DRAM Caches with Compound-Access Scheduling and MissMap
Found in: IEEE Micro
By Gabriel H. Loh, Mark D. Hill
Issue Date:May 2012
pp. 70-78
This work efficiently enables conventional block sizes for very large die-stacked DRAM caches with two innovations: it makes hits faster with compound-access scheduling and misses faster with a MissMap. The combination of these mechanisms enables the new o...
 
Calvin: Deterministic or not? Free will to choose
Found in: High-Performance Computer Architecture, International Symposium on
By Derek R. Hower, Polina Dudnik, Mark D. Hill, David A. Wood
Issue Date:February 2011
pp. 333-334
Most shared memory systems maximize performance by unpredictably resolving memory races. Unpredictable memory races can lead to nondeterminism in parallel programs, which can suffer from hard-to-reproduce heisenbugs. We introduce Calvin, a shared memory mo...
 
StealthTest: Low Overhead Online Software Testing Using Transactional Memory
Found in: Parallel Architectures and Compilation Techniques, International Conference on
By Jayaram Bobba, Weiwei Xiong, Luke Yen, Mark D. Hill, David A. Wood
Issue Date:September 2009
pp. 146-155
Software testing is hard. The emergence of multicore architectures and the proliferation of bug-prone multithreaded software make testing even harder. To this end, researchers have proposed methods to continue testing software after deployment, e.g., in vi...
 
Performance Pathologies in Hardware Transactional Memory
Found in: IEEE Micro
By Jayaram Bobba, Kevin E. Moore, Haris Volos, Luke Yen, Mark D. Hill, Michael M. Swift, David A. Wood
Issue Date:January 2008
pp. 32-41
Transactional memory is a promising approach to ease parallel programming. Hardware transactional memory system designs reflect choices along three key design dimensions: conflict detection, version management, and conflict resolution. The authors identify...
 
A Future of Parallel Computer Architectures
Found in: Parallel Processing, International Conference on
By Mark D. Hill
Issue Date:August 2004
pp. 2
No summary available.
   
Challenges in Computer Architecture Evaluation
Found in: Computer
By Kevin Skadron, Margaret Martonosi, David I. August, Mark D. Hill, David J. Lilja, Vijay S. Pai
Issue Date:August 2003
pp. 30-36
Reasoning about today's tremendously complex computer systems is difficult and developing them is expensive. Detailed software simulations are thus essential for evaluating computer architecture ideas. Industry uses simulation extensively during p...
 
Simulating a $2M Commercial Server on a $2K PC
Found in: Computer
By Alaa R. Alameldeen, Milo M. K. Martin, Carl J. Mauer, Kevin E. Moore, Min Xu, Mark D. Hill, David A. Wood, Daniel J. Sorin
Issue Date:February 2003
pp. 50-57
As dependence on database management systems and Web servers increases, so does the need for them to run reliably and efficiently, goals that rigorous simulations can help achieve. Execution-driven simulation models system hardware. These simulatio...
 
Specifying and Verifying a Broadcast and a Multicast Snooping Cache Coherence Protocol
Found in: IEEE Transactions on Parallel and Distributed Systems
By Daniel J. Sorin, Manoj Plakal, Anne E. Condon, Mark D. Hill, Milo M. K. Martin, David A. Wood
Issue Date:June 2002
pp. 556-578
In this paper, we develop a specification methodology that documents and specifies a cache coherence protocol in eight tables: the states, events, actions, and transitions of the cache and memory controllers. We then use this methodology to specif...
 
Correctly Implementing Value Prediction in Microprocessors that Support Multithreading or Multiprocessing
Found in: Microarchitecture, IEEE/ACM International Symposium on
By Milo M. K. Martin, Daniel J. Sorin, Harold W. Cain, Mark D. Hill, Mikko Lipasti
Issue Date:December 2001
pp. 328
This paper explores the interaction of value prediction with thread-level parallelism techniques, including multithreading and multiprocessing, where correctness is defined by a memory consistency model. Value prediction subtly interacts with the ...
 
Making Pointer-Based Data Structures Cache Conscious
Found in: Computer
By Trishul M. Chilimbi, Mark D. Hill, James R. Larus
Issue Date:December 2000
pp. 67-74
Rapid increases in processor speed and slower increases in memory speed have produced memory access times that exceed the cost of simple, arithmetic operations. The ubiquitous hardware solution to this problem is memory caches, which exploit progr...
 
Wisconsin Wind Tunnel II: A Fast, Portable Parallel Architecture Simulator
Found in: IEEE Concurrency
By Shubhendu S. Mukherjee, Steven K. Reinhardt, Babak Falsafi, Mike Litzkow, Mark D. Hill, David A. Wood, Steven Huss-Lederman, James R. Larus
Issue Date:October 2000
pp. 12-20
Analysis of future parallel computers requires rapidly simulating target designs running realistic workloads. Two techniques have accelerated such simulations: direct execution and using a parallel host. Historically, these techniques have lacked portabili...
 
Multiprocessors Should Support Simple Memory-Consistency Models
Found in: Computer
By Mark D. Hill
Issue Date:August 1998
pp. 28-34
In the future, many computers will contain multiple processors, in part because the marginal cost of adding a few additional processors is so low that only minimal performance gain is needed to make the additional processors cost-effective. Intel, for exam...
 
Where Is Software Headed? A Virtual Roundtable
Found in: Computer
By Dave Power, Bertrand Meyer, Jack Grimes, Mike Potel, Ron Vetter, Phil Laplante, Wolfgang Pree, Gustav Pomberger, Mark D. Hill, James R. Larus, David A. Wood, Hersham El-Rewini, Bruce W. Weide
Issue Date:August 1995
pp. 20-32
To find out where software is headed, we took to the Internet, asking experts in academia and industry to share their vision as to the future of software. For this
 
Guest Editor's Introduction: Hot Chips II Symposium
Found in: IEEE Micro
By Mark D. Hill, David A. Wood
Issue Date:May 1991
pp. 8-9
No summary available.
   
A Case for Direct-Mapped Caches
Found in: Computer
By Mark D. Hill
Issue Date:December 1988
pp. 25-40
Direct-mapped caches are defined, and it is shown that trends toward larger cache sizes and faster hit times favor their use. The arguments are restricted initially to single-level caches in uniprocessors. They are then extended to two-level cache...
 
Two hardware-based approaches for deterministic multiprocessor replay
Found in: Communications of the ACM
By Derek R. Hower, Josep Torrellas, Luis Ceze, Mark D. Hill, Pablo Montesinos
Issue Date:June 2009
pp. 101-104
Modern computer systems are inherently nondeterministic due to a variety of events that occur during an execution, including I/O, interrupts, and DMA fills. The lack of repeatability that arises from this nondeterminism can make it difficult to develop and...
     
Supporting x86-64 address translation for 100s of GPU lanes
Found in: 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)
By Jason Power, Mark D. Hill, David A. Wood
Issue Date:February 2014
pp. 568-578
Efficient memory sharing between CPU and GPU threads can greatly expand the effective set of GPGPU workloads. For increased programmability, this memory should be uniformly virtualized, necessitating compatible address translation support for GPU memory re...
   
QuickRelease: A throughput-oriented approach to release consistency on GPUs
Found in: 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)
By Blake A. Hechtman, Shuai Che, Derek R. Hower, Yingying Tian, Bradford M. Beckmann, Mark D. Hill, Steven K. Reinhardt, David A. Wood
Issue Date:February 2014
pp. 189-200
Graphics processing units (GPUs) have specialized throughput-oriented memory systems optimized for streaming writes with scratchpad memories to capture locality explicitly. Expanding the utility of GPUs beyond graphics encourages designs that simplify pro...
   
Coherent Network Interfaces for Fine-Grain Communication
Found in: Computer Architecture, International Symposium on
By Mark D. Hill, Babak Falsafi, David A. Wood, Shubhendu S. Mukherjee
Issue Date:May 1996
pp. 247
Historically, processor accesses to memory-mapped device registers have been marked uncachable to ensure their visibility to the device. The ubiquity of snooping cache coherence, however, makes it possible for processors and devices to interact with cachab...
 
A Wiki for discussing and promoting best practices in research
Found in: Communications of the ACM
By Donna Baglio, Jean-Luc Gaudiot, Joe Marks, Mark D. Hill, Mary Hall, Paolo Prinetto
Issue Date:September 2006
pp. 63-64
Dealing with the demands of escalating paper submissions is a daunting challenge for conference organizers and program chairs. ACM and IEEE have joined forces to create a forum for sharing ideas on the best ways to handle it all.
     
Efficient support for irregular applications on distributed-memory machines
Found in: Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming (PPOPP '95)
By Anne Rogers, James R. Larus, Joel Saltz, Mark D. Hill, Shamik D. Sharma, Shubhendu S. Mukherjee
Issue Date:July 1995
pp. 201-211
Irregular computation problems underlie many important scientific applications. Although these problems are computationally expensive, and so would seem appropriate for parallel machines, their irregular and unpredictable run-time behavior makes this type ...
     
Heterogeneous-race-free memory models
Found in: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems (ASPLOS '14)
By Benedict R. Gaster, Blake A. Hechtman, Bradford M. Beckmann, David A. Wood, Derek R. Hower, Mark D. Hill, Steven K. Reinhardt
Issue Date:March 2014
pp. 427-440
Commodity heterogeneous systems (e.g., integrated CPUs and GPUs) now support a unified, shared memory address space for all components. Because the latency of global communication in a heterogeneous system can be prohibitively high, heterogeneous systems...
     