BugNet: Continuously Recording Program Execution for Deterministic Replay Debugging
Found in: Computer Architecture, International Symposium on
By Satish Narayanasamy, Gilles Pokam, Brad Calder
Issue Date:June 2005
pp. 284-295
Significant time is spent by companies trying to reproduce and fix the bugs that occur for released code. To assist developers, we propose the BugNet architecture to continuously record information on production runs. The information collected bef...
 
A Dependency Chain Clustered Microarchitecture
Found in: Parallel and Distributed Processing Symposium, International
By Satish Narayanasamy, Hong Wang, Perry Wang, John Shen, Brad Calder
Issue Date:April 2005
pp. 21b
In this paper we explore a new clustering approach for reducing the complexity of wide issue in-order processors based on EPIC architectures. Complexity effectiveness is achieved by heavily clustering the pipeline from decode to commit stage without the ne...
 
Catching Accurate Profiles in Hardware
Found in: High-Performance Computer Architecture, International Symposium on
By Satish Narayanasamy, Timothy Sherwood, Suleyman Sair, Brad Calder, George Varghese
Issue Date:February 2003
pp. 269
Run-time optimization is one of the most important ways of getting performance out of modern processors. Techniques such as prefetching, trace caching, memory disambiguation etc., are all based upon the principle of observation followed by adaptat...
 
A Safety-First Approach to Memory Models
Found in: IEEE Micro
By Abhayendra Singh, Satish Narayanasamy, Daniel Marino, Todd Millstein, Madanlal Musuvathi
Issue Date:May 2013
pp. 96-104
Recent efforts to standardize concurrency semantics for programming languages require programmers to explicitly annotate all memory accesses that can participate in a data race ("unsafe" accesses). This requirement allows the compiler and hardwar...
 
Offline symbolic analysis to infer Total Store Order
Found in: High-Performance Computer Architecture, International Symposium on
By Dongyoon Lee, Mahmoud Said, Satish Narayanasamy, Zijiang Yang
Issue Date:February 2011
pp. 357-358
Ability to record and replay an execution can significantly help programmers debug their programs, especially parallel programs. Deterministically replaying a multiprocessor's execution under a relaxed memory model has remained a challenging problem. This...
 
Tolerating Concurrency Bugs Using Transactions as Lifeguards
Found in: Microarchitecture, IEEE/ACM International Symposium on
By Jie Yu, Satish Narayanasamy
Issue Date:December 2010
pp. 263-274
Parallel programming is hard, because it is impractical to test all possible thread interleavings. One promising approach to improve a multi-threaded program’s reliability is to constrain a production run’s thread interleavings in such a way that untested ...
 
Patching Processor Design Errors with Programmable Hardware
Found in: IEEE Micro
By Smruti Sarangi, Satish Narayanasamy, Bruce Carneal, Abhishek Tiwari, Brad Calder, Josep Torrellas
Issue Date:January 2007
pp. 12-25
Equipping processors with programmable hardware to patch design errors lets manufacturers release regular hardware patches, avoiding costly chip recalls and potentially speeding time to market. For each error detected, the manufacturer creates a fingerprin...
 
BugNet: Recording Application-Level Execution for Deterministic Replay Debugging
Found in: IEEE Micro
By Satish Narayanasamy, Gilles Pokam, Brad Calder
Issue Date:January 2006
pp. 100-109
With software's increasing complexity, providing efficient hardware support for software debugging is critical. Hardware support is necessary to observe and capture, with little or no overhead, the exact execution of a program. Providing this ability to de...
 
Creating Converged Trace Schedules Using String Matching
Found in: High-Performance Computer Architecture, International Symposium on
By Satish Narayanasamy, Yuanfang Hu, Suleyman Sair, Brad Calder
Issue Date:February 2004
pp. 210
This paper focuses on generating efficient software pipelined schedules for in-order machines, which we call Converged Trace Schedules. For a candidate loop, we form a string of trace block identifiers by hashing together addresses of aggressively schedule...
 
Catnap: energy proportional multiple network-on-chip
Found in: Proceedings of the 40th Annual International Symposium on Computer Architecture (ISCA '13)
By Reetuparna Das, Ronald G. Dreslinski, Satish Narayanasamy, Sudhir K. Satpathy
Issue Date:June 2013
pp. 320-331
Multiple networks have been used in several processor implementations to scale bandwidth and ensure protocol-level deadlock freedom for different message classes. In this paper, we observe that a multiple-network design is also attractive from a power pers...
     
Parallelizing data race detection
Found in: Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems (ASPLOS '13)
By Benjamin Wester, David Devecsery, Jason Flinn, Peter M. Chen, Satish Narayanasamy
Issue Date:March 2013
pp. 27-38
Detecting data races in multithreaded programs is a crucial part of debugging such programs, but traditional data race detectors are too slow to use routinely. This paper shows how to speed up race detection by spreading the work across multiple cores. Our...
     
Maple: a coverage-driven testing tool for multithreaded programs
Found in: Proceedings of the ACM international conference on Object oriented programming systems languages and applications (OOPSLA '12)
By Cristiano Pereira, Gilles Pokam, Jie Yu, Satish Narayanasamy
Issue Date:October 2012
pp. 485-502
Testing multithreaded programs is a hard problem, because it is challenging to expose those rare interleavings that can trigger a concurrency bug. We propose a new thread interleaving coverage-driven testing tool called Maple that seeks to expose untested ...
     
End-to-end sequential consistency
Found in: Proceedings of the 39th Annual International Symposium on Computer Architecture (ISCA '12)
By Abhayendra Singh, Daniel Marino, Madanlal Musuvathi, Satish Narayanasamy, Todd Millstein
Issue Date:June 2012
pp. 524-535
Sequential consistency (SC) is arguably the most intuitive behavior for a shared-memory multithreaded program. It is widely accepted that language-level SC could significantly improve programmability of a multiprocessor system. However, efficiently support...
     
DoublePlay: Parallelizing Sequential Logging and Replay
Found in: ACM Transactions on Computer Systems (TOCS)
By Benjamin Wester, Dongyoon Lee, Jason Flinn, Peter M. Chen, Satish Narayanasamy, Jessica Ouyang, Kaushik Veeraraghavan
Issue Date:February 2012
pp. 1-24
Deterministic replay systems record and reproduce the execution of a hardware or software system. In contrast to replaying execution on uniprocessors, deterministic replay on multiprocessors is very challenging to implement efficiently because of the need ...
     
A case for an SC-preserving compiler
Found in: Proceedings of the 32nd ACM SIGPLAN conference on Programming language design and implementation (PLDI '11)
By Abhayendra Singh, Daniel Marino, Madanlal Musuvathi, Satish Narayanasamy, Todd Millstein
Issue Date:June 2011
pp. 123-128
The most intuitive memory consistency model for shared-memory multi-threaded programming is sequential consistency (SC). However, current concurrent programming languages support a relaxed model, as such relaxations are deemed necessary for enabling import...
     
Respec: efficient online multiprocessor replay via speculation and external determinism
Found in: Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems (ASPLOS '10)
By Benjamin Wester, Dongyoon Lee, Jason Flinn, Kaushik Veeraraghavan, Peter M. Chen, Satish Narayanasamy
Issue Date:March 2010
pp. 222-230
Deterministic replay systems record and reproduce the execution of a hardware or software system. While it is well known how to replay uniprocessor systems, replaying shared memory multiprocessor systems at low overhead on commodity hardware is still an op...
     
Offline symbolic analysis for multi-processor execution replay
Found in: Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (Micro-42)
By Cristiano Pereira, Dongyoon Lee, Mahmoud Said, Satish Narayanasamy, Zijiang Yang
Issue Date:December 2009
pp. 564-575
Ability to replay a program's execution on a multi-processor system can significantly help parallel programming. To replay a shared-memory multi-threaded program, existing solutions record its program input (I/O, DMA, etc.) and the shared-memory dependenci...
     
A case for an interleaving constrained shared-memory multi-processor
Found in: Proceedings of the 36th annual international symposium on Computer architecture (ISCA '09)
By Jie Yu, Satish Narayanasamy
Issue Date:June 2009
pp. 70-73
Shared-memory multi-threaded programming is inherently more difficult than single-threaded programming. The main source of complexity is that the threads of an application can interleave in so many different ways. To ensure correctness, a programmer has t...
     
LiteRace: effective sampling for lightweight data-race detection
Found in: Proceedings of the 2009 ACM SIGPLAN conference on Programming language design and implementation (PLDI '09)
By Daniel Marino, Madanlal Musuvathi, Satish Narayanasamy
Issue Date:June 2009
pp. 1-22
Data races are one of the most common and subtle causes of pernicious concurrency bugs. Static techniques for preventing data races are overly conservative and do not scale well to large programs. Past research has produced several dynamic data race detect...
     
Automatically classifying benign and harmful data races using replay analysis
Found in: Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation (PLDI '07)
By Andrew Edwards, Brad Calder, Jordan Tigani, Satish Narayanasamy, Zhenghao Wang
Issue Date:June 2007
pp. 22-31
Many concurrency bugs in multi-threaded programs are due to data races. There have been many efforts to develop static and dynamic mechanisms to automatically find the data races. Most of the prior work has focused on finding the data races and eliminating ...
     
Transient fault prediction based on anomalies in processor events
Found in: Proceedings of the conference on Design, automation and test in Europe (DATE '07)
By Ayse K. Coskun, Brad Calder, Satish Narayanasamy
Issue Date:April 2007
pp. 1140-1145
Future microprocessors will be highly susceptible to transient errors as the sizes of transistors decrease due to CMOS scaling. Prior techniques advocated full scale structural or temporal redundancy to achieve fault tolerance. Though they can provide comp...
     
Unbounded page-based transactional memory
Found in: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems (ASPLOS-XII)
By Brad Calder, Ganesh Venkatesh, Gilles Pokam, Jack Sampson, Michael Van Biesbrouck, Osvaldo Colavin, Satish Narayanasamy, Weihaw Chuang
Issue Date:October 2006
pp. 109-es
Exploiting thread level parallelism is paramount in the multicore era. Transactions enable programmers to expose such parallelism by greatly simplifying the multi-threaded programming model. Virtualized transactions (unbounded in space and time) are desira...
     
Recording shared memory dependencies using strata
Found in: Proceedings of the 12th international conference on Architectural support for programming languages and operating systems (ASPLOS-XII)
By Brad Calder, Cristiano Pereira, Satish Narayanasamy
Issue Date:October 2006
pp. 109-es
Significant time is spent by companies trying to reproduce and fix bugs. BugNet and FDR are recent architecture proposals that provide architecture support for deterministic replay debugging. They focus on continuously recording information about the progr...
     
Automatic logging of operating system effects to guide application-level architecture simulation
Found in: Proceedings of the joint international conference on Measurement and modeling of computer systems (SIGMETRICS '06/Performance '06)
By Brad Calder, Cristiano Pereira, Harish Patil, Robert Cohn, Satish Narayanasamy
Issue Date:June 2006
pp. 1928-1929
Modern architecture research relies heavily on application-level detailed pipeline simulation. A time consuming part of building a simulator is correctly emulating the operating system effects, which is required even if the goal is to simulate just the app...
     