Displaying 1-16 out of 16 total
Ispike: A Post-link Optimizer for the Intel® Itanium® Architecture
Found in: Code Generation and Optimization, IEEE/ACM International Symposium on
By Chi-Keung Luk, Robert Muth, Harish Patil, Robert Cohn, Geoff Lowney
Issue Date:March 2004
pp. 15
Ispike is a post-link optimizer developed for the Intel® Itanium Processor Family (IPF) processors. The IPF architecture poses both opportunities and challenges to post-link optimizations. IPF offers a rich set of performance counters to collect detailed prof...
 
Portable trace compression through instruction interpretation
Found in: Performance Analysis of Systems and Software, IEEE International Symposium on
By Svilen Kanev, Robert Cohn
Issue Date:April 2011
pp. 107-116
Execution traces are a useful tool in studying processor and program behavior. However, the amount of information that needs to be stored makes them impractical in uncompressed form. This is especially true for full-state traces that can capture up to kilo...
 
Selecting Operator Queries Using Expected Myopic Gain
Found in: Web Intelligence and Intelligent Agent Technology, IEEE/WIC/ACM International Conference on
By Robert Cohn, Michael Maxim, Edmund Durfee, Satinder Singh
Issue Date:September 2010
pp. 40-47
When its human operator cannot continuously supervise (much less teleoperate) an agent, the agent should be able to recognize its limitations and ask for help when it risks making autonomous decisions that could significantly surprise and disappoint the op...
 
Analyzing Parallel Programs with Pin
Found in: Computer
By Moshe (Maury) Bach, Mark Charney, Robert Cohn, Elena Demikhovsky, Tevi Devor, Kim Hazelwood, Aamer Jaleel, Chi-Keung Luk, Gail Lyons, Harish Patil, Ady Tal
Issue Date:March 2010
pp. 34-41
No summary available.
 
Virtual Program Counter (VPC) Prediction: Very Low Cost Indirect Branch Prediction Using Conditional Branch Prediction Hardware
Found in: IEEE Transactions on Computers
By Hyesoon Kim, José A. Joao, Onur Mutlu, Chang Joo Lee, Yale N. Patt, Robert Cohn
Issue Date:September 2009
pp. 1153-1170
Indirect branches have become increasingly common in modular programs written in modern object-oriented languages and virtual-machine-based runtime systems. Unfortunately, the prediction accuracy of indirect branches has not improved as much as that of con...
 
Persistent Code Caching: Exploiting Code Reuse Across Executions and Applications
Found in: Code Generation and Optimization, IEEE/ACM International Symposium on
By Vijay Janapa Reddi, Dan Connors, Robert Cohn, Michael D. Smith
Issue Date:March 2007
pp. 74-88
Run-time compilation systems are challenged with the task of translating a program's instruction stream while maintaining low overhead. While software managed code caches are utilized to amortize translation costs, they are ineffective for programs with sh...
 
A Cross-Architectural Interface for Code Cache Manipulation
Found in: Code Generation and Optimization, IEEE/ACM International Symposium on
By Kim Hazelwood, Robert Cohn
Issue Date:March 2006
pp. 17-27
Software code caches help amortize the overhead of dynamic binary transformation by enabling reuse of transformed code. Since code caches contain a potentially altered copy of every instruction that executes, run-time access to a code cache can be ...
 
Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation
Found in: Microarchitecture, IEEE/ACM International Symposium on
By Harish Patil, Robert Cohn, Mark Charney, Rajiv Kapoor, Andrew Sun, Anand Karunanidhi
Issue Date:December 2004
pp. 81-92
Detailed modeling of the performance of commercial applications is difficult. The applications can take a very long time to run on real hardware and it is impractical to simulate them to completion on performance models. Furthermore, these applications hav...
 
Code Layout Optimizations for Transaction Processing Workloads
Found in: Computer Architecture, International Symposium on
By Alex Ramirez, Josep Larriba-Pey, Mateo Valero, Luiz André Barroso, Kourosh Gharachorloo, Robert Cohn, P. Geoffrey Lowney
Issue Date:July 2001
pp. 155
Commercial applications such as databases and Web servers constitute the most important market segment for high-performance servers. Among these applications, on-line transaction processing (OLTP) workloads provide a challenging set of requiremen...
 
Pin: building customized program analysis tools with dynamic instrumentation
Found in: Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation (PLDI '05)
By Artur Klauser, Chi-Keung Luk, Geoff Lowney, Harish Patil, Kim Hazelwood, Robert Cohn, Robert Muth, Steven Wallace, Vijay Janapa Reddi
Issue Date:June 2005
pp. 280-280
Robust and powerful software instrumentation tools are essential for program analysis tasks such as profiling, performance evaluation, and bug detection. To meet this need, we have developed a new instrumentation system called Pin. Our goals are to provide...
     
Profile-guided post-link stride prefetching
Found in: Proceedings of the 16th international conference on Supercomputing (ICS '02)
By Chi-Keung Luk, Harish Patil, P. Geoffrey Lowney, Richard Weiss, Robert Cohn, Robert Muth
Issue Date:June 2002
pp. 167-178
Data prefetching is an effective approach to addressing the memory latency problem. While a few processors have implemented hardware-based data prefetching, the majority of modern processors support data-prefetch instructions and rely on compilers to autom...
     
Scalable support for multithreaded applications on dynamic binary instrumentation systems
Found in: Proceedings of the 2009 international symposium on Memory management (ISMM '09)
By Greg Lueck, Kim Hazelwood, Robert Cohn
Issue Date:June 2009
pp. 70-73
Dynamic binary instrumentation systems are used to inject or modify arbitrary instructions in existing binary applications; several such systems have been developed over the past decade. Much of the literature describing the internal architecture and perfo...
     
Automatic logging of operating system effects to guide application-level architecture simulation
Found in: Proceedings of the joint international conference on Measurement and modeling of computer systems (SIGMETRICS '06/Performance '06)
By Brad Calder, Cristiano Pereira, Harish Patil, Robert Cohn, Satish Narayanasamy
Issue Date:June 2006
pp. 1928-1929
Modern architecture research relies heavily on application-level detailed pipeline simulation. A time consuming part of building a simulator is correctly emulating the operating system effects, which is required even if the goal is to simulate just the app...
     
Source level debugging of automatically parallelized code
Found in: Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging (PADD '91)
By Robert Cohn
Issue Date:May 1991
pp. 109-116
We describe a novel approach to the design of portable integrated debugging tools for concurrent languages. Our design partitions the tools set into two categories. The language specific tools take into account the particular features of a programming lang...
     
Supporting systolic and memory communication in iWarp
Found in: Proceedings of the 17th annual international symposium on Computer Architecture (ISCA '90)
By Brian Moore, Craig Peterson, George Cox, H. T. Kung, Jim Susman, Jim Sutton, John Urbanski, Jon Webb, Margie Levine, Monica Lam, Robert Cohn, Shekhar Borkar, Thomas Gross, Wire Moore
Issue Date:May 1990
pp. 309-319
iWarp is a parallel architecture developed jointly by Carnegie Mellon University and Intel Corporation. The iWarp communication system supports two widely used interprocessor communication styles: memory communication and systolic communication. This paper...
     
Architecture and compiler tradeoffs for a long instruction word processor
Found in: Proceedings of the third international conference on Architectural support for programming languages and operating systems (ASPLOS-III)
By Monica Lam, Robert Cohn, Thomas Gross
Issue Date:April 1989
pp. 205-209
A very long instruction word (VLIW) processor exploits parallelism by controlling multiple operations in a single instruction word. This paper describes the architecture and compiler tradeoffs in the design of iWarp, a VLIW single-chip microprocessor devel...
     