MAximum Multicore POwer (MAMPO): an automatic multithreaded synthetic power virus generation framework for multicore systems
Found in: SC Conference
By Karthik Ganesan, Lizy K. John
Issue Date:November 2011
pp. 1-12
The practically attainable worst case power consumption for a computer system is a significant design parameter and it is a very tedious process to determine it by manually writing high power consuming code snippets called power viruses. Previous research ...
 
Autocorrelation analysis: A new and improved method for branch predictability characterization
Found in: IEEE Workload Characterization Symposium
By Jian Chen, Lizy K. John
Issue Date:November 2011
pp. 194-203
Branch predictability characterization not only helps to improve branch prediction but also helps to optimize predicated execution. Branch taken rate and branch transition rate have been proposed to characterize the branch predictability. However, these tw...
 
CantorSim: Simplifying Acceleration of Micro-architecture Simulations
Found in: Modeling, Analysis, and Simulation of Computer Systems, International Symposium on
By Zhibin Yu, Hai Jin, Jian Chen, Lizy K. John
Issue Date:August 2010
pp. 370-377
No summary available.
 
Control Flow Modeling in Statistical Simulation for Accurate and Efficient Processor Design Studies
Found in: Computer Architecture, International Symposium on
By Lieven Eeckhout, Robert H. Bell Jr., Bastiaan Stougie, Koen De Bosschere, Lizy K. John
Issue Date:June 2004
pp. 350
Designing a new microprocessor is extremely time-consuming. One of the contributing reasons is that computer designers rely heavily on detailed architectural simulations, which are very time-consuming. Recent work has focused on statistical simulation to a...
 
Scaling to the End of Silicon with EDGE Architectures
Found in: Computer
By Doug Burger, Stephen W. Keckler, Kathryn S. McKinley, Mike Dahlin, Lizy K. John, Calvin Lin, Charles R. Moore, James Burrill, Robert G. McDonald, William Yoder, the TRIPS Team
Issue Date:July 2004
pp. 44-55
Post-RISC microprocessor designs must introduce new ISAs to address the challenges that modern CMOS technologies pose while also exploiting the massive levels of integration now possible. To meet these challenges, the TRIPS Team at the University of Texas ...
 
Complete System Power Estimation Using Processor Performance Events
Found in: IEEE Transactions on Computers
By W. Lloyd Bircher, Lizy K. John
Issue Date:April 2012
pp. 563-577
This paper proposes the use of microprocessor performance counters for online measurement of complete system power consumption. The approach takes advantage of the ...
 
Flow Migration on Multicore Network Processors: Load Balancing While Minimizing Packet Reordering
Found in: 2013 42nd International Conference on Parallel Processing (ICPP)
By Muhammad Faisal Iqbal, Jim Holt, Jee Ho Ryoo, Lizy K. John, Gustavo de Veciana
Issue Date:October 2013
pp. 150-159
With ever increasing network traffic rates, multicore architectures for network processors have successfully provided performance improvements through high parallelism. However, naively allocating the network traffic to multiple cores without considering d...
 
Exploring the Application Behavior Space Using Parameterized Synthetic Benchmarks
Found in: Parallel Architectures and Compilation Techniques, International Conference on
By Ajay M. Joshi, Lieven Eeckhout, Lizy K. John
Issue Date:September 2007
pp. 412
Computer architects and researchers face several challenges when using benchmarks in industry product development and academic research, namely: (1) Benchmarks only represent a sample of the application behavior space, (2) Benchmarks are rigid and measure ...
   
Predictive Heterogeneity-Aware Application Scheduling for Chip Multiprocessors
Found in: IEEE Transactions on Computers
By Jian Chen, Arun Arvind Nair, Lizy K. John
Issue Date:February 2014
pp. 435-447
Single-ISA heterogeneous chip multiprocessor (CMP) is not only an attractive design paradigm but also is expected to occur as a consequence of manufacturing imperfections, such as process variation and permanent faults. Process variation could cause cores ...
 
Store-Load-Branch (SLB) predictor: A compiler assisted branch prediction for data dependent branches
Found in: 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA)
By M. Umar Farooq, Khubaib, Lizy K. John
Issue Date:February 2013
pp. 59-70
Data-dependent branches constitute the single biggest source of remaining branch mispredictions. Typically, data-dependent branches are associated with program data structures, and follow a store-load-branch execution sequence. A set of memory locations is writt...
 
Power and performance analysis of network traffic prediction techniques
Found in: Performance Analysis of Systems and Software, IEEE International Symposium on
By Muhammad Faisal Iqbal, Lizy K. John
Issue Date:April 2012
pp. 112-113
We study power and performance characteristics of different traffic predictors for online one-step-ahead predictions. The goal is to identify a predictor with reasonable accuracy and low power consumption. Our experiments on a large number of real network ...
 
Hierarchically characterizing CUDA program behavior
Found in: IEEE Workload Characterization Symposium
By Zhibin Yu, Hai Jin, Nilanjan Goswami, Tao Li, Lizy K. John
Issue Date:November 2011
pp. 76
CUDA has become a very popular programming paradigm in parallel computing area. However, very little work has been done for characterizing CUDA kernels. In this work, we measure the thread level performance, collect the basic block level characteristics, a...
 
Elastic Refresh: Techniques to Mitigate Refresh Penalties in High Density Memory
Found in: Microarchitecture, IEEE/ACM International Symposium on
By Jeffrey Stuecheli, Dimitris Kaseridis, Hillery C. Hunter, Lizy K. John
Issue Date:December 2010
pp. 375-384
High density memory is becoming more important as many execution streams are consolidated onto single chip many-core processors. DRAM is ubiquitous as a main memory technology, but while DRAM’s per-chip density and frequency continue to scale, the time req...
 
Applying Statistical Sampling for Fast and Efficient Simulation of Commercial Workloads
Found in: IEEE Transactions on Computers
By Ajay Joshi, Yue Luo, Lizy K. John
Issue Date:November 2007
pp. 1520-1533
Commercial workloads form an important class of applications and have performance characteristics that are distinct from scientific and technical benchmarks such as SPEC CPU. However, due to the prohibitive simulation t...
 
Simulating Commercial Java Throughput Workloads: A Case Study
Found in: Computer Design, International Conference on
By Yue Luo, Lizy K. John
Issue Date:October 2005
pp. 393-398
We present our study on the simulation methodology for SPECjbb2000. The result shows that CPI can be practically used as a performance metric in place of throughput in the simulation. It is shown that SimPoint can successfully identify phases. Wit...
 
Self-Monitored Adaptive Cache Warm-Up for Microprocessor Simulation
Found in: Computer Architecture and High Performance Computing, Symposium on
By Yue Luo, Lizy K. John, Lieven Eeckhout
Issue Date:October 2004
pp. 10-17
Simulation is the most important tool for computer architects to evaluate the performance of new computer designs. However, detailed simulation is extremely time consuming. Sampling is one of the techniques that effectively reduce simulation time. In order...
 
Improving Server Performance on Transaction Processing Workloads by Enhanced Data Placement
Found in: Computer Architecture and High Performance Computing, Symposium on
By Juan Rubio, Charles Lefurgy, Lizy K. John
Issue Date:October 2004
pp. 84-91
Modern servers access large volumes of data while running commercial workloads. The data is typically spread among several storage devices (e.g. disks). Carefully placing the data across the storage devices can minimize costly remote accesses and improve p...
 
Efficiently Evaluating Speedup Using Sampled Processor Simulation
Found in: IEEE Computer Architecture Letters
By Yue Luo, Lizy K. John
Issue Date:January 2004
pp. N/A
Cycle accurate simulation of processors is extremely time consuming. Sampling can greatly reduce simulation time while retaining good accuracy. Previous research on sampled simulation has been focusing on the accuracy of CPI. However, most simulations are ...
 
Improving Dynamic Cluster Assignment for Clustered Trace Cache Processors
Found in: Computer Architecture, International Symposium on
By Ravi Bhargava, Lizy K. John
Issue Date:June 2003
pp. 264
This work examines dynamic cluster assignment for a clustered trace cache processor (CTCP). Previously proposed cluster assignment techniques run into unique problems as issue width and cluster count increase. Realistic design conditions, such as variable ...
 
Cost-Effective Hardware Acceleration of Multimedia Applications
Found in: Computer Design, International Conference on
By Deependra Talla, Lizy K. John
Issue Date:September 2001
pp. 0415
General-purpose microprocessors augmented with SIMD execution units enhance multimedia applications by exploiting data level parallelism. However, supporting/overhead related instructions (instructions necessary to feed the SIMD execution units ...
 
Evaluating Signal Processing and Multimedia Applications on SIMD, VLIW and Superscalar Architectures
Found in: Computer Design, International Conference on
By Deependra Talla, Lizy K. John, Viktor Lapinskii, Brian L. Evans
Issue Date:September 2000
pp. 163
This paper aims to provide a quantitative understanding of the performance of DSP and multimedia applications on very long instruction word (VLIW), single instruction multiple data (SIMD), and superscalar processors. We evaluate the performance of the VLIW...
 
Novel Memory Bus Driver/Receiver Architecture for Higher Throughput
Found in: VLSI Design, International Conference on
By Gregory E. Beers, Lizy K. John
Issue Date:January 1998
pp. 259
A high speed memory bus interface which enables greater throughput for data reads and writes is described in this paper. Current mode CMOS logic synthesis methods are used to implement multi-valued logic (MVL) functions to create a high bandwidth bus. First...
 
Coordinating DRAM and Last-Level-Cache Policies with the Virtual Write Queue
Found in: IEEE Micro
By Jeffrey Stuecheli, Dimitris Kaseridis, Lizy K. John, David Daly, Hillery C. Hunter
Issue Date:January 2011
pp. 90-98
To alleviate bottlenecks in this era of many-core architectures, the authors propose a virtual write queue to expand the memory controller's scheduling window through visibility of cache behavior. Awareness of the physical main memory layout and a...
 
System-level max power (SYMPO): a systematic approach for escalating system-level power consumption using synthetic benchmarks
Found in: Proceedings of the 19th international conference on Parallel architectures and compilation techniques (PACT '10)
By Dimitris Kaseridis, Jungho Jo, Karthik Ganesan, Lizy K. John, W. Lloyd Bircher, Zhibin Yu
Issue Date:September 2010
pp. 19-28
To effectively design a computer system for the worst case power consumption scenario, system architects often use hand-crafted maximum power consuming benchmarks at the assembly language level. These stressmarks, also called power viruses, are very tediou...
     
The virtual write queue: coordinating DRAM and last-level cache policies
Found in: Proceedings of the 37th annual international symposium on Computer architecture (ISCA '10)
By David Daly, Dimitris Kaseridis, Hillery C. Hunter, Jeffrey Stuecheli, Lizy K. John
Issue Date:June 2010
pp. 72-ff
In computer architecture, caches have primarily been viewed as a means to hide memory latency from the CPU. Cache policies have focused on anticipating the CPU's data needs, and are mostly oblivious to the main memory. In this paper, we demonstrate that th...
     
Efficient program scheduling for heterogeneous multi-core processors
Found in: Proceedings of the 46th Annual Design Automation Conference (DAC '09)
By Jian Chen, Lizy K. John
Issue Date:July 2009
pp. 927-930
Heterogeneous multicore processors promise high execution efficiency under diverse workloads, and program scheduling is critical in exploiting this efficiency. This paper presents a novel method to leverage the inherent characteristics of a program for sch...
     
Distilling the essence of proprietary workloads into miniature benchmarks
Found in: ACM Transactions on Architecture and Code Optimization (TACO)
By Ajay Joshi, Lieven Eeckhout, Lizy K. John, Robert H. Bell
Issue Date:August 2008
pp. 1-33
Benchmarks set standards for innovation in computer architecture research and industry product development. Consequently, it is of paramount importance that these workloads are representative of real-world applications. However, composing such representati...
     
Analysis of dynamic power management on multi-core processors
Found in: Proceedings of the 22nd annual international conference on Supercomputing (ICS '08)
By Lizy K. John, W. Lloyd Bircher
Issue Date:June 2008
pp. 3-3
Power management of multi-core processors is extremely important because it allows power/energy savings when all cores are not used. OS directed power management according to ACPI (Advanced Power and Configurations Interface) specifications is the common a...
     
Analysis of redundancy and application balance in the SPEC CPU2006 benchmark suite
Found in: Proceedings of the 34th annual international symposium on Computer architecture (ISCA '07)
By Aashish Phansalkar, Ajay Joshi, Lizy K. John
Issue Date:June 2007
pp. 412-423
The recently released SPEC CPU2006 benchmark suite is expected to be used by computer designers and computer architecture researchers for pre-silicon early design analysis. Partial use of benchmark suites by researchers, due to simulation time constraints,...
     
Performance prediction based on inherent program similarity
Found in: Proceedings of the 15th international conference on Parallel architectures and compilation techniques (PACT '06)
By Aashish Phansalkar, Andy Georges, Kenneth Hoste, Koen De Bosschere, Lieven Eeckhout, Lizy K. John
Issue Date:September 2006
pp. 114-122
A key challenge in benchmarking is to predict the performance of an application of interest on a number of platforms in order to determine which platform yields the best performance. This paper proposes an approach for doing this. We measure a number of mi...
     
Impact of virtual execution environments on processor energy consumption and hardware adaptation
Found in: Proceedings of the 2nd international conference on Virtual execution environments (VEE '06)
By Lizy K. John, Shiwen Hu
Issue Date:June 2006
pp. 100-110
During recent years, microprocessor energy consumption has been surging and efforts to reduce power and energy have received a lot of attention. At the same time, virtual execution environments (VEEs), such as Java virtual machines, have grown in popularit...
     
Latency and energy aware value prediction for high-frequency processors
Found in: Proceedings of the 16th international conference on Supercomputing (ICS '02)
By Lizy K. John, Ravi Bhargava
Issue Date:June 2002
pp. 45-56
This work addresses the issues of access latency and energy consumption in value predictor design for high-frequency, wide-issue microprocessors. Previous value prediction research allows for generous assumptions regarding table configurations and access c...
     
Improving Java performance using hardware translation
Found in: Proceedings of the 15th international conference on Supercomputing (ICS '01)
By Lizy K. John, Ramesh Radhakrishnan, Ravi Bhargava
Issue Date:June 2001
pp. 427-439
State of the art Java Virtual Machines with Just-In-Time (JIT) compilers make use of advanced compiler techniques, run-time profiling and adaptive compilation to improve performance. However, these techniques for alleviating performance bottlenecks are mor...
     