LDet: Determinizing Asynchronous Transfer for Postsilicon Debugging
Found in: IEEE Transactions on Computers
By Yunji Chen, Tianshi Chen, Ling Li, Lei Li, Liang Yang, Menghao Su, Weiwu Hu
Issue Date:September 2013
pp. 1732-1744
To efficiently and effectively debug silicon bugs, a promising solution is to determinize the chip, so that the buggy silicon behaviors can be faithfully reproduced on a RTL simulator. In this paper, we propose a novel scheme, named LDet, to determinize a ...
 
Program Regularization in Memory Consistency Verification
Found in: IEEE Transactions on Parallel and Distributed Systems
By Yunji Chen, Lei Li, Tianshi Chen, Ling Li, Lei Wang, Xiaoxue Feng, Weiwu Hu
Issue Date:November 2012
pp. 2163-2174
A widely adopted methodology for verifying the memory subsystem of a Chip Multiprocessor (CMP) is to verify executions of parallel test programs on the CMP against the given memory consistency model, which has been long known to be time consuming in both t...
 
BenchNN: On the broad potential application scope of hardware neural network accelerators
Found in: 2012 IEEE International Symposium on Workload Characterization (IISWC)
By Tianshi Chen, Yunji Chen, Marc Duranton, Qi Guo, Atif Hashmi, Mikko Lipasti, Andrew Nere, Shi Qiu, Michele Sebag, Olivier Temam
Issue Date:November 2012
pp. 36-45
Recent technology trends have indicated that, although device sizes will continue to scale as they have in the past, supply voltage scaling has ended. As a result, future chips can no longer rely on simply increasing the operational core count to improve p...
 
Statistical performance comparisons of computers
Found in: High-Performance Computer Architecture, International Symposium on
By Tianshi Chen, Yunji Chen, Qi Guo, Olivier Temam, Yue Wu, Weiwu Hu
Issue Date:February 2012
pp. 1-12
As a fundamental task in computer architecture research, performance comparison has been continuously hampered by the variability of computer performance. In traditional performance comparisons, the impact of performance variability is usually ignored (i.e...
 
Review for Intensity Inhomogeneity Estimate Method
Found in: Information and Computing Science, International Conference on
By Yunjie Chen, Jianwei Zhang, Wenbing Chen, Jianwei Yang
Issue Date:April 2011
pp. 114-117
Intensity inhomogeneity is often encountered in MR imaging, and a number of techniques have been devised to correct this artifact. This paper attempts to review some of the recent developments in the mathematical methods of Intensity inhomogeneity estima...
 
A Novel Extracting Blob-like Object Method Based on Scale-Space
Found in: Information and Computing Science, International Conference on
By Wenbing Chen, Xia Wang, Qizhou Li, Yunjie Chen
Issue Date:April 2011
pp. 106-109
This paper presents a novel method, which can be used to extract blob-like objects from a blob-like image. The method first uses an interest point detecting algorithm with automatic scale selection to detect interest points and their scales. Secondly, ce...
 
Empirical design bugs prediction for verification
Found in: 2011 Design, Automation & Test in Europe
By Qi Guo, Tianshi Chen, Haihua Shen, Yunji Chen, Yue Wu, Weiwu Hu
Issue Date:March 2011
pp. 1-6
Coverage model is the main technique to evaluate the thoroughness of dynamic verification of a Design-under-Verification (DUV). However, rather than achieving a high coverage, the essential purpose of verification is to expose as many bugs as possible. In ...
   
Video Encoding without Integer-Pel Motion Estimation
Found in: Data Compression Conference
By Shaoli Liu, Ling Li, Yunji Chen, Tianshi Chen
Issue Date:March 2011
pp. 469
Traditionally, motion estimation (ME) consists of three main steps, including spatial-temporal prediction, integer-pel ME and fractional-pel ME. Integer-pel ME, which searches the integer-pel reference position with a low encoding cost, was considered to b...
 
Linear Time Memory Consistency Verification
Found in: IEEE Transactions on Computers
By Weiwu Hu, Yunji Chen, Tianshi Chen, Cheng Qian, Lei Li
Issue Date:April 2012
pp. 502-516
Verifying the execution of a parallel program against a given memory consistency model (memory consistency verification) is a crucial problem in the functional validation of Chip Multiprocessor (CMP). In the absence of additional information, the above pro...
 
On-the-Fly Reduction of Stimuli for Functional Verification
Found in: Asian Test Symposium
By Qi Guo, Tianshi Chen, Haihua Shen, Yunji Chen, Weiwu Hu
Issue Date:December 2010
pp. 448-454
As a primary method for functional verification of microprocessors, simulation-based verification has received extensive studies over the last decade. Most investigations have been dedicated to the generation of stimuli (test cases), while relatively few h...
 
Coverage Directed Test Generation: Godson Experience
Found in: Asian Test Symposium
By Haihua Shen, Wenli Wei, Yunji Chen, Bowen Chen, Qi Guo
Issue Date:November 2008
pp. 321-326
Biased random test generation is one of the most important methods for the verification of modern complex processors. As the complexity of processors grows, the bottleneck remains in generating suitable test programs that meet coverage metrics automaticall...
 
An Enhanced HyperTransport Controller with Cache Coherence Support for Multiple-CMP
Found in: Networking, Architecture, and Storage, International Conference on
By Huandong Wang, Dan Tang, Xiang Gao, Yunji Chen
Issue Date:July 2009
pp. 215-218
HyperTransport link is a high performance IO interface for system connection. In this paper, the architecture of a HyperTransport interface is introduced. This HyperTransport interface realizes efficient HT-AXI bidirectional transformation, where AXI is a p...
 
Efficiency-Aware QoS DRAM Scheduler
Found in: Networking, Architecture, and Storage, International Conference on
By Menghao Su, Xiang Gao, Yunji Chen, Qi Liu, Longbing Zhang
Issue Date:July 2009
pp. 223-226
For most SoCs, off-chip DRAM is an important resource that is shared by many heterogeneous function units (FU). To meet different memory access requirements by these FUs, it is crucial that the memory subsystem is capable of providing different Quality of Ser...
 
Designing an Effective Constraint Solver in Coverage Directed Test Generation
Found in: Embedded Software and Systems, Second International Conference on
By Haihua Shen, Pengyu Wang, Yunji Chen, Qi Guo, Heng Zhang
Issue Date:May 2009
pp. 388-395
As the complexity of processors grows, the bottleneck of verification remains in generating suitable test programs that meet coverage metrics automatically. Coverage directed test generation is a technique to automate the feedback from coverage analysis to...
 
Godson-3: A Scalable Multicore RISC Processor with x86 Emulation
Found in: IEEE Micro
By Weiwu Hu, Jian Wang, Xiang Gao, Yunji Chen, Qi Liu, Guojie Li
Issue Date:March 2009
pp. 17-29
The Godson-3 microprocessor aims at high-throughput server applications, high-performance scientific computing, and high-end embedded applications. It offers a scalable network on chip, hardware support for x86 emulation, and a reconfigurable arch...
 
Chinese Visible Human Brain Image Segmentation
Found in: Image and Signal Processing, Congress on
By Yunjie Chen, Jianwei Zhang, Ann Heng Pheng, Deshen Xia
Issue Date:May 2008
pp. 639-643
The Visible Human data set provides researchers with digital cross-sections of the human body. Many institutions use the Visible Human for research and teaching purposes. In this paper, we would like to share our experience in analysing the Chinese Visible H...
 
FreeRider: Non-local Adaptive Network-on-Chip Routing with Packet-Carried Propagation of Congestion Information
Found in: IEEE Transactions on Parallel and Distributed Systems
By Shaoli Liu, Tianshi Chen, Ling Li, Xi Li, Mingzhe Zhang, Chao Wang, Haibo Meng, Xuehai Zhou, Yunji Chen
Issue Date:August 2014
pp. 1
Non-local adaptive routing techniques, which utilize statuses of both local and distant links to make routing decisions, have recently been shown to be effective solutions for promoting the performance of Network-on-Chip (NoC). The essence of non-local ada...
 
ArchRanker: A ranking approach to design space exploration
Found in: 2014 ACM/IEEE 41st International Symposium on Computer Architecture (ISCA)
By Tianshi Chen, Qi Guo, Ke Tang, Olivier Temam, Zhiwei Xu, Zhi-Hua Zhou, Yunji Chen
Issue Date:June 2014
pp. 85-96
Architectural Design Space Exploration (DSE) is a notoriously difficult problem due to the exponentially large size of the design space and long simulation times. Previously, many studies proposed to formulate DSE as a regression problem which predicts arc...
   
Architecture Support for Task Out-of-order Execution in MPSoCs
Found in: IEEE Transactions on Computers
By Chao Wang, Xi Li, Junneng Zhang, Peng Chen, Yunji Chen, Xuehai Zhou, Ray Cheung
Issue Date:April 2014
pp. 1
Multi-processor system on chip (MPSoC) has been widely applied in embedded systems in the past decades. However, it has posed great challenges to efficiently design and implement a rapid prototype for diverse applications due to heterogeneous instruction s...
   
Statistical Performance Comparisons of Computers
Found in: IEEE Transactions on Computers
By Tianshi Chen, Qi Guo, Olivier Temam, Yue Wu, Yungang Bao, Zhiwei Xu, Yunji Chen
Issue Date:April 2014
pp. 1
As a fundamental task in computer architecture research, performance comparison has been continuously hampered by the variability of computer performance. In traditional performance comparisons, the impact of performance variability is usually ignored (i.e...
 
Performance Prediction for Reconfigurable Processor
Found in: 2012 IEEE 14th Int'l Conf. on High Performance Computing and Communication (HPCC) & 2012 IEEE 9th Int'l Conf. on Embedded Software and Systems (ICESS)
By Daofu Liu, Qi Guo, Tianshi Chen, Ling Li, Yunji Chen
Issue Date:June 2012
pp. 1352-1359
As the Integrated Circuit (IC) process improves, the microprocessors become more and more complicated. Most microprocessors allow part of their important parameters to be reconfigured, such as the frequency, cache prefetch mechanism, and so on. Predicting ...
 
DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning
Found in: Proceedings of the 19th international conference on Architectural support for programming languages and operating systems (ASPLOS '14)
By Chengyong Wu, Jia Wang, Ninghui Sun, Olivier Temam, Tianshi Chen, Yunji Chen, Zidong Du
Issue Date:March 2014
pp. 269-284
Machine-Learning tasks are becoming pervasive in a broad range of domains, and in a broad range of systems (from embedded systems to data centers). At the same time, a small set of machine-learning algorithms (especially Convolutional and Deep Neural Netwo...
     
Effective and efficient microprocessor design space exploration using unlabeled design configurations
Found in: ACM Transactions on Intelligent Systems and Technology (TIST)
By Ling Li, Qi Guo, Tianshi Chen, Yunji Chen, Zhi-Hua Zhou, Zhiwei Xu
Issue Date:December 2013
pp. 1-18
Ever-increasing design complexity and advances of technology impose great challenges on the design of modern microprocessors. One such challenge is to determine promising microprocessor configurations to meet specific design constraints, which is called De...
     
Brief announcement: program regularization in verifying memory consistency
Found in: Proceedings of the 23rd ACM symposium on Parallelism in algorithms and architectures (SPAA '11)
By Cheng Qian, Lei Li, Ling Li, Tianshi Chen, Weiwu Hu, Yunji Chen
Issue Date:June 2011
pp. 265-266
Verifying memory consistency, which is to verify the executions of parallel test programs on a multiprocessor system against the given memory consistency model, is NP-hard. To accelerate verifying memory consistency in practice, we devise a technique calle...
     
LReplay: a pending period based deterministic replay scheme
Found in: Proceedings of the 37th annual international symposium on Computer architecture (ISCA '10)
By Ruiyang Wu, Tianshi Chen, Weiwu Hu, Yunji Chen
Issue Date:June 2010
pp. 72-ff
Debugging parallel program is a well-known difficult problem. A promising method to facilitate debugging parallel program is using hardware support to achieve deterministic replay. A hardware-assisted deterministic replay scheme should have a small log siz...
     
Deterministic Replay Using Global Clock
Found in: ACM Transactions on Architecture and Code Optimization (TACO)
By Weiwu Hu, Yunji Chen
Issue Date:April 2013
pp. 1-28
Debugging parallel programs is a well-known difficult problem. A promising method to facilitate debugging parallel programs is using hardware support to achieve deterministic replay on a Chip Multi-Processor (CMP). As a Design-For-Debug (DFD) feature, a pr...
     
A multi-FPGA based platform for emulating a 100m-transistor-scale processor with high-speed peripherals (abstract only)
Found in: Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays (FPGA '10)
By Dan Tang, Huandong Wang, Weiwu Hu, Xiang Gao, Yunji Chen
Issue Date:February 2010
pp. 283-283
This paper describes a multi-FPGA based platform for emulating the Loongson-2G micro-processor on different mother boards. This platform is developed targeting at verification and evaluation of the Loongson-2G micro-processor, which is the next generation ...
     