SOLE: Speculative one-cycle load execution with scalability, high-performance and energy-efficiency
Found in: 2012 IEEE 30th International Conference on Computer Design (ICCD 2012)
By Zhenhao Zhang, Dong Tong, Xiaoyin Wang, Jiangfang Yi, Keyi Wang
Issue Date:September 2012
pp. 291-296
Conventional superscalar processors usually contain a large CAM-based LSQ (load/store queue) with poor scalability and high energy consumption. Recent proposals focus only on improving the LSQ scalability to increase the in-flight instruction capacity, but...
Improving inclusive cache performance with two-level eviction priority
Found in: 2012 IEEE 30th International Conference on Computer Design (ICCD 2012)
By Lingda Li, Dong Tong, Zichao Xie, Junlin Lu, Xu Cheng
Issue Date:September 2012
pp. 387-392
Inclusive cache hierarchies are widely adopted in modern processors, since they can simplify the implementation of cache coherence. However, they sacrifice some performance to guarantee inclusion. Many recent intelligent management policies are proposed to ...
S/DC: A storage and energy efficient data prefetcher
Found in: 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE 2012)
By Xianglei Dang, Xiaoyin Wang, Dong Tong, Junlin Lu, Jiangfang Yi, Keyi Wang
Issue Date:March 2012
pp. 461-466
Energy efficiency is becoming a major constraint in processor designs. Every component of the processor should be reconsidered to reduce wasted energy and area. Prefetching is an important technique for tolerating memory latency. Prefetcher designs have im...
TAP prediction: Reusing conditional branch predictor for indirect branches with Target Address Pointers
Found in: Computer Design, International Conference on
By Zichao Xie, Dong Tong, Mingkai Huang, Xiaoyin Wang, Qinqing Shi, Xu Cheng
Issue Date:October 2011
pp. 119-126
Indirect-branch prediction is becoming more important for modern processors as more programs are written in object-oriented languages. Previous hardware-based indirect-branch predictors generally require significant hardware storage or use aggressive algor...
The Remote Monitor and Control System for Motion Controller Based on the Ethernet
Found in: Intelligent System Design and Engineering Application, International Conference on
By Wu Xiao-jun, Sun Shu-dong, Tong Zhi-xue, Hao Bo-yang
Issue Date:October 2010
pp. 717-719
To realize remote debugging and diagnostics of the motion controller over Ethernet, equipment with the SIMOTION motion control system is applied. The application of virtual private network technology, combined with SCOUT tool, a remote control through...
Using Lossless Data Compression in Data Storage Systems: Not for Saving Space
Found in: IEEE Transactions on Computers
By Ningde Xie, Guiqiang Dong, Tong Zhang
Issue Date:March 2011
pp. 335-345
Lossless data compression for data storage has become less popular as mass data storage systems are becoming increasingly cheap. This leaves many files stored on mass data storage media uncompressed although they are losslessly compressible. This paper pro...
In-Situ Test of Pile-Soil Stress Ratio of CFG Pile-Net Composite Foundation in Beijing-Shanghai High-Speed Railway
Found in: Intelligent Computation Technology and Automation, International Conference on
By Jun-cheng Zeng, Ji-wen Zhang, Yong-ming Tu, Xiao-dong Tong
Issue Date:October 2009
pp. 638-640
Pile-soil stress ratio is one of the important parameters in the design of high-speed railway composite foundations. The full-scale in-situ test in the trial section of the Beijing-Shanghai high-speed railway by adopting composite foundation of CFG pile-net (geogrid) w...
Track Down HW Function Faults Using Real SW Invariants
Found in: Computer Science and Information Engineering, World Congress on
By Yansong Zheng, Dong Tong, Hao Li, Keyi Wang, Xu Cheng
Issue Date:April 2009
pp. 248-253
System-level functional verification by running a real software stack on an FPGA prototype is essential for achieving a high-quality design. But it is hard to find the exact source of hardware function faults while running large closed-source system software fa...
Semantic-based Data Source Discovery for DAI
Found in: Computer and Computational Sciences, International Multi-Symposiums on
By Guoqing Dong, Tong Weiqin
Issue Date:August 2007
pp. 253-257
The problem of data source discovery is a key issue in a data access and integration (DAI) system in the grid environment. This problem involves assigning data sources to tasks in order to satisfy task requirements and data source policies. These requireme...
Improving system throughput and fairness simultaneously in shared memory CMP systems via Dynamic Bank Partitioning
Found in: 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)
By Mingli Xie, Dong Tong, Kan Huang, Xu Cheng
Issue Date:February 2014
pp. 344-355
Applications running concurrently in CMP systems interfere with each other at DRAM memory, leading to poor system performance and fairness. Memory access scheduling reorders memory requests to improve system throughput and fairness. However, it cannot reso...
Energy-efficient branch prediction with Compiler-guided History Stack
Found in: 2012 Design, Automation & Test in Europe Conference & Exhibition (DATE 2012)
By Mingxing Tan, Xianhua Liu, Zichao Xie, Dong Tong, Xu Cheng
Issue Date:March 2012
pp. 449-454
Branch prediction is critical in exploiting instruction-level parallelism for modern processors. Previous aggressive branch predictors generally require a significant amount of hardware storage and complexity to pursue high prediction accuracy. This paper pro...
SPIRE: improving dynamic binary translation through SPC-indexed indirect branch redirecting
Found in: Proceedings of the 9th ACM SIGPLAN/SIGOPS international conference on Virtual execution environments (VEE '13)
By Chun Yang, Dong Tong, Jing Wang, Keyi Wang, Ning Jia
Issue Date:March 2013
pp. 1-12
A dynamic binary translation system must perform an address translation for every execution of an indirect branch instruction. The procedure to convert Source binary Program Counter (SPC) address to Translated Program Counter (TPC) address always takes more th...
Optimal bypass monitor for high performance last-level caches
Found in: Proceedings of the 21st international conference on Parallel architectures and compilation techniques (PACT '12)
By Dong Tong, Junlin Lu, Lingda Li, Xu Cheng, Zichao Xie
Issue Date:September 2012
pp. 315-324
In the last-level cache, a large number of blocks have reuse distances greater than the available cache capacity. Cache performance and efficiency can be improved if some subset of these distant reuse blocks can reside in the cache longer. The bypass techni...
FEMU: a firmware-based emulation framework for SoC verification
Found in: Proceedings of the eighth IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis (CODES/ISSS '10)
By Dong Tong, Hao Li, Kan Huang, Xu Cheng
Issue Date:October 2010
pp. 257-266
Full-system emulation on FPGA (Field-Programmable Gate Array) with real-world workloads can enhance the confidence of SoC (System-on-Chip) design. However, since FPGA emulation requires complete implementation of key modules and provides weak visibility, it ...
FPGA prototyping of an amba-based windows-compatible SoC
Found in: Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays (FPGA '10)
By Dong Tong, Hao Li, Jiufeng Pang, Junlin Lu, Kan Huang, Xu Cheng, Yansong Zheng
Issue Date:February 2010
pp. 13-22
For the increasing market of smart phones, mobile internet devices, and ultra-mobile PCs, mainstream vendors propose two approaches: one is based on ARM SoC, and the other is based on power-efficient x86 processor. However, either approach has its own limi...
Clock domain crossing fault model and coverage metric for validation of SoC design
Found in: Proceedings of the conference on Design, automation and test in Europe (DATE '07)
By Dong Tong, Xu Cheng, Yi Feng, Zheng Zhou
Issue Date:April 2007
pp. 1385-1390
Multiple asynchronous clock domains have been increasingly employed in System-on-Chip (SoC) designs for different I/O interfaces. Functional validation is one of the most expensive tasks in the SoC design process. Simulation on register transfer level (RTL...