Displaying 1-9 out of 9 total
Auto-Tuning GEMV on Many-Core GPU
Found in: 2012 IEEE 18th International Conference on Parallel and Distributed Systems (ICPADS)
By Weizhi Xu, Zhiyong Liu, Jun Wu, Xiaochun Ye, Shuai Jiao, Da Wang, Fenglong Song, Dongrui Fan
Issue Date: December 2012
pp. 30-36
GPUs provide powerful computing ability especially for data parallel algorithms. However, the complexity of the GPU system makes the optimization of even a simple algorithm difficult. Different parallel algorithms or optimization methods on a GPU often lea...
 
ALWP: A Workload Partition Method for the Efficient Parallel Simulation of Manycores
Found in: 2012 IEEE 14th Int'l Conf. on High Performance Computing and Communication (HPCC) & 2012 IEEE 9th Int'l Conf. on Embedded Software and Systems (ICESS)
By Shuai Jiao, Da Wang, Xiaochun Ye, Weizhi Xu, Hao Zhang, Ninghui Sun
Issue Date: June 2012
pp. 135-142
This paper addresses workload partition strategies for simulating many-core architectures. The key observation behind this paper is that, compared to multicore, manycore features more non-uniform memory access and unpredictable network traffic; these fe...
 
PartitionSim: A Parallel Simulator for Many-cores
Found in: 2012 IEEE 14th Int'l Conf. on High Performance Computing and Communication (HPCC) & 2012 IEEE 9th Int'l Conf. on Embedded Software and Systems (ICESS)
By Shuai Jiao, Da Wang, Xiaochun Ye, Weizhi Xu, Hao Zhang, Ninghui Sun
Issue Date: June 2012
pp. 119-126
This paper introduces PartitionSim, a parallel simulator for future thousand-core processors. The purpose of PartitionSim is to improve the simulation performance of many-core architectures at the cost of a small loss in accuracy. To achieve this goal,...
 
Godson-T: An Efficient Many-Core Processor Exploring Thread-Level Parallelism
Found in: IEEE Micro
By Dongrui Fan, Hao Zhang, Da Wang, Xiaochun Ye, Fenglong Song, Guojie Li, Ninghui Sun
Publication Date: April 2012
pp. N/A
Godson-T is a research many-core processor designed for parallel scientific computing. It delivers efficient performance and flexible programmability simultaneously. On the one hand, Godson-T has many features to achieve high efficiency for on-chip resourc...
 
Godson-T: An Efficient Many-Core Processor Exploring Thread-Level Parallelism
Found in: IEEE Micro
By Dongrui Fan, Hao Zhang, Da Wang, Xiaochun Ye, Fenglong Song, Guojie Li, Ninghui Sun
Issue Date: March 2012
pp. 38-47
Godson-T is a research many-core processor designed for parallel scientific computing that delivers efficient performance and flexible programmability simultaneously. It also has many features to achieve high efficiency for on-chip resource utilization, su...
 
A Fast Linear-Space Sequence Alignment Algorithm with Dynamic Parallelization Framework
Found in: Computer and Information Technology, International Conference on
By Xiaochun Ye, Dongrui Fan, Wei Lin
Issue Date: October 2009
pp. 274-279
Exact pairwise sequence alignment algorithms using dynamic programming require quadratic space and time, and this makes these algorithms impractical for large-scale sequences. In this paper, we propose and evaluate a new Anti-Diagonal based Parallel Linear...
 
A Low-Complexity Synchronization Based Cache Coherence Solution for Many Cores
Found in: Computer and Information Technology, International Conference on
By Wei Lin, DongRui Fan, He Huang, Nan Yuan, XiaoChun Ye
Issue Date: October 2009
pp. 69-75
Computer architectures make a dramatic turn away from improving single-processor performance towards improved parallel performance through integrating many cores in one chip. However, providing directory based coherence protocols for these platforms is too...
 
Efficient Parallelization of a Protein Sequence Comparison Algorithm on Manycore Architecture
Found in: Parallel and Distributed Computing Applications and Technologies, International Conference on
By Xiaochun Ye, Van Hoa Nguyen, Dominique Lavenier, Dongrui Fan
Issue Date: December 2008
pp. 167-170
This paper introduces the Godson-T manycore architecture and demonstrates the efficiency of its synchronization mechanism through a computation intensive bioinformatics application: the comparison of protein banks. The parallel part of the protein sequence...
 
A Path-Adaptive Opto-electronic Hybrid NoC for Chip Multi-processor
Found in: 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)
By Mingzhe Zhang, Da Wang, Xiaochun Ye, Liqiang He, Dongrui Fan, Zhiyong Liu
Issue Date: July 2013
pp. 1198-1205
The continuous development of manufacturing allows optical components to be integrated in a chip, which provides a feasible solution for communication between the cores in manycore processors. Considering the limitation of manufacture technology and the cha...
 