Clumsy Flow Control for High-Throughput Bufferless On-Chip Networks
Found in: IEEE Computer Architecture Letters
By Hanjoon Kim,Yonggon Kim,John Kim
Issue Date:July 2013
pp. 47-50
Bufferless on-chip networks are an alternative type of on-chip network organization that can improve the cost-efficiency of an on-chip network by removing router input buffers. However, bufferless on-chip network performance degrades at high load because o...
 
Scalable on-chip network in power constrained manycore processors
Found in: 2012 International Green Computing Conference (IGCC)
By Hanjoon Kim,Gwangsun Kim,John Kim
Issue Date:June 2012
pp. 1-2
While much research has been done using 2D mesh network as a baseline on-chip network topology, recent multi-core chips from vendors leverage a ring topology. In this work, we re-visit the topology comparison in on-chip networks and model the impact of on-...
 
Design of Interconnection Networks
Found in: High-Performance Interconnects, Symposium on
By Dennis Abts, John Kim
Issue Date:August 2007
pp. 12
Digital systems of all types are rapidly becoming communication limited. Movement of data, not arithmetic or control logic, is the factor limiting cost, performance, size, and power in these systems. Historically used only in high-end supercomputers and te...
   
Transportation-network-inspired network-on-chip
Found in: 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)
By Hanjoon Kim,Gwangsun Kim,Seungryoul Maeng,Hwasoo Yeo,John Kim
Issue Date:February 2014
pp. 332-343
A cost-efficient network-on-chip is needed in scalable many-core systems. Recent multicore processors have leveraged a ring topology and hierarchical ring can increase scalability but presents different challenges, including higher hop count and global r...
   
Memory-centric system interconnect design with Hybrid Memory Cubes
Found in: 2013 22nd International Conference on Parallel Architectures and Compilation Techniques (PACT)
By Gwangsun Kim,John Kim,Jung Ho Ahn,Jaeha Kim
Issue Date:September 2013
pp. 145-155
Memory bandwidth has been one of the most critical system performance bottlenecks. As a result, the HMC (Hybrid Memory Cube) has recently been proposed to improve DRAM bandwidth as well as energy efficiency. In this paper, we explore different system inter...
   
Low-Overhead Network-on-Chip Support for Location-Oblivious Task Placement
Found in: IEEE Transactions on Computers
By Gwangsun Kim,Michael Mihn-Jong Lee,John Kim,Jae W. Lee,Dennis Abts,Michael Marty
Issue Date:June 2014
pp. 1487-1500
Many-core processors will have many processing cores with a network-on-chip (NoC) that provides access to shared resources such as main memory and on-chip caches. However, locally-fair arbitration in multi-stage NoC can lead to globally unfair access to sh...
   
Providing cost-effective on-chip network bandwidth in GPGPUs
Found in: 2012 IEEE 30th International Conference on Computer Design (ICCD 2012)
By Hanjoon Kim,John Kim,Woong Seo,Yeongon Cho,Soojung Ryu
Issue Date:September 2012
pp. 407-412
Network-on-chip (NoC) bandwidth has a significant impact on overall performance in throughput-oriented processors such as GPGPUs. Although it has been commonly assumed that high NoC bandwidth can be provided through abundant on-chip wires, we show that in...
 
An Alternative Memory Access Scheduling in Manycore Accelerators
Found in: Parallel Architectures and Compilation Techniques, International Conference on
By Yonggon Kim,Hyunseok Lee,John Kim
Issue Date:October 2011
pp. 195-196
Memory controllers in graphics processing units (GPU) often employ out-of-order scheduling to maximize row access locality. However, this requires complex logic to enable out-of-order scheduling compared with in-order scheduling. To provide a low-cost and ...
 
Exploiting Mutual Awareness between Prefetchers and On-chip Networks in Multi-cores
Found in: Parallel Architectures and Compilation Techniques, International Conference on
By Junghoon Lee,Minjeong Shin,Hanjoon Kim,John Kim,Jaehyuk Huh
Issue Date:October 2011
pp. 177-178
The unique characteristics of prefetch traffic have not been considered in on-chip network design for multicore architectures. Most prefetchers are often oblivious to the network congestion when generating prefetch requests. In this work, we investigate t...
 
On-Chip Network Evaluation Framework
Found in: SC Conference
By Hanjoon Kim, Seulki Heo, Junghoon Lee, Jaehyuk Huh, John Kim
Issue Date:November 2010
pp. 10
With the number of cores on a chip continuing to increase, proper evaluation of on-chip network is critical for not only network performance but also overall system performance. In this paper, we show how a network-only simulation can be limited as it does...
 
Cost-Efficient Dragonfly Topology for Large-Scale Systems
Found in: IEEE Micro
By John Kim, William Dally, Steve Scott, Dennis Abts
Issue Date:January 2009
pp. 33-40
It is more efficient to use increasing pin bandwidth by creating high-radix routers with a large number of narrow ports instead of low-radix routers with fewer wide ports. Building networks using high-radix routers lowers cost and improves perform...
 
Technology-Driven, Highly-Scalable Dragonfly Topology
Found in: Computer Architecture, International Symposium on
By John Kim, William J. Dally, Steve Scott, Dennis Abts
Issue Date:June 2008
pp. 77-88
Evolving technology and increasing pin-bandwidth motivate the use of high-radix routers to reduce the diameter, latency, and cost of interconnection networks. High-radix networks, however, require longer cables than their low-radix counterparts. Because ca...
 
Mutually Aware Prefetcher and On-Chip Network Designs for Multi-Cores
Found in: IEEE Transactions on Computers
By Junghoon Lee,Hanjoon Kim,Minjeong Shin,John Kim,Jaehyuk Huh
Issue Date:September 2014
pp. 2316-2329
Hardware prefetching has become an essential technique in high performance processors to hide long external memory latencies. In multi-core architectures with cores communicating through a shared on-chip network, traffic generated by the prefetchers can ac...
 
Network within a network approach to create a scalable high-radix router microarchitecture
Found in: High-Performance Computer Architecture, International Symposium on
By Jung Ho Ahn,Sungwoo Choo,John Kim
Issue Date:February 2012
pp. 1-12
Cost-efficient networks are critical in creating scalable large-scale systems, including those found in supercomputers and datacenters. High-radix routers reduce network cost by lowering the network diameter while providing a high bisection bandwidth and p...
 
Leveraging torus topology with deadlock recovery for cost-efficient on-chip network
Found in: Computer Design, International Conference on
By Minjeong Shin,John Kim
Issue Date:October 2011
pp. 25-30
On-chip networks are becoming more important as the number of on-chip components continues to increase. 2D mesh topology is a commonly assumed topology for on-chip networks but in this work, we make the argument that 2D torus can provide a more cost-efficie...
 
Probabilistic Distance-Based Arbitration: Providing Equality of Service for Many-Core CMPs
Found in: Microarchitecture, IEEE/ACM International Symposium on
By Michael M. Lee, John Kim, Dennis Abts, Michael Marty, Jae W. Lee
Issue Date:December 2010
pp. 509-519
Emerging many-core chip multiprocessors will integrate dozens of small processing cores with an on-chip interconnect consisting of point-to-point links. The interconnect enables the processing cores to not only communicate, but to share common resources su...
 
Throughput-Effective On-Chip Networks for Manycore Accelerators
Found in: Microarchitecture, IEEE/ACM International Symposium on
By Ali Bakhoda, John Kim, Tor M. Aamodt
Issue Date:December 2010
pp. 421-432
As the number of cores and threads in many core compute accelerators such as Graphics Processing Units (GPU) increases, so does the importance of on-chip interconnection network design. This paper explores throughput-effective network-on-chips (NoC) for fu...
 
Exploring concentration and channel slicing in on-chip network router
Found in: Networks-on-Chip, International Symposium on
By Prabhat Kumar, Yan Pan, John Kim, Gokhan Memik, Alok Choudhary
Issue Date:May 2009
pp. 276-285
Sharing on-chip network resources efficiently is critical in the design of a cost-efficient network on-chip (NoC). Concentration has been proposed for on-chip networks but the trade-off in concentration implementation and performance has not been well unde...
 
Flattened Butterfly Topology for On-Chip Networks
Found in: Microarchitecture, IEEE/ACM International Symposium on
By John Kim, James Balfour, William Dally
Issue Date:December 2007
pp. 172-182
With the trend towards increasing number of cores in chip multiprocessors, the on-chip interconnect that connects the cores needs to scale efficiently. In this work, we propose the use of high-radix networks in on-chip interconnection networks and descri...
 
Flattened Butterfly Topology for On-Chip Networks
Found in: IEEE Computer Architecture Letters
By John Kim, James Balfour, William J. Dally
Issue Date:July 2007
pp. 37-40
With the trend towards increasing number of cores in multicore processors, the on-chip network that connects the cores needs to scale efficiently. In this work, we propose the use of high-radix networks in on-chip networks and describe how the flattened ...
 
Integrated Information Systems for Highway Safety Management: Conceptual Design for Interoperability
Found in: Multimedia and Ubiquitous Engineering, International Conference on
By Seongkwan Mark Lee, Sung-Gheel Jang, Tschangho John Kim, Seunglim Kang
Issue Date:April 2007
pp. 518-523
The Korea National Police Agency, Korea Highway Corporation, Korea Regional Construction Agencies and the Seoul Metropolitan Government have developed their own highway and traffic safety information management systems. The critical problem is tha...
 
Adaptive Routing in High-Radix Clos Network
Found in: SC Conference
By John Kim, William J. Dally, Dennis Abts
Issue Date:November 2006
pp. 7
Recent increases in the pin bandwidth of integrated circuits has motivated an increase in the degree or radix of interconnection network routers. The folded-Clos network can take advantage of these high-radix routers and this paper investigates adaptive rou...
 
The BlackWidow High-Radix Clos Network
Found in: Computer Architecture, International Symposium on
By Steve Scott, Dennis Abts, John Kim, William J. Dally
Issue Date:June 2006
pp. 16-28
This paper describes the radix-64 folded-Clos network of the Cray BlackWidow scalable vector multiprocessor. We describe the BlackWidow network which scales to 32K processors with a worst-case diameter of seven hops, and the underlying high-radix router mic...
 
Microarchitecture of a High-Radix Router
Found in: Computer Architecture, International Symposium on
By John Kim, William J. Dally, Brian Towles, Amit K. Gupta
Issue Date:June 2005
pp. 420-431
Evolving semiconductor and circuit technology has greatly increased the pin bandwidth available to a router chip. In the early 90s, routers were limited to 10Gb/s of pin bandwidth. Today 1Tb/s is feasible, and we expect 20Tb/s of I/O bandwidth by 2010. A h...
 
Active Drag Reduction using Neural Networks
Found in: Neural Networks for Identification, Control, and Robotics, International Workshop
By David Babcock, Bhusan Gupta, Rodney Goodman, Changhoon Lee, John Kim
Issue Date:August 1996
pp. 0279
This paper presents the application of a neural network controller to the problem of active drag reduction in a fully turbulent 3D fluid flow regime. The neural network learns a function nearly identical to an analytically derived control law. We...
 
Router microarchitecture and scalability of ring topology in on-chip networks
Found in: Proceedings of the 2nd International Workshop on Network on Chip Architectures (NoCArc '09)
By Hanjoon Kim, John Kim
Issue Date:December 2009
pp. 5-10
On-chip networks are critical to the scaling of future multicore processors. Recent multicore processors have adopted ring topologies because of their simplicity and high bandwidth. In this paper, we first describe a bufferless router microarchitecture for a...
     
Innovative practices session 3C: Solving today's test challenges
Found in: 2014 IEEE 32nd VLSI Test Symposium (VTS)
By John Kim,Wolfgang Meyer,T. M. Mak,Amitava Majumdar
Issue Date:April 2014
pp. 1
Test vehicles are commonly used to understand the characteristics of a new process node. The ability to precisely identify and isolate defects is a key requirement during yield learning on these vehicles. Efficiently utilizing the fanouts in a design is cr...
   
Improving GPGPU resource utilization through alternative thread block scheduling
Found in: 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)
By Minseok Lee,Seokwoo Song,Joosik Moon,John Kim,Woong Seo,Yeongon Cho,Soojung Ryu
Issue Date:February 2014
pp. 260-271
High performance in GPGPU workloads is obtained by maximizing parallelism and fully utilizing the available resources. The thousands of threads are assigned to each core in units of CTA (Cooperative Thread Arrays) or thread blocks - with each thread block ...
   
A detailed and flexible cycle-accurate Network-on-Chip simulator
Found in: 2013 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)
By Nan Jiang,James Balfour,Daniel U. Becker,Brian Towles,William J. Dally,George Michelogiannakis,John Kim
Issue Date:April 2013
pp. 86-96
Network-on-Chips (NoCs) are becoming integral parts of modern microprocessors as the number of cores and modules integrated on a single chip continues to increase. Research and development of future NoC technology relies on accurate modeling and simulation...
   
TalkBetter: family-driven mobile intervention care for children with language delay
Found in: Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing (CSCW '14)
By Chanyou Hwang, Chulhong Min, Chungkuk Yoo, Dongsun Yim, Inseok Hwang, John Kim, Junehwa Song, Youngki Lee
Issue Date:February 2014
pp. 1283-1296
Language delay is a developmental problem of children who do not acquire language as expected for their chronological ages. Without timely intervention, language delay can act as a lifelong risk factor. Speech-language pathologists highlight that effective...
     
Scalable high-radix router microarchitecture using a network switch organization
Found in: ACM Transactions on Architecture and Code Optimization (TACO)
By John Kim, Jung Ho Ahn, Young Hoon Son
Issue Date:September 2013
pp. 1-25
As the system size of supercomputers and datacenters increases, cost-efficient networks become critical in achieving good scalability on those systems. High-radix routers reduce network cost by lowering the network diameter while providing a high bisection...
     
Designing on-chip networks for throughput accelerators
Found in: ACM Transactions on Architecture and Code Optimization (TACO)
By Ali Bakhoda, John Kim, Tor M. Aamodt
Issue Date:September 2013
pp. 1-35
As the number of cores and threads in throughput accelerators such as Graphics Processing Units (GPU) increases, so does the importance of on-chip interconnection network design. This article explores throughput-effective Network-on-Chips (NoC) for future ...
     
What makes users rate (share, tag, edit...)?: predicting patterns of participation in online communities
Found in: Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work (CSCW '12)
By John Kim, Loren Terveen, Mark Snyder, Paul Fugelstad, Cleila Anna Mannino, Jennifer Filson Moses, Patrick Dwyer
Issue Date:February 2012
pp. 969-978
Administrators of online communities face the crucial issue of understanding and developing their user communities. Will new users become committed members? What types of roles are particular individuals most likely to take on? We report on a study that in...
     
FeatherWeight: low-cost optical arbitration with QoS support
Found in: Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-44 '11)
By John Kim, Gokhan Memik, Yan Pan
Issue Date:December 2011
pp. 105-116
The nanophotonic signaling technology enables efficient global communication and low-diameter networks such as crossbars that are often optically arbitrated. However, existing optical arbitration schemes incur costly overheads (e.g., waveguides, laser powe...
     
Approximating age-based arbitration in on-chip networks
Found in: Proceedings of the 19th international conference on Parallel architectures and compilation techniques (PACT '10)
By Dennis Abts, Jae W. Lee, John Kim, Michael M. Lee, Michael Marty
Issue Date:September 2010
pp. 575-576
The on-chip network of emerging many-core CMPs enables the sharing of numerous on-chip components. This on-chip network needs to ensure fairness when accessing the shared resources. In this work, we propose providing equality of service (EoS) in future man...
     
On-chip network design considerations for compute accelerators
Found in: Proceedings of the 19th international conference on Parallel architectures and compilation techniques (PACT '10)
By Ali Bakhoda, John Kim, Tor M. Aamodt
Issue Date:September 2010
pp. 535-536
There has been little work investigating the overall performance impact of on-chip communication in manycore compute accelerators. In this paper we evaluate performance of a GPU-like compute accelerator running CUDA workloads and consisting of compute node...
     
Low-cost router microarchitecture for on-chip networks
Found in: Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (Micro-42)
By John Kim
Issue Date:December 2009
pp. 255-266
On-chip networks are critical to the scaling of future multi-core processors. The challenge for on-chip network is to reduce the cost including power consumption and area while providing high performance such as low latency and high bandwidth. Although muc...
     
Achieving predictable performance through better memory controller placement in many-core CMPs
Found in: Proceedings of the 36th annual international symposium on Computer architecture (ISCA '09)
By Dan Gibson, Dennis Abts, John Kim, Mikko H. Lipasti, Natalie D. Enright Jerger
Issue Date:June 2009
pp. 70-73
In the near term, Moore's law will continue to provide an increasing number of transistors and therefore an increasing number of on-chip cores. Limited pin bandwidth prevents the integration of a large number of memory controllers on-chip. With many cores,...
     
Firefly: illuminating future network-on-chip with nanophotonics
Found in: Proceedings of the 36th annual international symposium on Computer architecture (ISCA '09)
By Alok Choudhary, Gokhan Memik, John Kim, Prabhat Kumar, Yan Pan, Yu Zhang
Issue Date:June 2009
pp. 70-73
Future many-core processors will require high-performance yet energy-efficient on-chip networks to provide a communication substrate for the increasing number of cores. Recent advances in silicon nanophotonics create new opportunities for on-chip networks....
     
Indirect adaptive routing on large scale interconnection networks
Found in: Proceedings of the 36th annual international symposium on Computer architecture (ISCA '09)
By John Kim, Nan Jiang, William J. Dally
Issue Date:June 2009
pp. 70-73
Recently proposed high-radix interconnection networks [10] require global adaptive routing to achieve optimum performance. Existing direct adaptive routing methods are slow to sense congestion remote from the source router and hence misroute many packets b...
     
Flattened butterfly: a cost-efficient topology for high-radix networks
Found in: Proceedings of the 34th annual international symposium on Computer architecture (ISCA '07)
By Dennis Abts, John Kim, William J. Dally
Issue Date:June 2007
pp. 126-137
Increasing integrated-circuit pin bandwidth has motivated a corresponding increase in the degree or radix of interconnection networks and their routers. This paper introduces the flattened butterfly, a cost-efficient topology for high-radix networks. On beni...
     