HSA Foundation: Unlocking the Performance and Power Efficiency of Parallel Computing Engines
by Dr. John Glossner, President, HSA Foundation

Welcome to HSA Connections! On behalf of the Heterogeneous System Architecture (HSA) Foundation, I first want to thank IEEE for the opportunity to launch this blog. We appreciate being able to share HSA achievements with both the developer community and IEEE’s diverse and prestigious global audience.

Our goal is to let developers use the hardware resources in today’s complex systems-on-chip (SoCs) more easily and efficiently. This will enable applications to run faster and at lower power across a wide range of computing platforms spanning mobile devices, desktops, high-performance computing (HPC) systems and servers.

So let’s get started and drill down a bit.

Compute functions in today’s devices generally fall into a few categories: general purpose processing, graphics processing, and accelerators. In HSA terminology, general purpose processors are called “hosts” and compute processors are called “agents”. Delivering the compute power these tasks need has driven some of the most significant processor developments of our time, including multicore processors, graphics processors, and application specific processors such as DSPs, Image Signal Processors (ISPs), FPGAs, etc.

Still, as data rates accelerate, the ability to bring more compute functions and ultra-high definition graphics into consumer systems demands ever higher capabilities.

And it’s not always as simple as popping a new processor into the system.

Today’s general purpose CPUs and graphics processors deliver astronomical computation rates, some on the order of teraflops. With the addition of FPGAs and application specific processors, even higher rates can be achieved.

Now, the bottlenecks are not in the processors, but in the management software that synchronizes compute functions between processors and other agents in the system.

Over the years, a small number of entities have responded with improved management algorithms and specially designed firmware that directs and clocks data and graphics computations on the fly and in parallel to improve overall efficiency. The issue is that most of these solutions are portable only at the API level. The cross-vendor HSA specification improves upon that model by offering a virtual instruction set capable of being executed by any agent.

Another key challenge in heterogeneous computing is related to memory organization and the need to copy data structures between various local memories.

Now, the HSA Foundation (HSAF) is making further contributions to heterogeneous systems by focusing on software that makes the programming of these systems common across all platforms.

The HSA specification (version 1.0, released March 2015) addresses this problem while working toward royalty-free industry standards for heterogeneous computing.

The HSA specification covers virtual memory, memory coherency, architected dispatch mechanisms, and power-efficient signals. The architecture is designed to reduce the overhead and latency of dispatching work to the accelerator. The design allows high-level compilers, data-parallel runtimes and managed runtimes to target the accelerator hardware directly, without the translation steps typically needed to go through a high-level API at dispatch time. The architecture also allows compute kernels running on the accelerator to efficiently call back to the host for OS services such as file I/O, networking and similar functions that typically would not be available. This allows the accelerator to operate as a true peer of the host CPU.
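To make the dispatch idea more concrete, here is a minimal toy sketch in Python. It is not the HSA runtime API and does not reflect how any vendor implements it. Every name in it (Signal, DispatchPacket, UserModeQueue, scale_kernel, demo) is hypothetical and chosen only for illustration. A thread stands in for the agent, a Python list stands in for memory shared between host and agent, and a condition variable stands in for a signal. What it tries to show is the shape of the flow described above: the host writes a packet to a queue, rings a doorbell, and waits on a completion signal, while the agent works on the host's buffer in place rather than on a copy.

```python
import collections
import threading
from dataclasses import dataclass
from typing import Callable, List


class Signal:
    """Toy completion signal: an integer value that a waiter can block on."""

    def __init__(self, value: int) -> None:
        self._value = value
        self._cond = threading.Condition()

    def subtract(self, amount: int = 1) -> None:
        with self._cond:
            self._value -= amount
            self._cond.notify_all()

    def wait_until_zero(self, timeout: float = 5.0) -> bool:
        # Returns True if the value reached zero before the timeout expired.
        with self._cond:
            return self._cond.wait_for(lambda: self._value == 0, timeout)


@dataclass
class DispatchPacket:
    """Hypothetical packet: which kernel to run, on what data, and whom to notify."""
    kernel: Callable[[List[float], int], None]
    buffer: List[float]   # the same object the host holds, so nothing is copied
    grid_size: int        # number of work-items to run
    completion: Signal


class UserModeQueue:
    """Toy queue the host writes to directly; the agent drains it."""

    def __init__(self) -> None:
        self._packets = collections.deque()
        self._doorbell = threading.Condition()
        self._closed = False

    def enqueue(self, packet: DispatchPacket) -> None:
        with self._doorbell:
            self._packets.append(packet)
            self._doorbell.notify()   # "ring the doorbell"

    def close(self) -> None:
        with self._doorbell:
            self._closed = True
            self._doorbell.notify_all()

    def run_agent(self) -> None:
        """Agent loop: sleep until the doorbell rings, then execute packets."""
        while True:
            with self._doorbell:
                self._doorbell.wait_for(
                    lambda: bool(self._packets) or self._closed)
                if not self._packets:
                    return            # queue closed and fully drained
                packet = self._packets.popleft()
            for work_item in range(packet.grid_size):
                packet.kernel(packet.buffer, work_item)
            packet.completion.subtract(1)


def scale_kernel(data: List[float], i: int) -> None:
    """Example kernel: each work-item doubles one element in place."""
    data[i] *= 2.0


def demo():
    queue = UserModeQueue()
    agent = threading.Thread(target=queue.run_agent)
    agent.start()

    data = [1.0, 2.0, 3.0, 4.0]
    done = Signal(1)
    queue.enqueue(DispatchPacket(scale_kernel, data, len(data), done))
    finished = done.wait_until_zero()   # host blocks on the signal

    queue.close()
    agent.join()
    return finished, data


if __name__ == "__main__":
    print(demo())
```

In this sketch the host never calls into a driver layer to launch work; it only appends a packet and notifies the agent. That is the property the paragraph above is describing, shown here in a deliberately simplified form.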

To ensure that heterogeneous systems work optimally and in accordance with the specification, the HSAF provides a set of conformance tests that can be used for HSA certification.

To further extend these systems, the HSAF is working on creating guidelines for incorporating IP from multiple vendors into the same SoC, and much more.

Last October, we previewed several of our members’ plans for supporting HSA in their next-generation products. Products from AMD, ARM, Imagination Technologies and MediaTek will be the world’s first that are based on HSA.

Finally, in early November, the Foundation announced the publication of Heterogeneous System Architecture: A New Compute Platform Infrastructure (1st Edition), edited by Dr. Wen-Mei Hwu. The book, published by Elsevier Publishing, offers a practical guide to understanding HSA so developers can take advantage of the performance and power efficiency of parallel computing engines.

It’s our sincere hope that these developments, and our new HSA Foundation blog, will help usher in a new era of heterogeneous computing for greater efficiency and performance in all consumer and enterprise technologies.