Discussion on Heterogeneous System Architecture (HSA)

Since their earliest days, computers have contained central processing units (CPUs) designed to run general programming tasks very well. But in the last couple of decades, mainstream computer systems typically include other processing elements as well. The most prevalent is the graphics processing unit (GPU), originally designed to perform specialized graphics computations in parallel. Over time, GPUs have become more powerful and more generalized, allowing them to be applied to general purpose parallel computing tasks with excellent power efficiency.

Today, a growing number of mainstream applications require the high performance and power efficiency achievable only through such highly parallel computation. But current CPUs and GPUs have been designed as separate processing elements and do not work together efficiently – and are cumbersome to program. Each has a separate memory space, requiring an application to explicitly copy data from CPU to GPU and then back again.
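That copy-in/copy-out pattern can be sketched in plain C++. Everything here (`DeviceBuffer`, `copy_to_device`, and so on) is a hypothetical stand-in for a real GPU API like CUDA or OpenCL, but the shape is the same: copy data over, launch, copy it back.

```cpp
#include <cassert>
#include <vector>

// Stand-in for discrete GPU memory: a separate address space that the CPU
// can only reach through explicit copies (normally driver/API calls).
struct DeviceBuffer {
    std::vector<float> mem;
};

// Hypothetical names, not a real API; real copies also cross the PCIe bus.
void copy_to_device(DeviceBuffer& d, const std::vector<float>& host) { d.mem = host; }
void launch_kernel(DeviceBuffer& d) {            // e.g. double every element
    for (float& x : d.mem) x *= 2.0f;
}
void copy_from_device(std::vector<float>& host, const DeviceBuffer& d) { host = d.mem; }

std::vector<float> run(std::vector<float> host) {
    DeviceBuffer dev;
    copy_to_device(dev, host);    // explicit copy #1: CPU -> GPU
    launch_kernel(dev);           // do the parallel work
    copy_from_device(host, dev);  // explicit copy #2: GPU -> CPU
    return host;
}
```

Both copies are pure overhead; HSA's unified address space is aimed at eliminating them entirely.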

A program running on the CPU queues work for the GPU using system calls through a device driver stack managed by a completely separate scheduler. This introduces significant dispatch latency, with overhead that makes offloading worthwhile only when the application has a very large amount of parallel computation. Further, a program running on the GPU cannot directly generate work-items, either for itself or for the CPU. Today that is simply impossible!
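One of HSA's answers to this is user-mode queues: work queues that live in shared memory, so any agent (CPU or GPU) can enqueue packets directly without a system call through the driver. A minimal single-threaded sketch of the idea, with illustrative names only (the real HSA packet format is more involved):

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Toy user-mode work queue. The key property being illustrated: a running
// task can enqueue follow-up work for itself directly, with no driver call.
struct WorkQueue {
    std::deque<std::function<void(WorkQueue&)>> tasks;

    void enqueue(std::function<void(WorkQueue&)> t) {
        tasks.push_back(std::move(t));
    }

    int drain() {                         // run until empty; count tasks executed
        int executed = 0;
        while (!tasks.empty()) {
            auto task = std::move(tasks.front());
            tasks.pop_front();
            task(*this);                  // a task may generate new work-items
            ++executed;
        }
        return executed;
    }
};
```

With today's driver-mediated model, the "enqueue from inside a task" step is exactly what a GPU program cannot do.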

To fully exploit the capabilities of parallel execution units, it is essential for computer system designers to think differently. The designers must re-architect computer systems to tightly integrate the disparate compute elements on a platform into an evolved central processor while providing a programming path that does not require fundamental changes for software developers. This is the primary goal of the new HSA design.

HSA creates an improved processor design that exposes the benefits and capabilities of mainstream programmable compute elements, working together seamlessly. With HSA, applications can create data structures in a single unified address space and can initiate work items on the hardware most appropriate for a given task. Sharing data between compute elements is as simple as sending a pointer. Multiple compute tasks can work on the same coherent memory regions, utilizing barriers and atomic memory operations as needed to maintain data synchronization (just as multi-core CPUs do today).
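The "share a pointer, synchronize with atomics" model can be approximated today with CPU threads, which is roughly how HSA intends CPU and GPU agents to cooperate. In this sketch two workers stand in for two agents: both operate on the same buffer through a plain pointer, and an atomic counter hands out indices so no data is ever copied.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Two "agents" sum one shared, coherent buffer. The atomic index dispenser
// keeps them synchronized, just as the post describes for multi-core CPUs.
long long shared_sum(const std::vector<int>& data) {
    std::atomic<std::size_t> next{0};
    std::atomic<long long> total{0};

    auto worker = [&]() {
        for (std::size_t i = next.fetch_add(1); i < data.size();
             i = next.fetch_add(1))
            total.fetch_add(data[i]);     // atomic op on shared memory
    };

    std::thread a(worker), b(worker);     // two agents, one memory region
    a.join();
    b.join();
    return total.load();
}
```

Under HSA the second worker could be the GPU, handed nothing more than the same pointer.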
Link.
AMD says that using the graphics core for heavy scalar floating-point work will become as easy as C++ programming. Link.
The HSA Foundation has companies like AMD, ARM, Imagination, MediaTek, Qualcomm, Samsung, and Texas Instruments as members backing it. AMD is pretty much at the helm of this. Link to HSA Foundation website.

AMD is releasing Kaveri and has its Berlin line of server processors coming out to take advantage of all this. As all of our TPU crunchers know, there is a lot of compute power sitting in GPUs. I find this technology and its possibilities exciting, and I am eagerly looking forward to benchmarks and reviews showcasing what it can do. I may be making this thread a little prematurely, but I would love to get a discussion going on HSA.