Wednesday, June 17th 2020

AMD Confirms CDNA-Based Radeon Instinct MI100 Coming to HPC Workloads in 2H2020

Mark Papermaster, chief technology officer and executive vice president of Technology and Engineering at AMD, today confirmed that CDNA is on track for release in 2H2020 for HPC workloads. The confirmation was (fittingly) given during Dell EMC's High-Performance Computing Online event. This confirms that AMD is looking at a busy second half of the year, with the Zen 3, RDNA 2 and CDNA product lines all being pushed to market.

CDNA is AMD's next push into the highly lucrative HPC market, and will see the company splitting its GPU architectures by target market. CDNA will see raster graphics hardware, display and multimedia engines, and other associated components removed from the chip design in a bid to recoup die area for both more processing units and fixed-function tensor compute hardware. The CDNA-based Radeon Instinct MI100 will be fabricated on TSMC's 7 nm node, and will be the first AMD architecture featuring shared memory pools between CPUs and GPUs via 2nd-gen Infinity Fabric, which should bring both throughput and power consumption improvements to the platform.
Sources: Hassan Mujtaba @ Twitter, via VideoCardz

13 Comments on AMD Confirms CDNA-Based Radeon Instinct MI100 Coming to HPC Workloads in 2H2020

#2
Fouquin
cucker tarlson
isn't cdna gcn based ?
Should be. Distilled down and rebuilt to amplify FP compute.
#3
Aldain
cucker tarlson
isn't cdna gcn based ?
no
#4
xkm1948
I pity whoever is gonna write software for these. OpenCL? Vulkan Compute?
#5
ARF
xkm1948
I pity whoever is gonna write software for these. OpenCL? Vulkan Compute?
:confused:

If Arcturus MI100 turns out to be a beast, I guess developers will fight each other over who gets to code for it...

Specs please ?
#6
xkm1948
ARF
:confused:

If Arcturus MI100 turns out to be a beast, I guess developers will fight each other over who gets to code for it...

Specs please ?
Doesn’t work like that. You need a full ecosystem of hardware and software for this kind of GPU-accelerated computing. Very few software developers and end users will use it if it requires deep investment in close-to-the-metal programming. CUDA is successful because Nvidia takes huge effort in polishing the low level software foundation, making it effortless for developers to work on without being crippled by weird driver bugs

OpenCL is pretty broken so far with ROCm. Not sure about Vulkan compute.

Hopefully they find some good use for these GPUs
#7
Aquinus
Resident Wat-man
xkm1948
CUDA is successful because Nvidia takes huge effort in polishing the low level software foundation, making it effortless for developers to work on without being crippled by weird driver bugs
Would you say that's mostly due to poor documentation on AMD's part?
#8
xkm1948
Aquinus
Would you say that's mostly due to poor documentation on AMD's part?
Generally, a lack of investment on the software side
#9
rvalencia
cucker tarlson
isn't cdna gcn based ?
GCN has inferior branch and instruction retirement latency performance when compared to RDNA.
#10
Cheeseball
Not a Potato
Aquinus
Would you say that's mostly due to poor documentation on AMD's part?
It's more a lack of investment, but I believe this is because they're a smaller company. For example, while AMD does send sales engineers over to promote their products (which we use in the PDL and HCII), we don't get as much support from them compared to NVIDIA, who does send channel reps (basically NVIDIA's developers) to assist with some projects.

AMD needs to invest more time and money into supporting ROCm (and OpenCL).
#11
cucker tarlson
rvalencia
GCN has inferior branch and instruction retirement latency performance when compared to RDNA.
but better compute numbers
#12
1d10t
IMO, in a workstation environment, Radeon Pro can hold its own. Same as its desktop counterpart: it doesn't give the highest performance, but it offers the best bang for the buck. There's also a whole lot of community support out there providing patches or just workarounds. AMD should put in a major effort, beyond just a framework and community support, for CDNA to really take off.
#13
rvalencia
cucker tarlson
but better compute numbers
Per CU, FLOPS is the same. In software raytracing via compute, such as Crytek's raytracing demo, Navi 10 beats the Radeon VII.
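
A rough sketch of the per-CU claim, assuming the usual theoretical peak formula (FP32 lanes × 2 FLOPs per fused multiply-add × clock). The SIMD layouts are the commonly documented ones (GCN: 4 × SIMD16 per CU, RDNA: 2 × SIMD32 per CU); the function name and the 1.5 GHz clock are illustrative assumptions, not figures from this thread.

```python
def peak_fp32_gflops(compute_units: int, simds_per_cu: int,
                     simd_width: int, clock_ghz: float) -> float:
    """Theoretical peak FP32 rate, assuming one FMA (2 FLOPs) per lane per clock."""
    lanes = compute_units * simds_per_cu * simd_width
    return lanes * 2 * clock_ghz

# Hypothetical equal clock for both architectures, chosen only for illustration.
CLOCK_GHZ = 1.5

gcn_per_cu = peak_fp32_gflops(1, simds_per_cu=4, simd_width=16, clock_ghz=CLOCK_GHZ)
rdna_per_cu = peak_fp32_gflops(1, simds_per_cu=2, simd_width=32, clock_ghz=CLOCK_GHZ)

print(f"GCN  per CU: {gcn_per_cu:.0f} GFLOPS")
print(f"RDNA per CU: {rdna_per_cu:.0f} GFLOPS")
```

Both layouts come to 64 FP32 lanes per CU, so at equal clocks the theoretical peak is identical; any real-world difference comes from how well those lanes are kept busy.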

Both RDNA and GCN execute wave64 compute.

Read Amd/comments/ctfbem
Figure 3 (bottom of page 5) shows 4 lines of shader instructions being executed in GCN, vs RDNA in Wave32 or “backwards compatible” Wave64.
Vega takes 12 cycles to complete the instruction on a GCN SIMD. Navi in Wave32 (optimized code) completes it in 7 cycles.
In backwards compatible (optimized for GCN Wave64) mode, Navi completes it in 8 cycles.
So even on code optimized for GCN, Navi is faster, but more performance can be extracted by optimizing for Navi.
Lower latency, and no wasted clock cycles.
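
To put the quoted cycle counts side by side, here is a minimal sketch that turns them into relative speedups. The cycle numbers are the ones cited above for that four-instruction example; the helper name is made up for illustration, and the ratios say nothing about overall game or compute performance.

```python
# Cycle counts quoted above for the same four-instruction shader snippet.
VEGA_CYCLES = 12          # GCN SIMD
NAVI_WAVE32_CYCLES = 7    # RDNA, code optimized for Wave32
NAVI_WAVE64_CYCLES = 8    # RDNA, "backwards compatible" Wave64 mode

def speedup(baseline_cycles: int, new_cycles: int) -> float:
    """Ratio of baseline to new cycle count for the same work."""
    return baseline_cycles / new_cycles

print(f"Navi Wave32 vs Vega: {speedup(VEGA_CYCLES, NAVI_WAVE32_CYCLES):.2f}x")
print(f"Navi Wave64 vs Vega: {speedup(VEGA_CYCLES, NAVI_WAVE64_CYCLES):.2f}x")
```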


GCN parts such as "Vega 20" support 64-bit FP.

RDNA still executes the GCN instruction set with less latency.