
AMD Announces the Radeon Instinct Family of Deep-Learning Accelerators

btarunr

Editor & Senior Moderator
AMD (NASDAQ: AMD) today unveiled its strategy to accelerate the machine intelligence era in server computing through a new suite of hardware and open-source software offerings designed to dramatically increase performance, efficiency, and ease of implementation of deep learning workloads. New Radeon Instinct accelerators will offer organizations powerful GPU-based solutions for deep learning inference and training. Along with the new hardware offerings, AMD announced MIOpen, a free, open-source library for GPU accelerators intended to enable high-performance machine intelligence implementations, and new, optimized deep learning frameworks on AMD's ROCm software to build the foundation of the next evolution of machine intelligence workloads.

Inexpensive high-capacity storage, an abundance of sensor driven data, and the exponential growth of user-generated content are driving exabytes of data globally. Recent advances in machine intelligence algorithms mapped to high-performance GPUs are enabling orders of magnitude acceleration of the processing and understanding of that data, producing insights in near real time. Radeon Instinct is a blueprint for an open software ecosystem for machine intelligence, helping to speed inference insights and algorithm training.

"Radeon Instinct is set to dramatically advance the pace of machine intelligence through an approach built on high-performance GPU accelerators, and free, open-source software in MIOpen and ROCm," said AMD President and CEO, Dr. Lisa Su. "With the combination of our high-performance compute and graphics capabilities and the strength of our multi-generational roadmap, we are the only company with the GPU and x86 silicon expertise to address the broad needs of the datacenter and help advance the proliferation of machine intelligence."

At the AMD Technology Summit held last week, customers and partners from 1026 Labs, Inventec, SuperMicro, University of Toronto's CHIME radio telescope project and Xilinx praised the launch of Radeon Instinct, discussed how they're making use of AMD's machine intelligence and deep learning technologies today, and how they can benefit from Radeon Instinct.

Radeon Instinct accelerators feature passive cooling, AMD MultiUser GPU (MxGPU) hardware virtualization technology conforming to the SR-IOV (Single Root I/O Virtualization) industry standard, and 64-bit PCIe addressing with Large Base Address Register (BAR) support for multi-GPU peer-to-peer communication.
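To make the SR-IOV angle concrete, here is a minimal sketch of how a host typically exposes virtual functions through the standard Linux sysfs interface before handing virtual GPUs to guests. This is generic kernel plumbing rather than AMD's own MxGPU provisioning tools, and the PCI address is a made-up placeholder.

```python
from pathlib import Path

# Hypothetical PCI address of an SR-IOV capable GPU; replace with the
# bus/device/function that `lspci` reports on your system.
GPU_BDF = "0000:03:00.0"
SYSFS_DEV = Path("/sys/bus/pci/devices") / GPU_BDF

def sriov_total_vfs() -> int:
    """Maximum number of virtual functions the device advertises."""
    return int((SYSFS_DEV / "sriov_totalvfs").read_text())

def enable_vfs(count: int) -> None:
    """Ask the kernel to create `count` virtual functions (requires root)."""
    (SYSFS_DEV / "sriov_numvfs").write_text(str(count))

if __name__ == "__main__":
    total = sriov_total_vfs()
    print(f"{GPU_BDF} advertises up to {total} virtual functions")
    # Each enabled VF then shows up as its own PCI device that a hypervisor
    # can pass through to a guest VM.
    enable_vfs(min(4, total))
```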

Radeon Instinct accelerators are designed to address a wide range of machine intelligence applications:
  • The Radeon Instinct MI6 accelerator based on the acclaimed Polaris GPU architecture will be a passively cooled inference accelerator optimized for jobs/second/Joule with 5.7 TFLOPS of peak FP16 performance at 150W board power and 16GB of GPU memory
  • The Radeon Instinct MI8 accelerator, harnessing the high-performance, energy-efficient "Fiji" Nano GPU, will be a small form factor HPC and inference accelerator with 8.2 TFLOPS of peak FP16 performance at less than 175W board power and 4GB of High-Bandwidth Memory (HBM); a back-of-the-envelope check of these throughput figures follows this list
  • The Radeon Instinct MI25 accelerator will use AMD's next-generation high-performance Vega GPU architecture and is designed for deep learning training, optimized for time-to-solution
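To put the headline figures in context, here is a quick back-of-the-envelope check of where peak-FP16 numbers like these come from. The shader counts and clocks below are assumptions based on the consumer Polaris 10 and Fiji Nano parts the MI6 and MI8 appear to reuse, not AMD-published Instinct specifications; theoretical peak is simply shaders × 2 FLOPs per clock (one fused multiply-add) × clock speed.

```python
def peak_tflops(shaders: int, clock_ghz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak = shaders * FLOPs per shader per clock * clock (GHz) / 1000."""
    return shaders * flops_per_clock * clock_ghz / 1000.0

# Assumed configurations (consumer Polaris 10 / Fiji Nano analogues).
mi6_fp16 = peak_tflops(shaders=2304, clock_ghz=1.237)  # ~5.7 TFLOPS
mi8_fp16 = peak_tflops(shaders=4096, clock_ghz=1.0)    # ~8.2 TFLOPS

print(f"MI6 peak FP16 ~ {mi6_fp16:.1f} TFLOPS")
print(f"MI8 peak FP16 ~ {mi8_fp16:.1f} TFLOPS")
# On Polaris and Fiji, FP16 runs at the same rate as FP32, so these figures
# double as the FP32 peaks; double-rate packed FP16 is what Vega (MI25) is
# expected to add.
```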
A variety of open source solutions are fueling Radeon Instinct hardware:
  • MIOpen GPU-accelerated library: To help solve high-performance machine intelligence implementations, the free, open-source MIOpen GPU-accelerated library is planned to be available in Q1 2017 to provide GPU-tuned implementations for standard routines such as convolution, pooling, activation functions, normalization and tensor format (a plain CPU reference for a few of these routines is sketched after this list)
  • ROCm deep learning frameworks: The ROCm platform is also now optimized for acceleration of popular deep learning frameworks, including Caffe, Torch 7, and TensorFlow, allowing programmers, through ROCm's rich integrations, to focus on training neural networks rather than low-level performance tuning. ROCm is intended to serve as the foundation of the next evolution of machine intelligence problem sets, with domain-specific compilers for linear algebra and tensors and an open compiler and language runtime
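As a rough illustration of what the "standard routines" in the MIOpen bullet actually compute, below is a plain NumPy reference for a 2D convolution, ReLU activation, and 2×2 max pooling. This is not MIOpen's API, just a CPU sketch of the math a GPU-tuned library would accelerate.

```python
import numpy as np

def conv2d(x: np.ndarray, k: np.ndarray) -> np.ndarray:
    """Naive 'valid' 2D convolution (cross-correlation, as deep learning frameworks use)."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def relu(x: np.ndarray) -> np.ndarray:
    """Rectified linear activation."""
    return np.maximum(x, 0.0)

def max_pool2x2(x: np.ndarray) -> np.ndarray:
    """2x2 max pooling with stride 2 (odd edges are truncated)."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

if __name__ == "__main__":
    image = np.random.rand(8, 8).astype(np.float32)
    kernel = np.random.rand(3, 3).astype(np.float32)
    features = max_pool2x2(relu(conv2d(image, kernel)))
    print(features.shape)  # (3, 3): 8x8 conv 3x3 -> 6x6, pooled -> 3x3
```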
AMD is also investing in developing interconnect technologies that go beyond today's PCIe Gen3 standards to further performance for tomorrow's machine intelligence applications. AMD is collaborating on a number of open, high-performance I/O standards that support broad-ecosystem server CPU architectures including x86, OpenPOWER, and ARM AArch64. AMD is a founding member of CCIX, Gen-Z, and OpenCAPI, working towards future 25 Gbit/s PHY-enabled accelerator and rack-level interconnects for Radeon Instinct.

Radeon Instinct products are expected to ship in 1H 2017.

View at TechPowerUp Main Site
 
The specs in the article do not make sense:

MI6 - 150W so equivalent to RX480 and WX7100. However the Single precision (FP32) is listed as 2.85 TFLOPS (5.7/2 for FP16 Half precision). How can this be possible? That's half of the RX480 and WX7100 for the same power.

'Fiji' Nano GPU is the same thing. Listed at 8.2 TFLOPS of HALF (FP16) precision for 175W. That's 4.1 TFLOPS of single precision which is less than the RX480 and WX7100 at 150 and 130W, respectively.

Please confirm the specs again.
 
The specs in the article do not make sense:

MI6 - 150W so equivalent to RX480 and WX7100. However the Single precision (FP32) is listed as 2.85 TFLOPS (5.7/2 for FP16 Half precision). How can this be possible? That's half of the RX480 and WX7100 for the same power.

'Fiji' Nano GPU is the same thing. Listed at 8.2 TFLOPS of HALF (FP16) precision for 175W. That's 4.1 TFLOPS of single precision which is less than the RX480 and WX7100 at 150 and 130W, respectively.

Please confirm the specs again.

Perhaps each FP32 core is simply running operations at FP16, without a specific design like Nvidia's, where FP32 cores can run two FP16 operations at the same time if they're of the same type.

I imagine that unless they made specific chips for FP16, they'd have to use the existing design, and it consumes just as much power no matter what precision it's running at.
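That reading is consistent with the published numbers. Below is a minimal sanity check, using the commonly cited consumer FP32 peaks for the RX 480 and R9 Nano as assumed reference points.

```python
# Press-release peak FP16 figures for the Instinct parts.
mi6_fp16_tflops = 5.7    # Polaris-based MI6
mi8_fp16_tflops = 8.2    # Fiji-based MI8

# Commonly cited consumer FP32 peaks (assumed reference points, not from the article).
rx480_fp32_tflops = 5.8     # Polaris 10
r9_nano_fp32_tflops = 8.19  # Fiji

# If FP16 ran at twice the FP32 rate (packed math), the implied FP32 figures
# would be half the FP16 numbers. If FP16 simply reuses the FP32 ALUs at the
# same rate, the FP16 and FP32 peaks are equal.
print("MI6 FP16 / RX 480 FP32:", round(mi6_fp16_tflops / rx480_fp32_tflops, 2))     # ~0.98
print("MI8 FP16 / R9 Nano FP32:", round(mi8_fp16_tflops / r9_nano_fp32_tflops, 2))  # ~1.0
# Both ratios are ~1, which supports "FP16 at the same rate as FP32" on these
# chips rather than a halved FP32 throughput.
```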
 