Wednesday, June 11th 2025

AMD Instinct MI355X Draws up to 1,400 Watts in OAM Form Factor

Tomorrow evening, AMD will host its "Advancing AI" livestream to introduce the Instinct MI350 series, a new line of GPU accelerators designed for large-scale AI training and inference. First shown in prototype form at ISC 2025 in Hamburg just a day ago, each MI350 card features 288 GB of HBM3E memory, delivering up to 8 TB/s of sustained bandwidth. Customers can choose between the single-card MI350X and the higher-clocked MI355X, or opt for a full eight-GPU platform that aggregates over 2.3 TB of memory. Both chips are built on the CDNA 4 architecture, which now supports four precision formats: FP16, FP8, FP6, and FP4. The addition of FP6 and FP4 is designed to boost throughput in modern AI workloads, where future models with tens of trillions of parameters are expected to be trained and served in these lower-precision formats.

In half-precision tests, the MI350X achieves 4.6 PetaFLOPS on its own and 36.8 PetaFLOPS in an eight-GPU platform, while the MI355X surpasses those numbers, reaching 5.03 PetaFLOPS and just over 40 PetaFLOPS. AMD is also aiming to improve energy efficiency thirty-fold compared with its previous generation. The MI350X runs within a 1,000 Watt power envelope and relies on air cooling, whereas the MI355X steps up to 1,400 Watts and is intended for direct-liquid cooling setups. That 400 Watt increase puts it on par with NVIDIA's upcoming GB300 "Grace Blackwell Ultra" superchip, which is also a 1,400 W design. With memory capacity, raw compute, and power efficiency all pushed to new heights, the question remains whether real-world benchmarks will match these ambitious specifications. All AMD now lacks is platform scaling beyond eight GPUs, which the Instinct MI400 series will address.
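As a quick sanity check, the per-GPU and platform-level figures quoted above are consistent with simple multiplication (a sketch using only the numbers in this article):

```python
# Aggregate the per-GPU figures from the article to the eight-GPU platform level.
def platform_totals(per_gpu_pflops, per_gpu_watts, gpus=8):
    """Return (total PFLOPS, total watts) for an n-GPU OAM platform."""
    return per_gpu_pflops * gpus, per_gpu_watts * gpus

mi350x = platform_totals(4.6, 1000)   # ~36.8 PFLOPS FP16, 8,000 W
mi355x = platform_totals(5.03, 1400)  # ~40.2 PFLOPS FP16, 11,200 W
```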
Sources: ComputerBase, via VideoCardz

10 Comments on AMD Instinct MI355X Draws up to 1,400 Watts in OAM Form Factor

#1
Daven
I believe this is AMD's second product line to use 3 nm; Turin Dense EPYC chips were the first.
#2
Pizderko
Now imagine this same card made on a GAAFET process.

GAAFETs promise to bring power consumption down by at least 32%, so 1,400 W - 32% = 952 W, a 448 W difference.

Yeah... I'm so mad that GAAFET isn't here yet.
#3
tsunami2311
Pizderko: Now imagine this same card made on a GAAFET process.

GAAFETs promise to bring power consumption down by at least 32%, so 1,400 W - 32% = 952 W, a 448 W difference.

Yeah... I'm so mad that GAAFET isn't here yet.
It won't matter in the end; they'll just use the savings as an excuse to push power back up and squeeze more performance out of it.
#4
Pizderko
I want to see if Intel will beat Apple silicon like the M5 or M6 with Panther Lake on GAAFET.
If not, then I'm going to buy myself a MacBook Pro, and hasta la vista, PC and everything related to x86-64.
#5
zo0lykas
Pizderko: Now imagine this same card made on a GAAFET process.

GAAFETs promise to bring power consumption down by at least 32%, so 1,400 W - 32% = 952 W, a 448 W difference.

Yeah... I'm so mad that GAAFET isn't here yet.
Not really, cuz most of the power goes to the RAM, not the GPU die.
#6
Dr. Dro
zo0lykas: Not really, cuz most of the power goes to the RAM, not the GPU die.
This isn't really the case, though. HBM really does not consume that much power, even at large-scale deployment, and HBM3E scales to 48 GB per stack. These MI355Xs have 288 GB of capacity, which means six stacks attached per GPU. If I had to guess, about 1.2 to 1.3 kW goes to the GPU core itself, though with an insanely high FLOPS/W rating.
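To put rough numbers on that guess (the ~30 W-per-stack HBM3E figure below is an assumption for illustration, not an AMD spec):

```python
# Rough split of the 1,400 W module budget; per-stack HBM power is an assumption.
module_w = 1400
stacks = 288 // 48              # 288 GB at 48 GB per HBM3E stack -> 6 stacks
hbm_w = stacks * 30             # assume ~30 W per stack -> ~180 W for memory
core_w = module_w - hbm_w       # ~1,220 W left for the GPU dies themselves
```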

What pretty much everyone misses is that these are exascale computing parts, and their high wattage ratings won't carry over to desktop PCIe GPUs.
#7
igormp
Those 1,400 W are a bit above OAM's official power spec (1,000 W). I wonder if they are still using 44~60 V lines or if they have ramped those up as well.
I assume AMD will help the consortium to come up with a new revision in order to standardize those higher TDPs.
Also:
The AMD GPU Operator simplifies deployment and management of AMD Instinct™ GPUs in Kubernetes clusters, helping enable effortless configuration of GPU-accelerated workloads like machine learning and Generative AI. This streamlines operations and accelerates time to market.
Nice joke lol
#8
Daven
Dr. Dro: This isn't really the case, though. HBM really does not consume that much power, even at large-scale deployment, and HBM3E scales to 48 GB per stack. These MI355Xs have 288 GB of capacity, which means six stacks attached per GPU. If I had to guess, about 1.2 to 1.3 kW goes to the GPU core itself, though with an insanely high FLOPS/W rating.

What pretty much everyone misses is that these are exascale computing parts, and their high wattage ratings won't carry over to desktop PCIe GPUs.
I think the high wattage ratings are due to multiple GPU chiplets/tiles being placed on the package. The MI300X basically has four complete 80 CU GPUs on one package. If we multiply the 260 W of the 80 CU 7900 GRE by four, we get well over the 750 W of the MI300X. I'm guessing that the MI355X will have four 300 W GPUs on one package, with HBM making up the remaining 200 W for 1,400 W total.
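The back-of-envelope math in that guess checks out (all figures taken from the post above):

```python
# Naive chiplet scaling: four 7900 GRE-class GPUs vs. the MI300X's 750 W rating.
naive_mi300x = 4 * 260          # 1,040 W, well over the actual 750 W
# Guessed MI355X budget: four ~300 W GPU tiles plus ~200 W of HBM.
mi355x_guess = 4 * 300 + 200    # 1,400 W, matching the rated TDP
```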
#9
lambda
Daven: I think the high wattage ratings are due to multiple GPU chiplets/tiles being placed on the package. The MI300X basically has four complete 80 CU GPUs on one package. If we multiply the 260 W of the 80 CU 7900 GRE by four, we get well over the 750 W of the MI300X. I'm guessing that the MI355X will have four 300 W GPUs on one package, with HBM making up the remaining 200 W for 1,400 W total.
The MI stuff is CDNA, unlike the 7900, which is RDNA, but I guess some characteristics carry over.
#10
Dr. Dro
Daven: I think the high wattage ratings are due to multiple GPU chiplets/tiles being placed on the package. The MI300X basically has four complete 80 CU GPUs on one package. If we multiply the 260 W of the 80 CU 7900 GRE by four, we get well over the 750 W of the MI300X. I'm guessing that the MI355X will have four 300 W GPUs on one package, with HBM making up the remaining 200 W for 1,400 W total.
It's not that. These 1,400 W are per OAM module, not per board. Each platform has eight of these, so a group chugs ~11,200 W. Remember the 800 W+ NVIDIA accelerators everyone was losing their minds over a couple of years ago? This is the exact same thing: TDPs are being significantly increased, but overall processing efficiency, from a performance-per-watt perspective, is also going up significantly alongside them. The Tom's Hardware article explains it well, but tl;dr: these are basically the baby steps toward zettascale computing.

www.tomshardware.com/pc-components/gpus/amds-instinct-mi355x-accelerator-will-consume-1-400-watts

Each MI355X is rated for ~5 PFLOPS FP16. I know the FLOPS metric is a tired old trope, but it's a decent enough account of the hardware's raw theoretical performance, and this is a mind-blowing number, several times faster than something like an RTX 5090. Speaking of which, a 600 W 5090 may sound crazy to most people, but if you look at the raw numbers, you will find the exact same thing going on: the math won't lie ;)
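For what it's worth, the module-level efficiency implied by the figures above (a sketch from this thread's numbers, not a measured result):

```python
# FP16 efficiency implied by the article's figures for one MI355X OAM module.
pflops, module_w = 5.03, 1400
tflops_per_watt = pflops * 1000 / module_w   # ~3.6 TFLOPS/W FP16
platform_w = 8 * module_w                    # 11,200 W per eight-module platform
```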

All AMD is missing is a solid, performant runtime to take this market by storm; they already have the hardware.