
AMD Confirms that Instinct MI300X GPU Can Consume 750 W


T0@st

News Editor
AMD revealed its Instinct MI300X GPU at its Data Center and AI Technology Premiere event on Tuesday (June 13). The keynote presentation did not provide any details about the new accelerator's power consumption, but that did not stop tipster Hoang Anh Phu from digging the figure out of Team Red's post-event footnotes. His comparative observation: "MI300X (192 GB HBM3, OAM Module) TBP is 750 W, compared to last gen, MI250X TBP is only 500-560 W." A leaked Giga Computing roadmap from last month had already anticipated server-grade GPUs hitting the 700 W mark.

NVIDIA's Hopper H100, with its maximum demand of 700 W, held the crown as the most power-hungry enterprise data center GPU until now. The MI300X's OCP Accelerator Module-based design surpasses Team Green's flagship with a slightly higher rating. AMD's new "leadership generative AI accelerator" sports 304 CDNA 3 compute units, a clear upgrade over the MI250X's 220 CDNA 2 CUs. Engineers have also introduced new 24 GB HBM3 stacks, so the MI300X can be specced with up to 192 GB of memory, while the MI250X is limited to 128 GB with its slower HBM2E stacks. We hope to see sample units producing benchmark results very soon, with the MI300X pitted against the H100.
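For readers who want the on-paper comparison in one place, here is a back-of-the-envelope sketch in Python built only from the figures quoted above. The MI300X stack count is simply inferred from 192 GB ÷ 24 GB per stack, and the MI250X per-stack size comes from its public HBM2E spec, so treat these as working assumptions rather than confirmed details:

```python
# Back-of-the-envelope comparison using the figures quoted in the article.
# The MI300X HBM stack count is inferred (192 GB / 24 GB per stack), and the
# MI250X TBP uses the 560 W upper bound of its 500-560 W range.
specs = {
    "MI300X": {"tbp_w": 750, "memory_gb": 192, "gb_per_stack": 24, "cus": 304},
    "MI250X": {"tbp_w": 560, "memory_gb": 128, "gb_per_stack": 16, "cus": 220},
    "H100":   {"tbp_w": 700},
}

mi300x, mi250x = specs["MI300X"], specs["MI250X"]

print("Inferred MI300X HBM3 stacks:", mi300x["memory_gb"] // mi300x["gb_per_stack"])
print(f"TBP vs MI250X:    {mi300x['tbp_w'] / mi250x['tbp_w']:.2f}x")
print(f"Memory vs MI250X: {mi300x['memory_gb'] / mi250x['memory_gb']:.2f}x")
print(f"CUs vs MI250X:    {mi300x['cus'] / mi250x['cus']:.2f}x")
print(f"TBP vs H100:      {mi300x['tbp_w'] / specs['H100']['tbp_w']:.2f}x")
```

Nothing here is new information; it just restates the reported numbers as ratios: roughly 1.34x the TBP, 1.5x the memory, and 1.38x the CU count of the MI250X, at 1.07x the H100's power rating.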



View at TechPowerUp Main Site | Source
 
192GB HBM3, future is here :)
 
On paper this trumps the H100; how will the battle play out in real workloads?
 
What CPUs save in power consumption, GPUs will make up for...
 
Is Microsoft still putting servers in the sea/lakes to keep them cold without exorbitant power costs? I sure hope so.

Pretty amazing. They remove the air and basically create a vacuum. The cold ocean water is like the ambient temp for the container.
 
This is an immense feat of engineering and hopefully something that helps AMD garner well-deserved market share but issues abound:

1 - HBM3 is costly and may negate the cost advantage AMD might have with the MI300; Nvidia is likely to ship HBM3 products in the same timeframe, or earlier.
2 - There is no apparent equivalent to the Transformer Engine, which can triple performance in LLM scenarios on Nvidia's counterparts (see the FP8 sketch after this post).
3 - Nvidia's H100 is shipping in full volume today, with more researcher and technical support in its superior ecosystem.
4 - AMD has yet to disclose benchmarks.

I hope AMD can resolve or alleviate some of these issues because it seems like an excellent product overall.
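For context on point 2: NVIDIA's Transformer Engine is a PyTorch add-on that runs transformer layer math in FP8 on Hopper. Below is a minimal sketch of what using it looks like, based on the library's public API; module names and recipe options can shift between releases, so treat it as illustrative rather than authoritative.

```python
# Minimal, illustrative use of NVIDIA's Transformer Engine FP8 path.
# Requires the transformer-engine package and a Hopper-class GPU; exact API
# details may differ between releases.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Delayed-scaling FP8 recipe with mostly default settings.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

# Drop-in replacement for torch.nn.Linear that can execute its GEMM in FP8.
layer = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(64, 4096, device="cuda")

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)

print(y.shape)  # torch.Size([64, 4096])
```

The point being made in the list above is that this kind of speedup comes from the software stack, not just the silicon, and AMD would need something comparably easy to drop into existing training code for the MI300X's raw specs to translate into LLM throughput.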
 
Uf, that's a lot of power going through a PCB and GPU.

And I think the 570 W peak I have seen on my RTX 4090 with an OC is bad enough. Yes, I am aware this is meant for other things than gaming, but still, 700 W or more is a lot of power for one GPU.
 
Uf, that's a lot of power going through a PCB and GPU.

And I think the 570 W peak I have seen on my RTX 4090 with an OC is bad enough. Yes, I am aware this is meant for other things than gaming, but still, 700 W or more is a lot of power for one GPU.
This monster has about 5 to 10 times the transistors of a 4090, and 192 GB of HBM is no joke, so it is actually not bad, considering.
 
Depends on the software stack; this, alongside the MI300A, will beat nearly any combination of AI/CPU/GPU power Nvidia can muster, but then again, there's CUDA, so!
The entire GPU ecosystem is built around software; the hardware is just a checkbox.

NVIDIA is a software company.
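To make that concrete: on a ROCm build of PyTorch, code written against the CUDA API runs unchanged on AMD hardware, because the HIP backend is surfaced through the torch.cuda namespace (torch.version.hip is set on ROCm builds). A small device-agnostic sketch, as an illustration of why the software stack rather than the silicon is the battleground:

```python
# Device-agnostic PyTorch snippet: the same code path runs on CUDA (NVIDIA)
# and on ROCm/HIP (AMD), because ROCm builds of PyTorch expose HIP devices
# through the torch.cuda API.
import torch

if torch.cuda.is_available():
    backend = "ROCm/HIP" if getattr(torch.version, "hip", None) else "CUDA"
    device = torch.device("cuda")
else:
    backend = "CPU fallback"
    device = torch.device("cpu")

print(f"Running on: {backend}")

a = torch.randn(2048, 2048, device=device)
b = torch.randn(2048, 2048, device=device)
c = a @ b  # identical call on either vendor's stack
print(c.shape)
```

Whether the kernels underneath are as fast and as broadly supported is exactly the ecosystem question being argued in this thread.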
 
Uf, that's a lot of power going through a PCB and GPU.

And I think the 570 W peak I have seen on my RTX 4090 with an OC is bad enough. Yes, I am aware this is meant for other things than gaming, but still, 700 W or more is a lot of power for one GPU.
If the trend continues, those accelerators are going to consume one half-Xeon in a couple years.
 
The entire GPU ecosystem is built around software; the hardware is just a checkbox.

NVIDIA is a software company.
Well, comparing Adrenalin against Nvidia's software offerings would belie what you are saying. And ray tracing and DLSS do not apply in this space anyway.
 
Well, comparing Adrenalin against Nvidia's software offerings would belie what you are saying. And ray tracing and DLSS do not apply in this space anyway.
Maybe by the time AMD has figured out drivers that can be installed on any of their consumer cards (see: the several-month gap between the RDNA 3 launch and an RDNA 2 driver update, or, more recently, an exclusive driver for the RX 7600 only), they can start to figure out the software and partner support they need to succeed in datacentre GPUs.

By then NVIDIA will have built its 1:1 virtual/real-world model in the Omniverse, which every major manufacturer has already signed onto, just as with CUDA's dominance over the past decade.

This image is from 15 months ago (3/22):

[attached image: NVIDIA Omniverse ecosystem as of March 2022]


The current ecosystem: https://www.nvidia.com/en-us/omniverse/ecosystem/

Where is AMD?

AMD is getting there; if they can string a few consistent wins in a row, they should be able to challenge Nvidia, kinda like they did with Intel.
I hope so; a monopoly isn't a great situation, but the product needs to deliver. Zen did, but that was a hardware success, not a software success. The software (AGESA) for Zen has been buggy in each iteration for years now, which they hope to fix eventually with the open-source OpenSIL.
 
Maybe by the time AMD has figured out drivers that can be installed on any of their consumer cards (see: the several-month gap between the RDNA 3 launch and an RDNA 2 driver update, or, more recently, an exclusive driver for the RX 7600 only), they can start to figure out the software and partner support they need to succeed in datacentre GPUs.

By then NVIDIA will have built its 1:1 virtual/real-world model in the Omniverse, which every major manufacturer has already signed onto, just as with CUDA's dominance over the past decade.

This image is from 15 months ago (3/22):

View attachment 301036

The current ecosystem: https://www.nvidia.com/en-us/omniverse/ecosystem/

Where is AMD?
Let me see: I have had AMD since the original 6800, and yes, that was a Gigabyte board and it was a gremlin. Then I went with Sapphire and nothing since. Did you know that Sapphire had upscaling in its software package before FSR or DLSS were even conversation pieces? I could paste the AMD stack as well, and the complaining about the 3 months in which AMD did not give driver updates for cards that were working fine is also nothing but narrative. And while you and I can debate it, AMD will be selling plenty of these, as the companies on that placard are in some ways optimizing for AMD or depending on AWS and Microsoft for their networks, but we will go on. This card is no joke at all, so as little import as you give to the hardware, 192 GB of HBM3 (a spec we don't know the details of) is nothing to sneeze at, and the transistor count is also crazy.
 
Let me see: I have had AMD since the original 6800, and yes, that was a Gigabyte board and it was a gremlin. Then I went with Sapphire and nothing since. Did you know that Sapphire had upscaling in its software package before FSR or DLSS were even conversation pieces? I could paste the AMD stack as well, and the complaining about the 3 months in which AMD did not give driver updates for cards that were working fine is also nothing but narrative. And while you and I can debate it, AMD will be selling plenty of these, as the companies on that placard are in some ways optimizing for AMD or depending on AWS and Microsoft for their networks, but we will go on. This card is no joke at all, so as little import as you give to the hardware, 192 GB of HBM3 (a spec we don't know the details of) is nothing to sneeze at, and the transistor count is also crazy.
Hardware is nice, sure.

"Narrative" aside, we'll see how they do in 2023 in enterprise GPUs; in 2021 they couldn't breach 9%, and the trend isn't changing: that percentage went down in 2022.

 
Hardware is nice, sure.

"Narrative" aside, we'll see how they do in 2023 in enterprise GPUs; in 2021 they couldn't breach 9%, and the trend isn't changing: that percentage went down in 2022.

View attachment 301037
Why do you think they released this card? I am aware of AMD's position in the data centre when it comes to GPUs. My argument was strictly about software, and again, this is a monster of a GPU that could disrupt the stack.
 
Maybe by the time AMD has figured out drivers that can be installed on any of their consumer cards (see: the several-month gap between the RDNA 3 launch and an RDNA 2 driver update, or, more recently, an exclusive driver for the RX 7600 only), they can start to figure out the software and partner support they need to succeed in datacentre GPUs.

By then NVIDIA will have built its 1:1 virtual/real-world model in the Omniverse, which every major manufacturer has already signed onto, just as with CUDA's dominance over the past decade.

This image is from 15 months ago (3/22):

View attachment 301036

The current ecosystem: https://www.nvidia.com/en-us/omniverse/ecosystem/

Where is AMD?


I hope so; a monopoly isn't a great situation, but the product needs to deliver. Zen did, but that was a hardware success, not a software success. The software (AGESA) for Zen has been buggy in each iteration for years now, which they hope to fix eventually with the open-source OpenSIL.
My word.

AMD drivers again? I have no issues, and I have not seen the masses abnormally arrayed against AMD drivers, just the odd hyperbolic statement.
 
Pretty amazing. They remove the air and basically create a vacuum. The cold ocean water is like the ambient temp for the container.
Yes, but you can't cheat physics. Imagine hundreds of those warming up the local sea and disrupting flora and fauna.
 
Yes & it's not like whatever they're putting them in wouldn't cause at least some sort of (chemical) reaction at the depths used for them. We're just dumping more of our problems into the oceans; if it's not plastic, it's excess heat :shadedshu:

If you really want to make them almost completely eco-friendly, just launch them into space!
 
Yes & it's not like whatever they're putting them in wouldn't cause at least some sort of (chemical) reaction at the depths used for them. We're just dumping more of our problems into the oceans; if it's not plastic, it's excess heat :shadedshu:

If you really want to make them almost completely eco-friendly, just launch them into space!
Cooler chips consume less energy, due to lower leakage current.

If they're going to be run, they may as well be run more efficiently.
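As a rough illustration of that point, a commonly cited rule of thumb is that static leakage power roughly doubles for every ~10 °C rise in junction temperature (the exact slope depends heavily on the process node). The model below is a toy with assumed parameters, not a measurement of any real GPU:

```python
# Toy model of relative static leakage vs. junction temperature, using the
# rough rule of thumb that leakage doubles for every ~10 degC rise.
# The reference temperature and doubling interval are illustrative assumptions.
def leakage_multiplier(temp_c: float, t_ref_c: float = 40.0,
                       doubling_deg_c: float = 10.0) -> float:
    """Leakage relative to the reference temperature (rule-of-thumb model)."""
    return 2 ** ((temp_c - t_ref_c) / doubling_deg_c)

for temp in (40, 55, 70, 85):
    print(f"{temp:>3} degC -> ~{leakage_multiplier(temp):4.1f}x the leakage at 40 degC")
```

Under this crude model, a chip held at 40-55 °C (say, by cold ocean water) wastes noticeably less static power than the same chip running at 85 °C, which is the efficiency argument being made here.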
 
But they are disrupting the ecology of that place, even if only a little bit. I'd rather they use power from solar panels on a satellite and radiate whatever heat they're generating mostly outside our atmosphere. While this sounded like a fun experiment, I'm not sure how practical it is longer term, especially with the oceans getting warmer every day; we're arguably just accelerating that process with something like this. And the data they're crunching/processing would need to travel longer distances, so is it really more (energy) efficient overall?
 
But they are disrupting the ecology of that place, even if only a little bit. I'd rather they use power from solar panels on a satellite and radiate whatever heat they're generating mostly outside our atmosphere. While this sounded like a fun experiment, I'm not sure how practical it is longer term, especially with the oceans getting warmer every day; we're arguably just accelerating that process with something like this. And the data they're crunching/processing would need to travel longer distances, so is it really more (energy) efficient overall?
Heat from server farms won't even register compared to the effects of atmospheric change and pollution from rivers, fuel-oil container ships, manufacturing waste, farming run-off, reduced reflectivity from shrinking ice surface area, etc.; the list goes on. I doubt you could even measure the difference submerged computers would make.

Average ocean temperature is 0-20 °C; even if that doubled (projections are a couple of °C of increase over several centuries), it would still be an effective cooling medium.

I wouldn't be surprised to see the local ecology find some way to benefit from the heat source, as with the coral/bacteria ecosystems near underwater volcanoes and vents.
 