Tuesday, August 13th 2019

110°C Hotspot Temps "Expected and Within Spec", AMD on RX 5700-Series Thermals

AMD this Monday in a blog post demystified the boosting algorithm and thermal management of its new Radeon RX 5700 series "Navi" graphics cards. These cards are beginning to be available in custom designs by AMD's board partners, but were only available as reference-design cards for over a month after their 7th July launch. The thermal management of these cards spooked many early adopters accustomed to seeing temperatures below 85 °C on competing NVIDIA graphics cards, with the Radeon RX 5700 XT posting GPU "hotspot" temperatures well above 100 °C, regularly hitting 110 °C, and sometimes even touching 113 °C with stress-testing applications such as FurMark. In its blog post, AMD stated that 110 °C hotspot temperatures under "typical gaming usage" are "expected and within spec."

AMD also elaborated on what constitutes "GPU Hotspot" aka "junction temperature." Apparently, the "Navi 10" GPU is peppered with an array of temperature sensors spread across the die at different physical locations. The maximum temperature reported by any of those sensors becomes the Hotspot. In that sense, Hotspot isn't a fixed location in the GPU. Legacy "GPU temperature" measurements on past generations of AMD GPUs relied on a thermal diode at a fixed location on the GPU die which AMD predicted would become the hottest under load. Over the generations, and starting with "Polaris" and "Vega," AMD leaned toward an approach of picking the hottest temperature value from a network of diodes spread across the GPU, and reporting it as the Hotspot.
On Hotspot, AMD writes: "Paired with this array of sensors is the ability to identify the 'hotspot' across the GPU die. Instead of setting a conservative, 'worst case' throttling temperature for the entire die, the Radeon RX 5700 series GPUs will continue to opportunistically and aggressively ramp clocks until any one of the many available sensors hits the 'hotspot' or 'Junction' temperature of 110 degrees Celsius. Operating at up to 110C Junction Temperature during typical gaming usage is expected and within spec. This enables the Radeon RX 5700 series GPUs to offer much higher performance and clocks out of the box, while maintaining acoustic and reliability targets."
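
To make the hotspot idea concrete, here is a minimal sketch of the behavior described above: the hotspot is simply the highest reading across the sensor array, and clocks may keep ramping while that value stays below the 110 °C junction limit. The function names and the sample readings are made up for illustration; this is not AMD's firmware logic, only an approximation of what the blog post describes.

```python
JUNCTION_LIMIT_C = 110.0  # junction limit quoted in AMD's blog post


def hotspot(sensor_temps_c):
    """Return the hottest reading from the on-die sensor array."""
    return max(sensor_temps_c)


def may_raise_clocks(sensor_temps_c, limit_c=JUNCTION_LIMIT_C):
    """True while every sensor is still below the junction limit."""
    return hotspot(sensor_temps_c) < limit_c


# Hypothetical readings from four sensors at different die locations.
readings = [78.5, 91.0, 104.2, 86.3]
print("hotspot:", hotspot(readings))
print("may raise clocks:", may_raise_clocks(readings))
```

Because the hotspot is a maximum over all sensors, its physical location can move from one moment to the next, which is why it is not tied to a fixed spot on the die.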

AMD also commented on the significantly increased granularity of clock speeds, which improves the GPU's power management. The company transitioned from fixed DPM states to a highly fine-grained clock-speed management system that takes into account load, temperatures, and power to push out the highest possible clock speeds for each component. "Starting with the AMD Radeon VII, and further optimized and refined with the Radeon RX 5700 series GPUs, AMD has implemented a much more granular 'fine grain DPM' mechanism vs. the fixed, discrete DPM states on previous Radeon RX GPUs. Instead of the small number of fixed DPM states, the Radeon RX 5700 series GPU have hundreds of Vf 'states' between the bookends of the idle clock and the theoretical 'Fmax' frequency defined for each GPU SKU. This more granular and responsive approach to managing GPU Vf states is further paired with a more sophisticated Adaptive Voltage Frequency Scaling (AVFS) architecture on the Radeon RX 5700 series GPUs," the blog post reads.

Source: AMD
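
The practical effect of "fine grain DPM" can be illustrated with a small sketch. It assumes evenly spaced frequency states between an idle clock and Fmax, and picks the highest state that does not exceed whatever frequency the power and thermal budget currently allows. The clock values, the state counts, and the even spacing are all hypothetical; AMD's post does not publish the real Vf tables.

```python
def make_states(idle_mhz, fmax_mhz, count):
    """Build `count` evenly spaced frequency states from idle to Fmax."""
    step = (fmax_mhz - idle_mhz) / (count - 1)
    return [idle_mhz + i * step for i in range(count)]


def highest_state_at_or_below(states, allowed_mhz):
    """Pick the fastest state that does not exceed the allowed frequency."""
    candidates = [f for f in states if f <= allowed_mhz]
    return max(candidates) if candidates else states[0]


# Hypothetical numbers: 300 MHz idle, 2100 MHz Fmax.
coarse = make_states(300, 2100, 7)    # a handful of fixed DPM states
fine = make_states(300, 2100, 361)    # hundreds of Vf states

allowed = 1870  # frequency the current power/thermal budget permits
print("coarse pick:", highest_state_at_or_below(coarse, allowed))
print("fine pick:  ", highest_state_at_or_below(fine, allowed))
```

With only a few states the GPU has to settle for a clock well below what the budget permits, while a dense set of states lets it sit much closer to the limit.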

141 Comments on 110°C Hotspot Temps "Expected and Within Spec", AMD on RX 5700-Series Thermals

#51
notb
R0H1T wrote (post 4097435):
Well then as consumers, not just you per se, how about supporting AMD with more $ especially when they release a competitive (perf/$) GPU?
LOL. And what next? AMD GPUs on Kickstarter?

How about AMD makes attractive, complete products - not just in benchmarks, but also in real life (quiet, cool, easy to setup, tinker-free and well supported by OEMs)?
Maybe then they'll be able to sell more?

They're making products aimed at enthusiasts - willingly focusing on a group that is more enticed to pay "200~400$ more for 5~25% performance". I mean: how much do people on this forum spend on OC? :-)

If AMD lacks money on polishing their GPUs, they can do an FPO like every normal listed company would do. :-)
#52
lynx29
Axaion wrote (post 4097508):
Yeah no thanks AMD, i dont wish to have hearing damage because of your poor cooler design
or you could buy a two-three fan design one, they just released this week and only cost $20 more... but mmk
#53
R0H1T
notb wrote (post 4097512):
LOL. And what next? AMD GPUs on Kickstarter?

How about AMD makes attractive, complete products - not just in benchmarks, but also in real life (quiet, cool, easy to setup, tinker-free and well supported by OEMs)?
Maybe then they'll be able to sell more?

They're making products aimed at enthusiasts - willingly focusing on a group that is more enticed to pay "200~400$ more for 5~25% performance". I mean: how much do people on this forum spend on OC? :)

If AMD lacks money on polishing their GPUs, they can do an FPO like every normal listed company would do. :)
What BS, you're making it sound like AMD GPUs are unusable garbage & Nvidia not only outstrips it across the board but also in every price bracket, every game you can think of! Which is of course BS as well :rolleyes:
#54
IceShroom
er557 wrote (post 4097388):
Radeons have always ran hot, but this is ludicrous.
Nvidia cards run so cool that those needs 3 slots and 3 fan cooler to cool.
#55
laszlo
at least amd admit the hotspot ; for sure nvidia has also but keep waiting for telling you...
#56
TheinsanegamerN
Microsoft also claimed that the 95C temps the xbox 360 reached were perfectly normal and nothing to worry about.....right up until the hardware started dropping like flies.

Sorry, but just because the max temp of the silicon may be 110C does NOT mean it should reach that normally. This would be like if I drove my car at 155 MPH every single day with the heat pegged out. AMD is just making excuses for their ludicrously junk cooler design. Reaching such high tempts then cooling off when not gaming is going to prematurely wear out these chips, especially their solder connections.
#57
er557
IceShroom wrote (post 4097521):
Nvidia cards run so cool that those needs 3 slots and 3 fan cooler to cool.
no they dont, they run fine with blower, it's aib design for three fans for overclockability and higher-end cooling
#58
randomUser
My HD4850 reference design GPU ran 90C at idle and 110C when gaming. I don't think that was a silicon temp tho, might be tCase.
So silicone 120-130C?
#59
lynx29
er557 wrote (post 4097526):
no they dont, they run fine with blower, it's aib design for three fans for overclockability and higher-end cooling
Nvidia's blower was much better designed, vapor chamber, etc. AMD really should have not done a blower launch, I think internally they know this and probably won't make same mistake with 5800 XT, but eh, who knows.
#60
londiste
On one hand, there are hotspots on GPUs and exposing that reading for monitoring externally is definitely a good thing. I do not doubt for a second that Nvidia has similar sensor readings internally available, just not exposed.

On the other hand, 110°C being expected and in spec is a suspicious statement because we know these GPUs throttle at that exact 110°C point.
It is like saying Ryzen 3000 running at 95°C is expected and in spec. It is technically correct...
#61
jmcosta
This reminds me of the GTX480 but at least Nvidia put some effort making a decent cooling solution

"This temperature is fine" but fan noise and throttling isn't...

and Im aware that AMD partners have fixed this issue, unfortunally they come a little late, a month late.
#62
lynx29
jmcosta wrote (post 4097541):
This reminds me of the GTX480 but at least Nvidia put some effort making a decent cooling solution

"This temperature is fine" but fan noise and throttling isn't...

and Im aware that AMD partners have fixed this issue, unfortunally they come a little late, a month late.
with a slightly higher fan curve above stock fan curve, blower fans do just fine on temps. this is just stock blower fan. which yeah most users won't run a custom fan, but I always have, even with nvidia. /shrug

let's just hope AMD learned their lesson finally and do better coolers for 5800 xt
#63
Zubasa
jmcosta wrote (post 4097541):
"This temperature is fine" but fan noise and throttling isn't...
I can understand the fan noise argument.
But where is your evidence of the card actually throttling?
Because if the reference design is really throttling and not boosting to full potential, the Sapphire Pulse wouldn't perform only marginally better even with a factory overclock.
#64
las
Dave65 wrote (post 4097490):
It's always, a friend:roll:
399 is MSRP... :kookoo:
#65
Vya Domus
TheinsanegamerN wrote (post 4097524):
Sorry, but just because the max temp of the silicon may be 110C does NOT mean it should reach that normally.
I am amazed by these claims, how the hell do you know that ? What is normal and how do you know that's supposed to be normal ? Are you by any chance working on chip design and know this stuff better than we or AMD ?
#66
lynx29
Vya Domus wrote (post 4097575):
I am amazed by these claims, how the hell do you know that ? What is normal and how do you know that's supposed to be normal ? Are you by any chances working on chip design and know this stuff better than we or AMD ?
I think a lot of people are taking this out of context possibly, this isn't the GPU temp folks, similar to how a lot of cheap motherboards CPU's you can get the CPU good temps, but there will be a hotspot on VRM somewhere at 92 celsius and not a huge deal if not overclocking. I think this is similar, gpu will never itself get that hot, it's just a hotspot of a spefici part that is normally always a bit hotter than the gpu core.

At least that is my line of thought anyway. Still glad I got the 3 fan gigabyte version for only $20 more though :D
#67
IceShroom
er557 wrote (post 4097526):
no they dont, they run fine with blower, it's aib design for three fans for overclockability and higher-end cooling
Nvidia blower card like GTX 1080 ran 84°. RX 5700 XT blower ran from 76°- 82° depends on website, expect one outlier TPU-92°(dont know Junction or not).
So RX 5700 XT is chilling compared to Nvidia blower. And don't forget thermi, unless you born yesterday.
Here is Nvidia blower temperature : https://www.guru3d.com/articles_pages/amd_radeon_rx_5700_and_5700_xt_review,8.html
#68
jmcosta
Zubasa wrote (post 4097554):
I can understand the fan noise argument.
But where is your evidence of the card actually throttling?
Because if the reference design is really throttling and not boosting to full potential, the Sapphire Pulse wouldn't perform only marginally better even with a factory overclock.
The reference starts to thermal throttling at 90-91C (from 1900mhz to very unstable clocks below 1800mhz) and even shuts down while gaming after a while if its fully utilized (linus and other reviewers have mentioned this)
The reason you don't see a significant boost is because the gain from pushing the frequency is poor in Navi(maybe driver issue?). This chip having an overclock of 15% results in a <4% performance gain
#69
Vayra86
Vya Domus wrote (post 4097464):
Please refrains yourselves from talking about things that you just simply do not know anything about. I don't expect TPU to be brewing with academicians but not complete ignorance either.

Dies do not have uniform thermals across their surface and on certain spots such as where the FPUs sit indeed can reach well over 100C. This has gotten worse over the years as the thermal density of chips keeps rising, you can have TDPs ranging from 10W to 1000W these hotspots will not go away. AMD being on 7nm, again, makes this worse.

Let's spell it out in the simplest of terms so that everyone gets it :

You have two dies, each use 100W and each benefit from the same amount of cooling but one of them is half the size. This inevitably means it will have higher thermal density and will run at higher temperatures, there is no going around it. This of course is taken into account when this things are designed but you can only minimize this effect so much.

Again, this is about the thermal density not TDP, not cooling, nor architecture and you can't really do anything about it. Do not believe for a second Nvidia, Intel or anyone isn't dealing with this. This hot and power hungry meme should die, it has run it's course, now you just look like you don't have a damn clue what you're talking about.
Cool story but the GN Sapphire review proves you wrong. AMD just designed a shit cooler for the heat Navi produces at stock, end of story. Take note of the memory IC temp as well. Red line.



Nuff said, I would say.

You're not wrong about thermal density, its been a problem starting with Ivy Bridge's 22nm, I vividly remember Tomshardware making remarks on it as an explanation for the crappy heat transfer off die. Yet everyone insisted in complaining about shitty TIM instead. We know better now that Intel solders its high end range and still reaches boiling point.

theoneandonlymrk wrote (post 4097493):
No surely not, no yeah , your spot on , temp offset has been a thing for ages(10 years), now just imagine the real temp of that 9900K die T junction running at 5Ghz eh ,85-100 yeah right.
The real temp of Tjunctiion on Intel has been known for years. I'm not sure what you're trying to say here, other than those K models get really hot, which is absolutely true. But not 110C.

You're also not convincing me that as nodes (and thus transistor size/thickness of materials) get smaller, they can readily handle more heat. I'd say it is quite the opposite.

https://www.intel.com/content/www/us/en/products/processors/core/i9-processors/i9-9900k.html



https://forums.intel.com/s/question/0D50P0000490XQPSA2/thermal-management-for-intel-6th7th-generation-tcase-vs-tjunction?language=en_US

R0H1T wrote (post 4097495):
That's exactly how things work & there's nothing about charity in my post. You have a choice between 3700x, 3800x & 9900k for let's say gaming. You chose 5~15 fps for ~150$ so the next time you don't get to say why AMD still can't match Intel's clocks or gaming performance! Likewise when you want things for cheap, you don't get to complain that your jobs are shipped overseas. This is how things work & will always do in a profit "driven" world :toast:
They call that a loser's strategy, begging for people to keep coming to the rescue. A winner's strategy is what AMD does for CPU right now. They know how it works. Only the hardened AMD fanbase seems to have trouble grasping that.
#70
FordGT90Concept
"I go fast!1!11!1!"
las wrote (post 4097398):
The GPU is not the only thing using power...

https://www.techpowerup.com/review/galax-geforce-rtx-2060-super-ex/30.html

5700 XT uses more power than 2070 Super in gaming on average, while performing worse. 5700 XT is slower, hotter and louder.
2070 Super also has 3.3 billion more transistors. 5700 XT is being pushed to the limit while 2070 Super isn't, so 5700 XT ends up using more power. 5700 is closer to where nominal Navi 10 perf/watt. 5700 XT is a direct response to NVIDIA RTX launch and AMD's lack of having a bigger chip to compete.
#71
las
FordGT90Concept wrote (post 4097617):
2070 Super also has 3.3 billion more transistors. 5700 XT is being pushed to the limit while 2070 Super isn't, so 5700 XT ends up using more power. 5700 is closer to where nominal Navi 10 perf/watt. 5700 XT is a direct response to NVIDIA RTX launch and AMD's lack of having a bigger chip to compete.
And considering Nvidia uses 12nm things look bad for AMD GPU's... It won't be pretty when Ampere launches at Samsung 7nm EUV or better.

I hope AMD is right about the "Nvidia Killer" they are working on... I believe it when I see it.. Would be awesome.
#72
Vya Domus
Vayra86 wrote (post 4097615):
Cool story but the GN Sapphire review proves you wrong. AMD just designed a shit cooler for the heat Navi produces at stock, end of story. Take note of the memory IC temp as well. Red line.



Nuff said, I would say.

You're not wrong about thermal density, its been a problem starting with Ivy Bridge's 22nm, I vividly remember Tomshardware making remarks on it as an explanation for the crappy heat transfer off die. Yet everyone insisted in complaining about shitty TIM instead. We know better now that Intel solders its high end range and still reaches boiling point.
How am I wrong and about what ? I didn't link this in away with how shitty AMD's coolers might be, I said this is a problem that can arise irrespective of cooling. Radeon 7 is proof of that where you wouldn't call it's cooler shitty but it still goes over 100C. And you can also find this on GN where they found out that screwing around with the cooler didn't really make a difference with respects to the Tjunction temperature which still went above 100C.

Vayra86 wrote (post 4097615):
We know better now that Intel solders its high end range and still reaches boiling point.
That also validates my point that this can happen no matter how good the cooling is.
#73
Vayra86
Vya Domus wrote (post 4097631):
How am I wrong and about what ? I didn't link this in away with how shitty AMD's coolers might be, I said this is a problem that can arise irrespective of cooling. Radeon 7 is proof of that where you wouldn't call it's cooler shitty but it still goes over 100C. And you can also find this on GN where they found out that screwing around with the cooler didn't really make a difference with respects to the Tjunction temperature which still went above 100C.



That also validates my point that this can happen no matter how good the cooling is.
Radeon VII is a big die and it is only delivered with... an AMD stock cooler. Common denominator I think looks pretty clear... The simple fact AIBs can get mid- and high TDP cards to temps as much as 15 C lower simply tells us the truth. Another writing on the wall is every Nvidia card from Maxwell onwards. Even their NVTTM shroud doesn't hit this temp, even as it throttles. Its simply not pushed as far. And for Turing we noticed the blower was suddenly gone in favor of more direct cooling.

Also simply look at and compare vcore. Nvidia readily drops vcore as it reaches higher temps, AMD is much more liberal with that. And when you hit throttle point (84C on an Nvidia card and dropping boost biins won't suffice), you get bumped back rigorously, with vcore dropping to below 0,9V.

Vya Domus wrote (post 4097631):
That also validates my point that this can happen no matter how good the cooling is.
Yes but one does not exclude the other, and you cán run a 9900K in spec at stock and even a little beyond that without needing custom water. Why do you think Intel doesn't deliver a boxed cooler?
#74
R0H1T
Vayra86 wrote (post 4097615):
They call that a loser's strategy, begging for people to keep coming to the rescue. A winner's strategy is what AMD does for CPU right now. They know how it works. Only the hardened AMD fanbase seems to have trouble grasping that.
So you're pretending that only good products get to be winners & all bad products or companies lose (customers) :rolleyes:

Must've missed the P4, Atoms or various Nvidia GPUs then, brand name & market position are just as important if not more than the actual product in many cases!
#75
Vya Domus
Vega isn't much bigger, we are talking 330 mm^2 vs 250 mm^2 and keep in mind Radeon 7 has some shaders disabled. In the end they're pretty close. But that doesn't even matter, the transistor density is pretty much the same.

As someone else said before Nvidia does not expose these hotspot temperatures so we can't compare them and know with certainty that Nvidia does deal with this as well.