Tuesday, August 13th 2019

110°C Hotspot Temps "Expected and Within Spec", AMD on RX 5700-Series Thermals

AMD this Monday in a blog post demystified the boosting algorithm and thermal management of its new Radeon RX 5700 series "Navi" graphics cards. These cards are beginning to appear in custom designs from AMD's board partners, but for over a month after their 7th July launch they were available only as reference designs. The thermal behavior of these cards spooked many early adopters accustomed to seeing temperatures below 85 °C on competing NVIDIA graphics cards, with the Radeon RX 5700 XT posting GPU "hotspot" temperatures well above 100 °C, regularly hitting 110 °C, and sometimes even touching 113 °C with stress-testing applications such as FurMark. In its blog post, AMD stated that 110 °C hotspot temperatures under "typical gaming usage" are "expected and within spec."

AMD also elaborated on what constitutes "GPU Hotspot," aka "junction temperature." The "Navi 10" GPU is peppered with an array of temperature sensors at different physical locations across the die. The maximum temperature reported by any of these sensors becomes the Hotspot; in that sense, the Hotspot isn't a fixed location in the GPU. Legacy "GPU temperature" measurements on past generations of AMD GPUs relied on a thermal diode at a fixed location on the die, one AMD predicted would become the hottest under load. Starting with "Polaris" and "Vega," AMD moved toward picking the hottest value from a network of diodes spread across the GPU and reporting it as the Hotspot.
On Hotspot, AMD writes: "Paired with this array of sensors is the ability to identify the 'hotspot' across the GPU die. Instead of setting a conservative, 'worst case' throttling temperature for the entire die, the Radeon RX 5700 series GPUs will continue to opportunistically and aggressively ramp clocks until any one of the many available sensors hits the 'hotspot' or 'Junction' temperature of 110 degrees Celsius. Operating at up to 110C Junction Temperature during typical gaming usage is expected and within spec. This enables the Radeon RX 5700 series GPUs to offer much higher performance and clocks out of the box, while maintaining acoustic and reliability targets."
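The scheme AMD describes boils down to taking the maximum over many sensors and boosting until any one of them hits the limit. A minimal sketch, assuming hypothetical sensor readings and an invented 25 MHz boost step; only the 110 °C limit is taken from AMD's post:

```python
# Hedged sketch of hotspot-based boosting as AMD describes it. The sensor
# values and boost step are illustrative; only the 110 C junction limit
# comes from AMD's blog post.

HOTSPOT_LIMIT_C = 110  # AMD's stated junction temperature limit

def hotspot(sensor_temps):
    # The "hotspot" is not a fixed location: it is whichever on-die sensor
    # currently reports the highest temperature.
    return max(sensor_temps)

def next_clock(current_mhz, sensor_temps, step_mhz=25):
    # Opportunistically ramp clocks until any single sensor hits the limit,
    # then back off.
    if hotspot(sensor_temps) >= HOTSPOT_LIMIT_C:
        return current_mhz - step_mhz
    return current_mhz + step_mhz

print(hotspot([92, 101, 110, 87]))           # -> 110
print(next_clock(1900, [92, 101, 110, 87]))  # -> 1875 (throttle)
print(next_clock(1900, [90, 95, 99, 87]))    # -> 1925 (keep boosting)
```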

AMD also commented on the significantly increased granularity of clock speeds that improves the GPU's power management. The company transitioned from fixed DPM states to a highly fine-grained clock-speed management system that takes load, temperatures, and power into account to push out the highest possible clock speeds for each component. "Starting with the AMD Radeon VII, and further optimized and refined with the Radeon RX 5700 series GPUs, AMD has implemented a much more granular 'fine grain DPM' mechanism vs. the fixed, discrete DPM states on previous Radeon RX GPUs. Instead of the small number of fixed DPM states, the Radeon RX 5700 series GPU have hundreds of Vf 'states' between the bookends of the idle clock and the theoretical 'Fmax' frequency defined for each GPU SKU. This more granular and responsive approach to managing GPU Vf states is further paired with a more sophisticated Adaptive Voltage Frequency Scaling (AVFS) architecture on the Radeon RX 5700 series GPUs," the blog post reads. Source: AMD
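The difference between a handful of fixed DPM states and hundreds of fine-grained V/f states can be shown with a toy model. Everything numeric below (the state list, clocks, and 256-state count) is invented for illustration; only the idea of many states between idle and Fmax comes from AMD's description:

```python
# Toy contrast between fixed DPM states and a fine-grained clock ladder.
# All numbers are illustrative, not AMD's actual tables.

def fixed_dpm(target_mhz, states=(300, 800, 1200, 1600, 1850)):
    # Old approach: snap down to the nearest discrete DPM state.
    return max(s for s in states if s <= target_mhz)

def fine_grain_dpm(target_mhz, idle_mhz=300, fmax_mhz=1905, n_states=256):
    # Navi-style: hundreds of evenly spaced states between idle and Fmax,
    # so the delivered clock tracks the target closely.
    step = (fmax_mhz - idle_mhz) / (n_states - 1)
    clamped = min(max(target_mhz, idle_mhz), fmax_mhz)
    return round(idle_mhz + round((clamped - idle_mhz) / step) * step)

print(fixed_dpm(1700))       # -> 1600 (loses 100 MHz to quantization)
print(fine_grain_dpm(1700))  # -> 1697 (within a few MHz of the target)
```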

141 Comments on 110°C Hotspot Temps "Expected and Within Spec", AMD on RX 5700-Series Thermals

#26
R0H1T
Vayra86 said:
As long as companies are not continuously delivering perfect releases, we have reason to question everything out of the ordinary, and 110C on the die is a pretty high temp for silicon and the components around it aren't a fan of it either. It will definitely not improve the longevity of this chip, over, say, a random Nvidia chip doing 80C all the time. You can twist and turn that however you like but we are talking about the same materials doing the same sort of work. And physics don't listen to marketing.
Are you conflating engineering with designing chips? While it's arguably true that Intel, Nvidia make better (engineered) chips than AMD, it's certainly not because AMD is incompetent. It could be a myriad of factors beyond their control, like uarch & of course the node which they've chosen. An argument could be made if you had 2 chips made on the exact same node, even then it would boil down to the uarch ~ which isn't as simple as fixing your home.
Xuper said:
The level of Noob/troll in this topic is unbelievable.....that's why I'm more active in anandtech forum.
Interesting, were you there like 3 years back (before Zen) when nearly every AMD supporter was labelled a shill or fanboi? Come 2019 & the IDF has taken an indefinite hiatus, not unlike Hunter X Hunter :laugh:
#27
cucker tarlson
my gas stove can reach 300 Celsius and it's fine, you guys are spreading FUD
#28
Vayra86
R0H1T said:
Are you conflating engineering with designing chips? While it's arguably true that Intel, Nvidia make better (engineered) chips than AMD, it's certainly not because AMD is incompetent. It could be a myriad of factors beyond their control, like uarch & of course the node which they've chosen. An argument could be made if you had 2 chips made on the exact same node, even then it would boil down to the uarch which isn't as simple as fixing your home.
No, I don't conflate anything. I'm a consumer buying a product and I've got a pretty good sense of what's quality and what's questionable. Experience, if you will... whether they designed it wrong or whether it was a bad batch or an unlucky combination of circumstances is entirely not my concern. Neither is having to do all sorts of tweaking to get a product to work as intended or 'comfortably' - this is the reason I still can't see myself buying an AMD GPU these days. Unfortunately - I might add. I'm just not seeing the dedication I'd want and require of a GPU vendor. Because it goes a lot further than the GPU: this is also about continued support, legacy support, how well older APIs and exotic applications work, etc. AMD is doing the bare minimum and it shows. Every time, in everything they do. It's always late, not quite perfect, or a promise they still need to deliver upon.

The misguided idea that 'because a company engineered and released it' it must be okay has been proven numerous times to be just that - misguided. Never underestimate what the pressure of commercial targets and shareholders will mean for end users.
R0H1T said:
Interesting, were you there like 3 years back (before Zen) when nearly every AMD supporter was labelled a shill or fanboi? Come 2019 & the IDF has taken an indefinite hiatus, not unlike Hunter X Hunter :laugh:
Haha indeed lol. Anandtech comment section still isn't pretty btw.
#29
Zubasa
Jism said:
As for your fancy heat story, VRMs are designed to withstand 110 degrees operating temperature. It's not really the VRMs that suffer but more things like the capacitors sitting right next to them. They have an estimated lifespan based on thermals: the hotter they run, the shorter their MTBF basically is. I wouldn't recommend playing on a card with a 100-degree VRM where GDDR chips are right next to it either, but it works, and there are cards that last many, many years before giving their last frame ever.
A point to add on this. Even on the reference cards the VRMs are not reaching anywhere near 110C; they are around a modest 78C.
Also, the most common points of failure are the solder joints or the capacitors of the VRMs.
The GPU and memory ICs themselves are rarely the first to fail unless they have been overclocked heavily / subjected to extremely high voltage.
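The temperature-to-lifetime link for the capacitors discussed above is often approximated with the rule of thumb that electrolytic capacitor life roughly doubles for every 10 °C below the rated temperature. A rough sketch, assuming a generic datasheet-style 2,000 h @ 105 °C rating (not a measured value for any specific card):

```python
# Rule-of-thumb (Arrhenius-style) capacitor life estimate: life roughly
# doubles for every 10 C below the rated temperature. The 2,000 h @ 105 C
# rating is a generic datasheet-style assumption, not from any real card.

def cap_life_hours(rated_hours, rated_temp_c, actual_temp_c):
    return rated_hours * 2 ** ((rated_temp_c - actual_temp_c) / 10)

# At the ~78 C VRM-area temperature quoted for the reference cards:
print(round(cap_life_hours(2000, 105, 78)))  # -> about 13,000 hours
```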
#30
R0H1T
Vayra86 said:
I'm a consumer buying a product and I've got a pretty good sense of what's quality and what's questionable. Experience, if you will... whether they designed it wrong or whether it was a bad batch or an unlucky combination of circumstances is entirely not my concern. Neither is having to do all sorts of tweaking to get a product to work as intended or 'comfortably' - this is the reason I still can't see myself buying an AMD GPU these days. Unfortunately - I might add. I'm just not seeing the dedication I'd want and require of a GPU vendor. Because it goes a lot further than the GPU: this is also about continued support, legacy support, how well older APIs and exotic applications work, etc.
Well then as consumers, not just you per se, how about supporting AMD with more $, especially when they release a competitive (perf/$) GPU? I see many forum dwellers complain about AMD not doing enough against Intel or Nvidia, then they go on & spend 200~400$ more for 5~25% more performance. How do you suppose AMD will make money to then make better products ~ magic? AMD has perennially been the budget brand, even when they were superior to Nvidia & Intel, except for a brief period with FX chips last decade! Even now people complain about gaming as if people buy 2000$ rigs just for that! If you wanna change the world you have to start with yourself; this applies in every walk of life, not just what we're talking about ~ short term pain vs long term gain.
#31
Jism
cucker tarlson said:
rubbing eyes

so how many ppl are still running 7970s/r9 2xx cards around here, which are 6-8 years old.
Uh yeah, so the product does work then, and the failure rate isn't as bad as it's being made to sound in this thread.

Vayra86 said:
You can compare resale value of Nvidia vs AMD cards over the last five to seven years and you'll understand my point. It's almost an Apple vs Android comparison: AMD cards lose value much faster, and this is the reason they do. It's too easy to chalk that up to 'branding' alone.
Yes, when the mining craze was going on, AMD cards were always favored over Nvidia. But I ain't going to buy a dated, 3-year-old used card. I never buy used cards; I'm kind of done with that, to be honest. Everything in my system is new when I upgrade.

Zubasa said:
A point to add on this. Even on the reference cards the VRMs are not reaching anywhere near 110C; they are around a modest 78C.
Also, the most common points of failure are the solder joints or the capacitors of the VRMs.
The GPU and memory ICs themselves are rarely the first to fail unless they have been overclocked heavily / subjected to high voltage.

It's really old news, this. I feel like a lot of websites, channels and all that are rehashing old news that was on the net before. Really. VRMs CAN sustain 110 degrees and will still run perfectly fine. Here: https://www.techpowerup.com/review/amd-ryzen-9-3900x-tested-on-cheap-b350-motherboard/3.html

A 50$ motherboard combined with a high-end, 12-core and even overclocked CPU. It runs. And it will probably run for another year or so if the build quality is just right.

The reason why it runs, and what you see nowhere, is that AMD requires this of mobo vendors. It doesn't want an FX era all over again, where certain boards throttled with a 125W CPU.
#32
cucker tarlson
Jism said:
Uh yeah, so the product does work then, and the failure rate isn't as bad as it's being made to sound in this thread.
what ?
#33
Vayra86
Jism said:
Uh yeah, so the product does work then, and the failure rate isn't as bad as it's being made to sound in this thread.

Yes, when the mining craze was going on, AMD cards were always favored over Nvidia. But I ain't going to buy a dated, 3-year-old used card. I never buy used cards; I'm kind of done with that, to be honest. Everything in my system is new when I upgrade.
Yes, you have already explained why: because the stuff you buy is not going to last longer anyway, so why would you expect that from a second-hand purchase.

Meanwhile, I get about 150-200 EUR returned on every GPU upgrade, which allows me to buy into the same or a higher tier without 'spending more' than I did on the previous card. Every time. I've made about 1200 EUR on GPU sales for personal use. You enjoy your 3-year cards, to each his own; it's good these furnaces still have a market, I guess.

Do check out that GN review of the Sapphire though, it nicely underlines the point, even memory ICs get to boiling point which is definitely not where you want them. I vividly remember the EVGA GTX 1070 FTW - another one of those cards 'that was just fine' until EVGA deemed it necessary to supply thermal pads after all and revise their product line and shroud entirely.

Anyway, non issue because it was already clear that you had to stay far away from the reference designs.

R0H1T said:
how about supporting AMD
That's not how commerce works, that is how charity works. And not a single charity exists to solve problems, but rather to preserve them to cash in even more.

If AMD can't compete, we need a new player. I hear Intel is working on something. And if AMD's GPU business falls flat (which it will eventually if they keep at it like this), someone will buy the IP and take over the helm. I'm not worried and I don't root for multinationals.
#34
Marecki_CLF
Hello Everyone,

My $0.02:
Please find below a GPU-Z screenshot taken after Fire Strike Extreme Stress Test run on my reference (Sapphire) RX5700XT.
The card is set in Wattman to boost up to 1980MHz at 1006mV (with GDDR6 mildly OCed from 875MHz to 900MHz). It runs with these settings just fine. Performance is very satisfactory (I have a 2560x1440 144Hz display), temps are in check, fan noise is barely audible. Just so you know, I have a mATX case, it does not have very good airflow.
I have no idea why the RX5700XT runs by default at 1203mV. IMHO this is very high and is the culprit behind reference cards running hot, loud and being power hungry. From what I've seen so far, all reference cards can be undervolted by a huge margin, which resolves all heat/noise/power consumption issues.
For the sake of comparison, reference RX5700 (non-XT) runs at 1025mV.
Food for thought.
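The undervolting numbers above can be put into rough perspective: dynamic switching power scales approximately with f·V², so at the same clock the voltage drop alone cuts dynamic power by about 30%. A back-of-envelope sketch using the poster's figures; this ignores leakage and assumes the simple CV²f model, so treat it as an estimate only:

```python
# Back-of-envelope: dynamic power ~ f * V^2. At the same clock, the ratio
# of powers is just the square of the voltage ratio. Uses the poster's
# stock (1203 mV) and undervolted (1006 mV) figures.

stock_mv, undervolt_mv = 1203, 1006
ratio = (undervolt_mv / stock_mv) ** 2
print(f"dynamic power ratio: {ratio:.2f}")  # -> 0.70, i.e. ~30% less
```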

#35
Jism
The reason why they set the voltages higher than usual, or than what seems to be the sweet spot for those chips, is binning. A higher voltage allows more usable chips to be extracted from a single wafer. So you might be lucky and have a chip with better density that requires an overall lower voltage compared to the rest. But it's always within silicon spec; they won't release a GPU that runs beyond what it's capable of and outside what is considered the safe zone.

My RX590 goes from 1150mv back to 1110mv. It can do 1090mv, but at the cost of Radeon ReLive crashing. So I stick to 1110mv with a core of around 1450mhz, which is still very good.
#36
cucker tarlson
UV requires binning just as much as OC does. I've seen Vega users saying their cards crash if they so much as touch undervolting, yet it's commonplace to see people say all Radeons can undervolt substantially. Well, if they could, there'd be no reason for AMD to set a higher voltage in the first place.
I'd rather take a shot at overclocking a card that runs great out of the box than undervolting one that needs it badly.
#37
lZKoce
Vayra86 said:
The misguided idea that 'because a company engineered and released it' it must be okay has been proven numerous times to be just that - misguided. Never underestimate what the pressure of commercial targets and shareholders will mean for end users.
+1, you don't want to know what happens in automotive industry :D....just ask Ford about transmissions on Fiestas....or Volvo about that plastic piece on the fuel lines.... :P
#38
Vya Domus
Please refrain from talking about things that you simply do not know anything about. I don't expect TPU to be brimming with academics, but not complete ignorance either.

Dies do not have uniform thermals across their surface, and certain spots, such as where the FPUs sit, can indeed reach well over 100C. This has gotten worse over the years as the thermal density of chips keeps rising; you can have TDPs ranging from 10W to 1000W and these hotspots will not go away. AMD being on 7nm, again, makes this worse.

Let's spell it out in the simplest of terms so that everyone gets it :

You have two dies, each uses 100W and each benefits from the same amount of cooling, but one of them is half the size. This inevitably means it will have a higher thermal density and will run at higher temperatures; there is no way around it. This of course is taken into account when these things are designed, but you can only minimize the effect so much.

Again, this is about thermal density - not TDP, not cooling, not architecture - and you can't really do anything about it. Do not believe for a second that Nvidia, Intel or anyone else isn't dealing with this. This hot-and-power-hungry meme should die; it has run its course. Now you just look like you don't have a damn clue what you're talking about.
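The two-die example in numbers, under the simplifying assumption that hotspot temperature rise scales with power density; the die areas are round illustrative figures, not real parts:

```python
# Same 100 W into half the area doubles the power density (W/mm^2), and
# with equal cooling roughly doubles the temperature rise over ambient.
# Areas are illustrative round numbers.

def power_density(watts, area_mm2):
    return watts / area_mm2

big_die = power_density(100, 500)    # 0.2 W/mm^2
small_die = power_density(100, 250)  # 0.4 W/mm^2
print(small_die / big_die)  # -> 2.0
```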
#39
Dave65
Not that it is our job to fix factory-built goods, but the washer mod, new thermal pads and some Kryonaut do wonders for these cards. My neighbor said the stock thermal pads are the cheapest, low-cost garbage you can put on hot components.
If you go down one size on the memory thermal pads, from 1.5mm to 1mm, it closes the gap between the die and cooler. It really does work.
#40
Midland Dog
las said:
Most Nvidia cards are cool and quiet for a reason, lower temps overall.
less heat = less leakage = more efficiency
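The leakage half of that equation can be made concrete with the common rule of thumb that transistor leakage roughly doubles every ~10 °C; the doubling interval here is a generic approximation, not a measured figure for any GPU:

```python
# Rough illustration: leakage current rises roughly exponentially with
# temperature. The ~10 C doubling interval is a generic rule of thumb,
# not a measured value for any particular chip.

def leakage_ratio(hot_c, cool_c, doubling_c=10):
    return 2 ** ((hot_c - cool_c) / doubling_c)

print(leakage_ratio(110, 80))  # -> 8.0 (about 8x more leakage at 110 C)
```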
#41
las
ZoneDymo said:
and it's 100 - 150 dollars cheaper.... so why are you comparing the two?
If anything you should compare it to the RTX2060 Super (like in your link... was the 2070 a typo?) and then the 5700XT is overall the better option.
Custom vs custom, the 2060 Super and 5700 XT perform pretty much the same. Not sure why you think the 5700 XT is the overall better option; that entirely depends on the games played. On average the 2060 Super overclocks better than the 5700 XT. It looks like the 5700 XT has next to zero OC headroom, just like AMD's CPUs.

AMD officially said they will max all their chips, instead of leaving some in the tank for the "few percent" that overclock. Looks like this holds true when looking at Ryzen and the 5700 XT custom cards. 2-3% performance gained with max OC. The Asus Strix gained 0.7%...

londiste said:
5700XT is cheaper.
Not really... You can get a reference 5700 XT for 10 bucks less than a custom 2060 Super... You need hearing protection with the 5700 XT ref tho

You get Control and Wolfenstein with every 2060 Super, which can easily be sold.
#42
londiste
las said:
Custom vs custom, the 2060 Super and 5700 XT perform pretty much the same. Not sure why you think the 5700 XT is the overall better option.
I wanted to write that RX 5700 XT is cheaper but it turns out right now RTX 2060 Super has a slight edge in prices.
#43
las
londiste said:
I wanted to write that RX 5700 XT is cheaper but it turns out right now RTX 2060 Super has a slight edge in prices.
A friend of mine bought a custom 2060 Super for $399 with free delivery and sold the game keys for 50 bucks.
#44
Dave65
las said:
A friend of mine bought a custom 2060 Super for $399 with free delivery and sold the game keys for 50 bucks.
It's always 'a friend' :roll:
#45
cucker tarlson
Dave65 said:
It's always 'a friend' :roll:
isn't that what they cost?

cheapest non-reference 5700 XT here is 2200 PLN for the Pulse; a 2060 S is 1800 PLN for a Zotac/PNY/Gainward dual-fan, plus 2 games worth 300 PLN total
#46
theoneandonlymrk
Vayra86 said:
No you misunderstand, none of this is true and everybody does this, you just never saw it because AMD is the only one doing temp sensors right...

:roll::roll::roll::roll::roll::roll::roll::roll::roll::roll:
Seriously people.
No, surely not... no yeah, you're spot on. Temp offset has been a thing for ages (10 years). Now just imagine the real temp of that 9900K die T-junction running at 5GHz, eh? 85-100, yeah right.
#47
cucker tarlson
theoneandonlymrk said:
No, surely not... no yeah, you're spot on. Temp offset has been a thing for ages (10 years). Now just imagine the real temp of that 9900K die T-junction running at 5GHz, eh? 85-100, yeah right.
oh look,a squirrel!
#48
R0H1T
Vayra86 said:
That's not how commerce works, that is how charity works. And not a single charity exists to solve problems, but rather to preserve them to cash in even more.
That's exactly how things work, & there's nothing about charity in my post. You have a choice between the 3700x, 3800x & 9900k for, let's say, gaming. You chose 5~15 fps for ~150$ more, so the next time you don't get to ask why AMD still can't match Intel's clocks or gaming performance! Likewise, when you want things cheap, you don't get to complain that your jobs are shipped overseas. This is how things work & always will in a profit "driven" world :toast:
#49
lynx29
I won't have to worry about this with my 3-fan 5700 XT arriving Friday; it should run around 2100 core, matching the 2070 Super in most games across the board (if the Asus 3-fan version review on Guru3D is anything to go by).

Also, that Asus 3 fan 5700 XT comes within 10 fps of 2080 SUPER on a few games, Sekiro at 1440p being one. My gigabyte 3 fan should do the same, not bad for $420. :)
#50
Axaion
Yeah, no thanks AMD, I don't wish to have hearing damage because of your poor cooler design.