
@Devs What is "GPU Temperature (Hot Spot)" on RX Vega?

Status
Not open for further replies.
Joined
Sep 7, 2017
Messages
5 (0.00/day)
Location
Austria
Processor i5-4670K @ 4,5Ghz
Motherboard Asus Z87-Pro
Cooling Custom Watercooling
Memory 16GB DDR3-1600 CL8
Video Card(s) Sapphire RX Vega 56@EKWB
Storage Samsung SSD 840 240GB + Samsung SSD 840EVO 500GB + 2x Crucial MX300 525GB Raid0
Display(s) AOC G2460PF
Case Corsair Obsidian 750D
Audio Device(s) Soundblaster Zx
Power Supply Corsair HX850i
Mouse Logitech G502
Keyboard Victsing Mechanical
Software Windows 10 Pro
I guess the title already says it. I would like to know exactly which sensor that data is being pulled from.
 
I directed my question directly to the devs in order to avoid getting distracting answers. I know you meant nothing but good, but to be honest, pointing me to Guru3D didn't help at all, as this is a question only the programmer of GPU-Z can answer. He is the one who picked the sensor with his own code, after all.
 
It's a sensor inside the GPU silicon. Probably (going by the name) at a location where it gets hottest. That's all I know.

You also asked about HBM temperature location in a post that's now deleted due to cleanup. No idea again. The card gives me a sensor "HBM temperature"; that's all I know.
 
He's not the only one who knows what it means. It was implemented by AMD. All he did was make it so his program shows the data reported by it. Frankly, as he's not a GPU designer/engineer, I'd be pretty surprised if he knew anything at all about it. It was rather stupid in my view for the "who-the-hell-ever-he-is" guy over at AMD to suggest you ask the dev of GPU-Z to explain how/what/where/why AMD designs the temperature sensors on their GPUs. And rather smart for you to reply "But isn't this temperature value something that is provided by the card's firmware?". Since it's pretty obvious that's the case.

He just replied as I was typing this. Good to know I wasn't wrong about that.
 
Thanks for your replies then. I was asking that particular person on the AMD forums as he's the only one from the staff I know who at least answers when mentioned in a post. I guess I'll just nag them until they provide an answer lol.
 
He's not the only one who knows what it means. It was implemented by AMD. All he did was make it so his program shows the data reported by it. Frankly, as he's not a GPU designer/engineer, I'd be pretty surprised if he knew anything at all about it. It was rather stupid in my view for the "who-the-hell-ever-he-is" guy over at AMD to suggest you ask the dev of GPU-Z to explain how/what/where/why AMD designs the temperature sensors on their GPUs. And rather smart for you to reply "But isn't this temperature value something that is provided by the card's firmware?". Since it's pretty obvious that's the case.

He just replied as I was typing this. Good to know I wasn't wrong about that.
I believe W1zzard knows more than you think. I believe he used to work for ATI back in the day. ;)
 
Oh, I know he knows his stuff. And most certainly a lot more than I know. I was just guessing he wouldn't know about this particular feature on Vega. Which, to the best of my knowledge, is entirely new and only found on Vega. And hasn't been mentioned in any documentation (that I've seen). So unless he actually worked on it (which I figured I'd have heard about if he did)... it just didn't seem likely he'd know any more than the rest of us. Which is nothing... yet.
 
he wouldn't know about this particular feature on Vega. Which, to the best of my knowledge, is entirely new and only found on Vega.
which feature?
 
AMD GCN silicon gets hot very quickly, and in my experience with most generations of AMD GPUs, from 5xxx all the way up to RX 4xx and 5xx, max temps go all the way up to 90C if the card is not in a properly ventilated case at a reasonable room temperature.
Most well-cooled AMD cards rely on ambient temperature, a good case, and added fans to stay at good temps.
Mine doesn't go above 70C with 4 extra fans in the case and air conditioning keeping the room at a cool 23C in summer. In winter there is no need for air conditioning; I just open a window.
Tj max would be 100C in my experience, but optimal temps would be around 65 - 75C.
 
It's a sensor inside the GPU silicon. Probably (going by the name) at a location where it gets hottest. That's all I know.

You also asked about HBM temperature location in a post that's now deleted due to cleanup. No idea again. The card gives me a sensor "HBM temperature"; that's all I know.
Sounds a lot like something I said Bossman Ty.
 
which feature?
The GPU hot spot temp sensor...I guess? Which may or may not qualify as a "feature". I might have worded that poorly. As well as everything else I've said in this thread. I probably should have just kept my mouth shut. :oops:

I am sort of curious about it though. My question at this point is, is it only found on Vega? I noticed yesterday while using Polaris Bios Editor that there's a "Hotspot Temp (C)" value under POWERTUNE for Polaris 20, Ellesmere, Baffin, and Lexa. Which makes me think there's got to be a sensor for it on those too.
 
Has been there for a while, exposed just now.
 
Would it theoretically be possible to add support for reading VR_SOC and VR_MEM temps on Vega with GPU-Z?
 
HBM has a built-in thermal sensor, from what I understand from the datasheet. So should there not be two thermal readings, one for each HBM die?
 
Sounds like that would be the same as the memory temperature, no?
 
... how do you expect a random software developer to know how a hardware manufacturer, that they have no relationship with, exposes its hardware's sensor data? We can't smell these things y'know, we're just as dependent as anyone on the hardware company providing documentation on where that sensor data lives in memory, how to access it, and how to interpret it into a number that actually makes sense to an end-user.

I mean, yeah, you could spend hours peeking and poking through various memory locations to guess at this stuff... or you could save yourself a ton of time and effort and just use what the manufacturer provides... I know which one I go with.
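
As an illustration of the "interpret it into a number" step described above, here is a minimal sketch of decoding a raw register word into degrees C. The bit layout (shift, width, scale) and the raw value are made up for the example and are not taken from any AMD documentation.

```python
def decode_temperature(raw, shift=8, width=9, scale=0.5):
    """Extract a bit field from a raw register word and scale it to degrees C.

    shift, width and scale describe a hypothetical layout, for illustration only.
    """
    mask = (1 << width) - 1          # e.g. width=9 gives 0x1FF
    field = (raw >> shift) & mask    # isolate the temperature field
    return field * scale


raw_word = 0x0000C200  # hypothetical raw value read from a sensor register
print(decode_temperature(raw_word))
```

Without the vendor's documentation for the real layout, every one of those three parameters would have to be guessed, which is the point the post is making.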
 
Sorry, this is off topic, but I replied to this thread early on, around post two or so, and my reply is gone. No insult, just info: please find a better path, because editing out my help will stop me helping.
As it stands it's out-and-out offensive. I told the OP exactly what it was, even before W1zzard. I have a Vega and I know that stuff.
Required a dev tut.

Seeing the thread here today, I thought it was new, since it's been cut and edited.
 
or you could save yourself a ton of time and effort and just use what the manufacturer provides

Unless you've seen something different, the only thing I've seen them (AMD) provide is the GPU Core temp. It's up to 3rd parties (e.g. GPU-Z) to read and display other sensor info that's been exposed.

The OP's question was about the meaning of the "hot spot" temp sensor... and it sounds like AMD hasn't given much info on the significance of that value, or where the sensor is located.
 
The sensor? It's all the temp sensors; it's the hottest spot. Vega is built on Infinity Fabric, which is a bus and control network that includes sensors, and each chip has its own additional sensors, but the hot spot in such chip terms is the hottest spot.
And it is king of the thermal throttle hill, so to speak, as I previously said, ish.
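
The description above can be sketched as a "hot spot" value that is simply the maximum across several on-die sensors. This is a minimal sketch only: the sensor names and readings are hypothetical, and nothing in this thread confirms that Vega computes its hot spot this way.

```python
def hot_spot(readings):
    """Return (sensor_name, temperature) for the hottest sensor."""
    if not readings:
        raise ValueError("no sensor readings supplied")
    name = max(readings, key=readings.get)
    return name, readings[name]


# Hypothetical per-sensor temperatures in degrees C.
example_readings = {"sensor_a": 71.0, "sensor_b": 75.0, "sensor_c": 97.0}
hottest_name, hottest_temp = hot_spot(example_readings)
print(f"hot spot: {hottest_temp:.0f}C at {hottest_name}")
```

Under that assumption the hot spot can never read lower than any individual sensor, which would fit it sitting well above the core temperature.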
 
And is king of the thermal throttle hill so to speak

Does the hot spot factor into thermal throttling? It doesn't seem to, if I'm hitting 97C with mine. The GPU throttle is set at 85C, and I think the mem throttle is at 85C also. Overclocked, my hot spot was 97C, core was 75C, and mem was 85C. That was at a core speed topping out at 1733 MHz and mem at 1050 MHz (according to GPU-Z). Core was undervolted to 1050 mV for both P6 and P7.
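
The comparison in this post can be written out as a small check. The readings and limits are the ones quoted above; the helper name, the strict "greater than" comparison, and the idea of applying the 85C GPU limit to the hot spot are assumptions made only for illustration.

```python
def over_limit(readings, limits):
    """Return {sensor: (reading, limit)} for sensors strictly above a known limit."""
    return {
        name: (temp, limits[name])
        for name, temp in readings.items()
        if name in limits and temp > limits[name]
    }


# Values quoted in the post, in degrees C.
readings = {"core": 75, "memory": 85, "hot_spot": 97}
limits = {"core": 85, "memory": 85}  # the post states no limit for the hot spot

# With only the stated limits, nothing is above its limit.
print(over_limit(readings, limits))

# Hypothetical: if the 85C GPU limit also applied to the hot spot,
# the hot spot reading would be flagged.
limits_if_shared = {**limits, "hot_spot": 85}
print(over_limit(readings, limits_if_shared))
```

If the card were throttling on the hot spot against the same 85C limit, a 97C reading should have triggered it, which is why the poster doubts that it factors in.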
 
Sounds like that would be the same as the memory temperature, no?

Have you read the HBM memory PDF? Or have I misunderstood something? It does not look right if each HBM die has its own built-in thermal sensor.

I would expect something like HBM thermal 0 & HBM thermal 1 (example), but there is just one HBM temperature reading to cover both dies.

What if one of the HBM dies was making poor contact? How would you know which one?

HBM1 also has a built-in thermal sensor, if my memory serves me well.

If I am missing or have misunderstood something, can someone please post more technical details?
 
Have you read the HBM memory PDF? Or have I misunderstood something? It does not look right if each HBM die has its own built-in thermal sensor.

No, I haven't read the data sheets. Yes, as far as I know, there are two HBM chips on the GPU chip, I assume they have a sensor only on one, or only expose information for one.

On another note, my Vega 64 started throwing out some weird readings:

[Attached image: 1522189368981.png]
 
You can't have a sensor on just one HBM; both should be connected. You have separate dies, and for safety/monitoring each HBM die has its own thermal features. Take a glance at the JEDEC PDF docs; that's what I did.

If one HBM is overheating, how would you know this is happening if it is taking a reading from the other? Your CPU has a thermal reading for each core, and HBM is no different. The ability to monitor each die is important. The Fiji chip has four HBM stacks, so you should be seeing Thermal 0 to Thermal 3.

You can't just have one thermal read-out for all HBM dies when connected to the main Vega/Fiji die; that's not how things are done.
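
The argument for per-stack readings can be sketched as follows. The stack names follow the "Thermal 0 to Thermal 3" example above, while the temperatures, the 85C limit, and the helper names are hypothetical.

```python
def aggregate(stack_temps):
    """Collapse per-stack temperatures into a single reading (here: the maximum)."""
    return max(stack_temps.values())


def stacks_over(stack_temps, limit):
    """Return the names of the stacks whose temperature exceeds the limit."""
    return sorted(name for name, temp in stack_temps.items() if temp > limit)


# Hypothetical values in degrees C for a four-stack part.
stack_temps = {"thermal_0": 62, "thermal_1": 64, "thermal_2": 91, "thermal_3": 63}

single_stack_reading = stack_temps["thermal_0"]  # reading taken from one stack only
print(single_stack_reading)           # looks fine, hides the hot stack
print(aggregate(stack_temps))         # shows a problem, but not which stack
print(stacks_over(stack_temps, 85))   # identifies the offending stack
```

A reading taken from one stack can miss an overheating neighbour entirely, and a single aggregated value shows that something is hot without saying which stack. That is the concern raised in this post.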
 