
Can one trust GPU-Z ‘GPU hot spot’ and ‘memory junction’ figures?

4EvrYng

New Member
Joined
Apr 12, 2020
Messages
16 (0.01/day)
I have in front of me three EVGA 2080 Super FTW3 Hybrid cards. When I stress test them with Time Spy Extreme, GPU-Z reports that GPU temperature and all ICX temperatures (GPU, memory, and power) are within 2 C across all three, BUT hot spot and memory (junction?) on one of them are reported almost 10 C higher than on the other two (see table below).

Sensor        Card 1   Card 2   Card 3   Max-Min
GPU           53.8 C   54.9 C   55.4 C   1.6 C
GPU hotspot   65.2 C   67.0 C   73.7 C   8.5 C
Memory        64.4 C   66.7 C   73.7 C   9.3 C
ICX GPU 1     53 C     54 C     55 C     2 C
ICX GPU 2     52 C     52 C     52 C     0 C
ICX Mem 1     52 C     51 C     51 C     1 C
ICX Mem 2     56 C     55 C     55 C     1 C
ICX Mem 3     54 C     53 C     53 C     1 C
ICX Pwr 1     45 C     45 C     44 C     1 C
ICX Pwr 2     48 C     47 C     47 C     1 C
ICX Pwr 3     48 C     47 C     47 C     1 C
ICX Pwr 4     48 C     47 C     47 C     1 C
ICX Pwr 5     56 C     54 C     54 C     2 C

I would think that if the GPU and memory on one of them were really running 10 C hotter, that would be reflected on the rest of the sensors, but I don’t see that; I see it only on hot spot and memory junction.
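For anyone who wants to repeat the comparison on their own cards, here is a minimal Python sketch of the Max-Min column. The readings are the ones from the table above; the function names and the 3 C threshold are my own assumptions, picked only for illustration, not a rule from GPU-Z or EVGA.

```python
# Readings per sensor for (Card 1, Card 2, Card 3), in degrees C, from the table.
readings = {
    "GPU": (53.8, 54.9, 55.4),
    "GPU hotspot": (65.2, 67.0, 73.7),
    "Memory": (64.4, 66.7, 73.7),
    "ICX GPU 1": (53, 54, 55),
    "ICX GPU 2": (52, 52, 52),
    "ICX Mem 1": (52, 51, 51),
    "ICX Mem 2": (56, 55, 55),
    "ICX Mem 3": (54, 53, 53),
    "ICX Pwr 1": (45, 45, 44),
    "ICX Pwr 2": (48, 47, 47),
    "ICX Pwr 3": (48, 47, 47),
    "ICX Pwr 4": (48, 47, 47),
    "ICX Pwr 5": (56, 54, 54),
}


def spread(values):
    """Max-Min across cards for one sensor, rounded to 0.1 C."""
    return round(max(values) - min(values), 1)


def flag_outliers(data, threshold_c=3.0):
    """Sensors whose card-to-card spread exceeds threshold_c (assumed value)."""
    return [name for name, values in data.items() if spread(values) > threshold_c]


for name, values in readings.items():
    print(f"{name:12s} spread = {spread(values)} C")
print("Sensors that disagree between cards:", flag_outliers(readings))
```

With these numbers only "GPU hotspot" and "Memory" exceed the threshold, which is the discrepancy I am asking about.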

That made me start researching, and I have come across the following (please correct me if I am wrong):
  1. Allegedly, GPU-Z support for “hot spot” and/or “memory junction” applies only to 30 series cards, and 20 series owners can’t rely on the reported values.
  2. EVGA’s Jacob Freeman states that temperatures from ICX sensors are closer to the actual running temperatures.
  3. Jacob further points people to the thread https://forums.developer.nvidia.com...erature-via-nvidia-smi-or-nvml-api/168346/160 where Nvidia’s moderator states “The memory case temperature is not exposed by any third-party tools authorized by NVIDIA on Windows or Linux. Existing third-party tools appear to be reporting numbers that do not represent the relevant case temperature (Tc) specification and it’s normal for other readings to show higher values.”
That could explain why I can’t find memory junction temperature in HWiNFO64, and it casts doubt on the reliability of the figures GPU-Z reports for those two sensors.

So can one trust GPU-Z ‘GPU hot spot’ and ‘memory junction’ figures for 20 series cards? Can somebody explain why the values I am seeing on card #3 for those two sensors do not fall in line with the trend on the rest of the sensors?
 
Joined
Aug 20, 2007
Messages
20,787 (3.41/day)
System Name Pioneer
Processor Ryzen R9 7950X
Motherboard GIGABYTE Aorus Elite X670 AX
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory 64GB (4x 16GB) G.Skill Flare X5 @ DDR5-6000 CL30
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage 2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64
I dunno re the validity of the figure, but thought I'd point out those temps are fine, regardless.
 
Joined
Mar 26, 2012
Messages
221 (0.05/day)
System Name Mixed Bag of OC
Processor AMD Ryzen 5800X3D
Motherboard Maxsun MS-iCraft B550M WIFI
Cooling CPU+GPU on Water with 3 X 420 Rad´s
Memory 32GB Patriot Viper RGB @ 3800 Mhz CL14
Video Card(s) XFX Merc 310 RX 7900 XTX
Storage 2TB Kingston Fury + 2TB Samsung PCIe 4 NVME
Display(s) Philips 48OLED806
Case Selfmade Huuuuuge *Case* :)
Audio Device(s) ifi Zen DAC + Monoprice M1060C & Burmester Replica AMP + Selfmade Huuuuuge Speakers :)
Power Supply Seasonic PRIME TX-750
Mouse Kensington Slimblade (main device) + Razer Basilisk V3 (for FPS)
Keyboard Sharkoon PureWriter RGB, Kailh Blue switches
VR HMD None
Software Windows 11
Benchmark Scores do not matter, my PC is fast :)
ICX uses external sensor probes, so ICX cannot see the hotspot temps. And yes, the hot spot temp can vary a bit between cards, because it is whichever sensor on the die is hottest that reports the hotspot temp.
As GPU silicon (ASIC) quality is different on each card, there can be differences in which part of the die has the most leakage and gets the hottest.

On the memory side, I am not that sure. I know that in HWiNFO64 the memory junction temp sensor is read directly from the chips themselves, but for GPU-Z I don't know.
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
27,049 (3.71/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
GPU-Z reads hotspot and memory temperature through NVIDIA's own APIs

Maybe one of the memory chips doesn't have optimum contact? I think memory temperature is the highest temperature of all chips

iCX has physical sensors that are placed at specific locations across the board. It doesn't measure every single memory chip.

Do more testing, will be interesting to see your findings
 

4EvrYng

GPU-Z reads hotspot and memory temperature through NVIDIA's own APIs

Are you saying that Nvidia's statement I've linked to is incorrect, please?

Also, I've read that sensors on some of the GPUs Nvidia shipped might not be calibrated and that there is a way to check through the API whether they were. Does GPU-Z check whether sensors were calibrated before reading them out, or does it always assume they were?

Maybe one of the memory chips doesn't have optimum contact? I think memory temperature is the highest temperature of all chips

iCX has physical sensors that are placed at specific locations across the board. It doesn't measure every single memory chip.

The thought did cross my mind. However, if that were the case, wouldn't a 10 C increase in internal temperature result in at least some upward trend on at least one of the ICX sensors, even though they are external?

I've read that it might be possible to get temperatures of all internal GPU sensors through Nvidia's API, not just the hottest one. If so, would it be possible to get that whole list through GPU-Z? Having all of them would give users an idea of whether they have a sub-optimal thermal interface application or whether their thermal interface is getting worse over time.
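For reference, this is roughly how one could log what the public nvidia-smi tool exposes, as a hedged Python sketch. The query fields `temperature.gpu` and `temperature.memory` are documented nvidia-smi fields, but to my knowledge the memory one is only populated on some GPUs and there is no documented field for hot spot, so this does not reproduce what GPU-Z shows. The function names are my own, and treating every non-numeric cell as "not available" is an assumption about how unsupported fields are printed.

```python
import subprocess

# Fields requested from nvidia-smi; unsupported ones come back as text such as N/A.
QUERY_FIELDS = ["index", "temperature.gpu", "temperature.memory"]


def parse_smi_csv(text):
    """Turn 'csv,noheader,nounits' output into one dict per GPU.

    Any cell that is not an integer is stored as None (assumed to mean
    the field is not supported on that GPU).
    """
    rows = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split(",")]
        row = {}
        for name, cell in zip(QUERY_FIELDS, cells):
            try:
                row[name] = int(cell)
            except ValueError:
                row[name] = None
        rows.append(row)
    return rows


def read_temperatures():
    """Run nvidia-smi once and return the parsed readings (needs the NVIDIA driver)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_smi_csv(result.stdout)


# Parsing a made-up sample line, so this part runs without a GPU present.
print(parse_smi_csv("0, 54, N/A\n"))
```

Calling `read_temperatures()` in a loop during a stress test would give a log to compare against GPU-Z, at least for the plain GPU temperature.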

Do more testing, will be interesting to see your findings

Do you mean testing with other software besides 3DMark? If yes, I already did that, and the exact same behavior can be observed regardless of what I use to test. If not, please let me know how you would like me to test.
 

W1zzard

Are you saying that Nvidia's statement I've linked to is incorrect, please?
I think so. I'm using their own API and returning the untouched value. Of course GPU-Z is not "authorized" to use their internal-only methods to report memory temperature.

Have you reassembled the cooler of card 3 yet? It's probably just uneven pressure or some other mounting issue
 

4EvrYng

I think so. I'm using their own API and return the untouched value. Of course GPU-Z is not "authorized" to use their internal only methods to report memory temperature

Have you reassembled the cooler of card 3 yet? It's probably just uneven pressure or some other mounting issue

Does their API report whether sensors have been factory calibrated? Does GPU-Z assume sensors have always been calibrated?

Personally, I haven't done anything to the third card. I can't know whether a previous owner did anything, but the EVGA factory stickers on the card are in perfect condition, just the way they come from the factory, with no signs of tampering or work.
 

W1zzard

Nobody calibrates their sensors. I still find it highly unlikely that they are off that much.

You're not getting any new answers until you take apart the card and repaste it

Or blame GPU-Z and be done with it, the card is fine either way
 

4EvrYng

Nobody calibrates their sensors. I still find it highly unlikely that they are off that much.

You're not getting any new answers until you take apart the card and repaste it

Or blame GPU-Z and be done with it, the card is fine either way

I'm not looking to cast blame in any particular direction, I'm just trying to figure out what might be going on as there seems to be a contradiction between values :)
 

W1zzard

I'm not looking to cast blame in any particular direction, I'm just trying to figure out what might be going on as there seems to be a contradiction between values :)
Rather I'd say it's an indicator of a physical issue with the 3rd card

You realize that if the cooler is tilted, the temperatures on one side of the die will be higher due to lack of contact, and along that same plane the contact with memory chips will be MUCH worse?
 

4EvrYng

Rather I'd say it's an indicator of a physical issue with the 3rd card

You realize that if the cooler is tilted, the temperatures on one side of the die will be higher due to lack of contact, and along that same plane the contact with memory chips will be MUCH worse?
While tilting of the cooler didn’t cross my mind (thank you for reminding me of that possibility), I did consider mounting imperfections as a possible cause. It is just that I felt an internal 10 C rise in one spot would likely be reflected in an upward trend on at least one of the ICX sensors, even if by a lesser amount, instead of absolutely no change.

Not having a possible explanation for the lack of an upward trend on the ICX sensors means I can’t rule out with 100% certainty the possibility that something is affecting accurate readout of the sensors. In turn, that means the only way to answer with certainty what might be going on is to inspect and redo the cooling of the card, as you suggested.

I’ve given that some thought. The third card passes all tests I am aware of without any obvious issues (OCCT’s GPU and VRAM, 3DMark’s Time Spy Extreme and Fire Strike Ultra; please feel free to suggest any others), and maximum temperatures during those tests don’t exceed the values I mentioned, which are (it is my understanding) still well within levels these components should be able to handle without issues. Last but not least, the card is still well within its warranty period. So I have decided against it, as I would be opening a can of worms and spending lots of time on it without a need or benefit that would justify it.

Thank you again for your help!
 

W1zzard

it is just that I felt internal 10C rise in one spot would be likely reflected in upward trend on at least one of ICX sensors, even if that change is in lesser amount, instead of absolutely no change.
This is exactly what will happen with a tilted cooler, that's why hotspot is so useful (to diagnose cooling problems, it's quite useless otherwise unless the card thermally throttles)
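One way to use hotspot as a diagnostic along these lines is to look at the gap between hot spot and the regular GPU reading on each card, rather than at the absolute numbers. A small Python sketch using the values posted in the table; the function name is made up for illustration, and nothing here says what gap counts as "too big".

```python
# (GPU, GPU hotspot) in degrees C per card, taken from the table in the first post.
cards = {
    "Card 1": (53.8, 65.2),
    "Card 2": (54.9, 67.0),
    "Card 3": (55.4, 73.7),
}


def hotspot_delta(gpu_c, hotspot_c):
    """Gap between the hottest on-die reading and the regular GPU reading."""
    return round(hotspot_c - gpu_c, 1)


deltas = {name: hotspot_delta(gpu, hot) for name, (gpu, hot) in cards.items()}
baseline = min(deltas.values())

for name, delta in deltas.items():
    extra = round(delta - baseline, 1)
    print(f"{name}: hotspot is {delta} C above GPU ({extra} C more than the best card)")
```

On these numbers cards 1 and 2 sit at roughly 11 to 12 C, while card 3 is at about 18 C, which is the kind of gap a mounting problem would be expected to widen.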
 

4EvrYng

This is exactly what will happen with a tilted cooler, that's why hotspot is so useful (to diagnose cooling problems, it's quite useless otherwise unless the card thermally throttles)
I trust you. It is just that my logic was "increase in internal heat will spread to area in its vicinity, thus in turn it should spread to external sensors in its vicinity, and in turn those external sensors should show some heat increase too".
 

W1zzard

I trust you. It is just that my logic was "increase in internal heat will spread to area in its vicinity, thus in turn it should spread to external sensors in its vicinity, and in turn those external sensors should show some heat increase too".
 

4EvrYng

Yes, but that is a 2 C variance between the best and worst internal GPU sensors; ICX GPU 1 also shows a 2 C variance and ICX GPU 2 shows 0 C, while hot spot shows almost 9 C. Same with memory. So I don't see a trend of spreading.
 

W1zzard

That's not how physics works, your cards will be fine. Ignore the issue, I'm done here
 