
Faulty Sensor or Driver?

gamer1988

New Member
Joined
Apr 29, 2012
Messages
7 (0.00/day)
So for a while now, my Sager has been heating up more and more while gaming, and the GPU fans would not shut up. Last night I decided to take it apart, removed the fans, and blew them and the heatsinks out with a can of air. Put it back together, booted it back up, and now it's running like new.

It's a laptop with dual 5870s in CrossFire. GPU1 idles at 43-47*C when before it would sit at about 60*C, and the fans will actually kick down instead of sounding like a jet all of the time. However, GPU2 seems to be kind of faulty. What I mean is I usually monitor it with MSI Afterburner along with my FPS and GPU usage while gaming to make sure it's not getting abnormally hot. MSI Afterburner said the GPU went from about 47*C to about 80*C within seconds. My first thought was the fan, which I knew I had plugged back in. Closed the game and I could hear the fans on. I also put a small piece of tissue under both intakes on the bottom and it was sucked up towards the vent. Fan works.

So I monitored MSI Afterburner and at times it would show GPU2's temp. as 0*C, and I highly doubt it's on and sitting at the freezing point of water... The fan speed would also drop to 20%, which I initially thought was the reason it was running hotter than GPU1 at idle. If I open GPU-Z and monitor it, it will show GPU2 at 65*C or so when idle. Give it a few seconds and it drops suddenly to 51*C, while GPU1 is running at about 48*C. As #2 has usually run a couple of degrees warmer than #1, I'm assuming this is the actual temperature, and the previous temps were false.

I also tried installing different drivers. I previously had 12.3, and found a download for the 12.5 beta, but I don't think it was actually 12.5 as MSI still reported 12.3. I went back to 12.1, which was the last driver version I had before reformatting my laptop a week or two ago.

My question is: is it the actual sensor on the GPU itself, or could driver software be screwing around with my temperature readings? If the temperature slowly declined I would assume it's just the fan doing its job, but it's an abnormally quick drop of a few degrees within a second, which makes me curious. I don't really know how sensors work in GPUs, so I'm asking here. I would reformat again or download a Linux distro and install some temp monitoring software there if I knew it would help. I also played SWTOR on it yesterday while alt-tabbing to check HWMonitor (which only shows 1 GPU), and GPU1 only got a little over 60*C after playing for a while, and not any hotter than that. Also, the side of the keyboard that my GPUs are under, the back of the laptop, and the underside are much cooler to the touch.

Any help to resolve this would be much appreciated. I know it's not something huge, but I would still like to monitor my temperature and get accurate readings. Thanks in advance.
 

Kreij

Senior Monkey Moderator
Joined
Feb 6, 2007
Messages
13,817 (2.20/day)
Location
Cheeseland (Wisconsin, USA)
Welcome to TPU ! :toast:

I'm not that familiar with crossfired laptops, sooo ....

Can you swap the location of the cards in the laptop and see if the problem remains on the same GPU?

When you took apart the cards to clean them out, did you remove the heatsinks and reapply thermal compound?

It's possible that one card has a flaky sensor, or it could be that the temp monitoring software is hitting it at just the right time to get invalid/erroneous readings on occasion (W1zzard would know more about the likelihood of that occurring).

Just some thoughts.
 

gamer1988

New Member
Joined
Apr 29, 2012
Messages
7 (0.00/day)
I have 2 fans, one sitting on top of each card. I didn't actually pull out the cards themselves, just pulled out the fans and heatsinks for both and blew them out. I could try swapping the placement of the cards if you think that would help give a better idea. After monitoring both in GPU-Z, I'm watching the temperature spike and then drop again, for both cards. I'm actually alt-tabbed from SWTOR right now to check temps and this post. One second it will report 55*C and the next it will jump to 65*C, go up to 66*C and drop back down to 54*C or so.

Also forgot that for some reason in MSI Afterburner (I usually have it up to increase fan speeds accordingly, after reading that most of the time AMD has the fan locked to 30% even at higher temps), GPU1 will show the fan percentage where MSI has it, but GPU2 will still show only 20%.

I hope some of that helps as well. I'm signal corps in the Army, so I know how difficult it is to troubleshoot something you're not sitting directly in front of, which is why I'm trying to explain and give as much detail as possible. :laugh:

I just edited for a bit of rewording, but with an example in mind:

I was just watching GPU-Z before submitting the edit to give a better example with actual numbers that just happened. Under sensors, GPU Temperature reports 54*C, and in a split second it jumps to 61*C, while still alt-tabbed out of the game with only GPU-Z and Firefox open. And that's on card 1. I kind of feel that either I have really bad luck with 2 weird sensors, or the driver software has some say in how the sensors are read and changing drivers back and forth messed with the readings.

Also, I don't have any thermal compound, and I've never tried to replace it before, though I have been thinking about doing it after reading an article (actually I think that article was somewhere on TPU, haha).
 

gamer1988

New Member
Joined
Apr 29, 2012
Messages
7 (0.00/day)
Logs from GPU-Z.

I will try to keep it short and only post the relevant parts and not every little second, lol.

Date , GPU Core Clock [MHz] , GPU Memory Clock [MHz] , GPU Temperature [°C] , Fan Speed (%) [%] , Fan Speed (RPM) [RPM] , GPU Load [%] , GPU Temp. #1 [°C] , GPU Temp. #2 [°C] , GPU Temp. #3 [°C] , Memory Usage (Dedicated) [MB] , Memory Usage (Dynamic) [MB] , VDDC [V] ,

2012-04-29 10:10:47 , 100.0 , 150.0 , 58.0 , 30 , 180 , 0 , 56.0 , 57.0 , 57.5 , 1378 , 119 , 0.900 ,

2012-04-29 10:10:48 , 700.0 , 1000.0 , 61.0 , 30 , 277 , 0 , 57.0 , 61.5 , 61.0 , 839 , 116 , 1.050 ,

2012-04-29 10:10:49 , 700.0 , 1000.0 , 64.0 , 30 , 277 , 0 , 58.0 , 64.0 , 62.5 , 321 , 87 , 1.050 ,

2012-04-29 10:10:50 , 700.0 , 1000.0 , 65.0 , 30 , 277 , 0 , 58.5 , 65.0 , 63.5 , 412 , 114 , 1.050 ,

2012-04-29 10:10:51 , 700.0 , 1000.0 , 66.0 , 30 , 277 , 0 , 60.0 , 66.5 , 65.0 , 610 , 115 , 1.050 ,

2012-04-29 10:10:52 , 700.0 , 1000.0 , 67.0 , 30 , 277 , 0 , 60.0 , 67.5 , 65.0 , 690 , 116 , 1.050 ,

2012-04-29 10:10:53 , 700.0 , 1000.0 , 67.0 , 30 , 277 , 0 , 60.0 , 67.0 , 65.0 , 745 , 116 , 1.050 ,

2012-04-29 10:10:54 , 700.0 , 1000.0 , 69.0 , 30 , 277 , 0 , 61.0 , 69.0 , 66.5 , 791 , 116 , 1.050 ,

2012-04-29 10:10:55 , 700.0 , 1000.0 , 71.0 , 30 , 277 , 0 , 61.5 , 70.5 , 67.5 , 935 , 116 , 1.050 ,

This gradually increased while running the game, up to about 77°C. I could understand a jump from 57 to 77°C while playing a game and killing mobs if my fans weren't running. I heard them kick up to a higher speed though, so they're definitely on. Now here's the weird part, when I alt-tabbed out.

Date , GPU Core Clock [MHz] , GPU Memory Clock [MHz] , GPU Temperature [°C] , Fan Speed (%) [%] , Fan Speed (RPM) [RPM] , GPU Load [%] , GPU Temp. #1 [°C] , GPU Temp. #2 [°C] , GPU Temp. #3 [°C] , Memory Usage (Dedicated) [MB] , Memory Usage (Dynamic) [MB] , VDDC [V] ,

2012-04-29 10:12:46 , 700.0 , 1000.0 , 77.0 , 30 , 277 , 38 , 66.5 , 77.0 , 73.0 , 1290 , 122 , 1.050 ,

2012-04-29 10:12:47 , 700.0 , 1000.0 , 77.0 , 30 , 277 , 38 , 66.5 , 77.5 , 73.0 , 1290 , 122 , 1.050 ,

2012-04-29 10:12:48 , 700.0 , 1000.0 , 78.0 , 30 , 277 , 38 , 66.5 , 77.5 , 73.0 , 1290 , 122 , 1.050 ,

2012-04-29 10:12:49 , 700.0 , 1000.0 , 77.0 , 30 , 277 , 38 , 66.5 , 77.5 , 73.5 , 1290 , 122 , 1.050 ,

2012-04-29 10:12:50 , 700.0 , 1000.0 , 73.0 , 30 , 277 , 36 , 65.0 , 73.0 , 70.0 , 1274 , 122 , 1.050 ,

2012-04-29 10:12:51 , 700.0 , 1000.0 , 72.0 , 30 , 277 , 29 , 64.0 , 72.0 , 69.5 , 1273 , 122 , 1.050 ,

2012-04-29 10:12:52 , 700.0 , 1000.0 , 71.0 , 30 , 277 , 23 , 64.0 , 71.0 , 69.0 , 1273 , 122 , 1.050 ,

2012-04-29 10:12:53 , 100.0 , 150.0 , 65.0 , 30 , 225 , 0 , 62.0 , 65.0 , 64.5 , 1273 , 122 , 0.900 ,

2012-04-29 10:12:54 , 100.0 , 150.0 , 62.0 , 30 , 189 , 0 , 60.5 , 62.0 , 63.0 , 1273 , 122 , 0.900 ,

2012-04-29 10:12:55 , 100.0 , 150.0 , 61.0 , 30 , 200 , 32 , 60.0 , 61.5 , 62.0 , 1273 , 122 , 0.900 ,

2012-04-29 10:12:56 , 500.0 , 1000.0 , 61.0 , 30 , 200 , 0 , 59.5 , 60.0 , 61.0 , 1273 , 122 , 0.900 ,

2012-04-29 10:12:57 , 100.0 , 150.0 , 60.0 , 30 , 300 , 68 , 59.5 , 59.5 , 61.0 , 1274 , 122 , 0.900 ,

2012-04-29 10:12:58 , 100.0 , 150.0 , 61.0 , 30 , 189 , 0 , 59.0 , 59.5 , 61.0 , 1273 , 122 , 0.900 ,


Now my fans are good. Even when it was clogged I could alt-tab out, give it a few minutes, and the temperature would steadily drop. But this part:
2012-04-29 10:12:52 , 700.0 , 1000.0 , 71.0 , 30 , 277 , 23 , 64.0 , 71.0 , 69.0 , 1273 , 122 , 1.050 ,

2012-04-29 10:12:53 , 100.0 , 150.0 , 65.0 , 30 , 225 , 0 , 62.0 , 65.0 , 64.5 , 1273 , 122 , 0.900 ,

A 6 degree drop within a second? I don't really think that's possible with just a fan... After that it continued to hang around 61°C while still alt-tabbed out, and declined steadily: 61°C, 60°C, 59°C, 58°C, 57°C. That's where it got stable, so I assume that where it said 77°C, it was more like in the 50s to begin with.
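To make the question above easier to check, here is a minimal sketch (not part of the original post) that scans a GPU-Z sensor log for one-second temperature jumps and shows the core clock before and after each one. The rows are copied from the log above; the function names `parse_log` and `temperature_steps` and the 5 °C threshold are assumptions chosen for illustration.

```python
import csv
import io

TEMP = "GPU Temperature [°C]"
CLOCK = "GPU Core Clock [MHz]"

# Header and four rows copied verbatim from the log posted above.
SAMPLE = """\
Date , GPU Core Clock [MHz] , GPU Memory Clock [MHz] , GPU Temperature [°C] , Fan Speed (%) [%] , Fan Speed (RPM) [RPM] , GPU Load [%] , GPU Temp. #1 [°C] , GPU Temp. #2 [°C] , GPU Temp. #3 [°C] , Memory Usage (Dedicated) [MB] , Memory Usage (Dynamic) [MB] , VDDC [V] ,
2012-04-29 10:12:51 , 700.0 , 1000.0 , 72.0 , 30 , 277 , 29 , 64.0 , 72.0 , 69.5 , 1273 , 122 , 1.050 ,
2012-04-29 10:12:52 , 700.0 , 1000.0 , 71.0 , 30 , 277 , 23 , 64.0 , 71.0 , 69.0 , 1273 , 122 , 1.050 ,
2012-04-29 10:12:53 , 100.0 , 150.0 , 65.0 , 30 , 225 , 0 , 62.0 , 65.0 , 64.5 , 1273 , 122 , 0.900 ,
2012-04-29 10:12:54 , 100.0 , 150.0 , 62.0 , 30 , 189 , 0 , 60.5 , 62.0 , 63.0 , 1273 , 122 , 0.900 ,
"""


def parse_log(text):
    """Turn GPU-Z style log text into a list of dicts keyed by column name."""
    header = None
    rows = []
    for raw in csv.reader(io.StringIO(text)):
        fields = [f.strip() for f in raw]
        # Every line ends with a trailing comma, which yields an empty last field.
        if fields and fields[-1] == "":
            fields = fields[:-1]
        if not fields:
            continue  # blank line
        if header is None:
            header = fields
            continue
        if fields == header:
            continue  # header repeated in the middle of a pasted log
        rows.append(dict(zip(header, fields)))
    return rows


def temperature_steps(rows, threshold=5.0):
    """Return (time, delta, clock_before, clock_after) for each jump >= threshold."""
    steps = []
    for prev, cur in zip(rows, rows[1:]):
        delta = float(cur[TEMP]) - float(prev[TEMP])
        if abs(delta) >= threshold:
            steps.append(
                (cur["Date"], delta, float(prev[CLOCK]), float(cur[CLOCK]))
            )
    return steps


if __name__ == "__main__":
    for when, delta, before, after in temperature_steps(parse_log(SAMPLE)):
        print(f"{when}: {delta:+.1f} °C, core clock {before:.0f} -> {after:.0f} MHz")
```

On these rows, the only step of 5 °C or more is the 6 °C drop at 10:12:53, and it lines up with the core clock falling from 700 MHz to 100 MHz (and VDDC from 1.050 V to 0.900 V in the same row).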
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
27,049 (3.71/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
yes you can drop 6°C in a second when you go from full load to idle and idle clocks as described in your red highlight
 

gamer1988

New Member
Joined
Apr 29, 2012
Messages
7 (0.00/day)
So what about the temperature sensor? Because I find it hard to believe it's at 87°C when the casing still feels room temperature on the outside. Any help would be much appreciated, W1zzard. :(
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
27,049 (3.71/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
ah, i just realized you are on a laptop.

depends on where you touch the casing. 87°C GPU temperature will probably feel "very warm" to "hot" at the hottest spot on the case. the case temperature will not change instantly.

so far i haven't seen any evidence of temperature sensor misreadings on hd 5800 series
 

gamer1988

New Member
Joined
Apr 29, 2012
Messages
7 (0.00/day)
Yeah, but when my GPU used to get to even 70°C, I would be able to feel the heat off to the side of the keyboard, and especially at the back and bottom of the laptop. But even with these temperature readings, my laptop is much cooler to the touch. Not even the back bottom is warm, so I highly doubt it's been getting as hot as it was before I cleaned the heatsink. None of the temperatures make sense. It even spikes suddenly and drops suddenly while idle the entire time, and there's no reason for that.
 
Joined
Oct 2, 2005
Messages
3,059 (0.45/day)
Location
Baltimore MD
Processor Ryzen 5900X
Motherboard ASUS Prime X470 Pro
Cooling Arctic liquid freezer II 240
Memory 2 x 16 Gb Gskill Trident Z 3600 Mhz
Video Card(s) MSI Ventus 3060 Ti OC
Storage Samsung 960 EVO 500 Gb / 860 EVO 1 Tb
Display(s) Dell S2719DGF
Case Lian Li Lancool II Mesh
Audio Device(s) Soundblaster Z
Power Supply Corsair RM850x
Mouse Logitech G703
Keyboard Logitech G513
Software Win 11
I would check the heatsink mounts to make sure they're both making good contact. I've seen a lot of laptops where the heatsink assembly starts to warp over time, resulting in poor contact between the GPU and heatsink.
It can be easily remedied with a copper shim between the HSF and GPU.
 

gamer1988

New Member
Joined
Apr 29, 2012
Messages
7 (0.00/day)
I pulled the heatsinks earlier, as I was going to try what the first reply said and swap the cards around. However, the ribbon connecting the two cards together seemed like a pain, and I didn't want to try pulling out the ribbon and messing it up. After pulling on it pretty firmly it didn't seem to budge, so I'd rather not risk it. The heatsinks weren't warped. It's not like the heatsink of the CPU. It's about as long and wide as a pack of cigarettes and the fan sits on top.

Even while idling, GPU-Z is reporting spikes and dips of like 7-10°C within a second. From what I saw when I first got the laptop and ran MSI Afterburner while gaming, sure, my temps would get to like 60°C, and when I alt-tabbed it would steadily drop from 60 ... 59 ... 58 ... and so on. What I mean now is sudden drops reported by GPU-Z/Afterburner/Speedfan/HWMonitor within a fraction of a second. A steady drop makes sense, but a fraction of a second seems odd.

While googling I've seen something about Mac having a program that can return a code if the temp. sensors are wrong. Does Windows or Linux have something along the lines of that? Or some way to check the temperature sensor itself? Cooling 7-10°C within a fraction of a second seems pretty odd to me.
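One rough way to sanity-check a sensor from the logs alone is to compare the three readings GPU-Z already reports per card (GPU Temp. #1, #2, #3). The sketch below is only an illustration under an assumption the thread does not confirm: that the three values come from separate sensors, so a single bad one would move on its own while the others stay put. The function name `sensors_agree` and the 5 °C tolerance are hypothetical.

```python
def sensors_agree(before, after, tolerance=5.0):
    """Return True if all sensors moved the same way by a similar amount.

    `before` and `after` are sequences of temperatures in °C, one entry per
    sensor, taken from two consecutive log rows.
    """
    deltas = [a - b for b, a in zip(before, after)]
    same_direction = all(d <= 0 for d in deltas) or all(d >= 0 for d in deltas)
    spread = max(deltas) - min(deltas)
    return same_direction and spread <= tolerance


# GPU Temp. #1/#2/#3 at 10:12:52 and 10:12:53 from the log earlier in the thread.
logged_drop = sensors_agree((64.0, 71.0, 69.0), (62.0, 65.0, 64.5))

# Hypothetical, made-up case: one sensor suddenly reads 0 °C while the
# other two do not move at all.
lone_glitch = sensors_agree((48.0, 48.0, 48.0), (48.0, 0.0, 48.0))

if __name__ == "__main__":
    print("logged drop looks consistent:", logged_drop)
    print("lone 0 °C reading looks consistent:", lone_glitch)
```

For the logged drop, all three readings fall together, which is what a real change in load would look like. A reading that disagrees with its neighbours, like the made-up 0 °C case, is the kind of thing that would point at the sensor or at the software reading it.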
 

gamer1988

New Member
Joined
Apr 29, 2012
Messages
7 (0.00/day)
==Update==

I seem to have found the issue thanks to GPU-Z, but I'm unsure of how to actually solve it.

This picture is of GPU #2, while idle. Just booted up, opened nothing but GPU-Z itself.




Same as above, but GPU #1.




Is it supposed to be going up and down all the time while not even doing anything other than having GPU-Z and Firefox open as I type this?

Edit to add: I have found that it is actually a bug in the AMD 12.3 and 12.4 drivers. I will be uninstalling them and reverting to an older version. Thanks for the help. Though it wasn't of help for this issue, I'll keep the info about the heatsinks in mind. :)
 