
Nvidia Tesla T4 cards incorrectly report 100% memory utilization

bsee-ino
GPU-Z reports 100% memory utilization for Tesla T4 cards. Monitoring the same card with Nvidia SMI reports the correct usage. Confirmed in GPU-Z v2.36.0 (latest). This is not a new issue.
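For reference, the same card can be cross-checked with a query along these lines (standard nvidia-smi --query-gpu fields; exact output formatting may vary by driver version):

nvidia-smi --query-gpu=name,memory.used,memory.total --format=csv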
 
Interesting, any chance I could use Remote Desktop or Teamviewer to check out the problem and try a few debug builds?
 
Same issue as mentioned here:
 
I can't give a remote session, sorry. This is being used for a business. However, the thread StefanM linked is exactly the same issue. This server is a Gigabyte, just like the motherboard in that thread. I'm not sure how GPU-Z queries the memory usage, so I'm not sure if the motherboard is relevant at all. The issue occurred in multiple servers using both AMD and Intel CPUs.
 
Oh, this is VRAM usage; sorry, I assumed you meant "Memory Controller Load".

Looks like an overflow indeed; let me test on other cards with around 16 GB of memory or more.

Edit: tested on an RTX 3090 (24 GB) and an RX 6800 XT (16 GB) and it works for me.

Which Windows version do you use? If Windows 10, which build?
 
You can also double-check with Task Manager -> Performance -> GPU.

 
I'm running Windows Server 2019, version 1809. This was also an issue in whatever version came before 1809.
[Screenshot: gpuz memory issue.PNG]
 
I can confirm this issue. I installed a T4 into my Supermicro X10DRLI-I motherboard today and the memory usage constantly shows 15360 MB.
 
I second this issue; however, I am running a Tesla M40 12 GB with an RX 480 for display output. nvidia-smi reports the correct memory usage, but GPU-Z 2.38 reports 11519 MB of VRAM usage from startup. The M40 is not recognized by Task Manager, CPUID HWMonitor, or CPU-Z, and Afterburner displays 11520 MB usage from startup.
Interesting, any chance I could use Remote Desktop or Teamviewer to check out the problem and try a few debug builds?
I am fine with having a TeamViewer session to try debug builds.
 
After some experimentation, I believe I have pinpointed the cause of the issue. By default, the Nvidia driver puts these Tesla cards in TCC mode instead of WDDM mode. If I switch the card to WDDM using nvidia-smi.exe -i 0 -dm 0, GPU-Z displays the correct memory usage as expected.
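For completeness, the current and pending driver model can be checked and switched with commands like these (GPU index 0 assumed; a reboot is required before the change takes effect, and the driver_model fields are Windows-only):

nvidia-smi -i 0 --query-gpu=driver_model.current,driver_model.pending --format=csv
nvidia-smi -i 0 -dm 0   (switch to WDDM)
nvidia-smi -i 0 -dm 1   (switch back to TCC)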

With WDDM:
[screenshots: GPU-Z shows the correct memory usage]

With TCC:
[screenshots: GPU-Z shows the incorrect, maxed-out memory usage]
 
Thanks to @fffffgggg54 I now understand the issue.

The NVIDIA driver function that I'm using to get the available VRAM size does not work in TCC mode. Obviously nvidia-smi works (it uses NVML), so now I'll try to figure out how NVML gets the VRAM usage and use that mechanism for GPU-Z.

For the next GPU-Z release I'll disable the VRAM usage sensor on all cards in TCC mode, until a solution is found.
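For the curious, the NVML route looks roughly like this. This is a minimal C sketch, not GPU-Z's actual code; it assumes nvml.h is available (e.g. from the CUDA toolkit) and links against the NVML library shipped with the driver:

#include <stdio.h>
#include <nvml.h>

int main(void)
{
    // Initialize NVML; this works regardless of the TCC/WDDM driver model
    if (nvmlInit() != NVML_SUCCESS)
        return 1;

    nvmlDevice_t dev;
    if (nvmlDeviceGetHandleByIndex(0, &dev) == NVML_SUCCESS)
    {
        // nvmlMemory_t reports total/free/used framebuffer memory in bytes
        nvmlMemory_t mem;
        if (nvmlDeviceGetMemoryInfo(dev, &mem) == NVML_SUCCESS)
            printf("VRAM used: %llu / %llu MB\n",
                   mem.used / (1024ULL * 1024ULL),
                   mem.total / (1024ULL * 1024ULL));
    }

    nvmlShutdown();
    return 0;
}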
 