
What are latencies like with GDDR5, GDDR5X, GDDR6 and GDDR6X?

I'm not sure if this is the proper forum to ask this in, but here goes. What are the latencies like with GDDR5, GDDR5X, GDDR6 and GDDR6X? I'm not referring to what each IC is rated at, but to what kind of latencies an end user would see when running, for example, AIDA64's memory benchmark.
 
That depends on a lot of factors: whether you can run tighter timings or increase the frequency, and even just a GPU core overclock would reduce the latencies. So this number will vary greatly depending on the card the memory is installed on.

The latency improvement between an X and a non-X version could amount to as little as 5 ns.

How much does this matter to gaming frame rates?
Generally, single-digit percentage gains.
 
I'm not sure if this is the proper forum to ask this in, but here goes. What are the latencies like with GDDR5, GDDR5X, GDDR6 and GDDR6X? I'm not referring to what each IC is rated at, but to what kind of latencies an end user would see when running, for example, AIDA64's memory benchmark.
More or less: with every generation step they increase bandwidth, but latency gets worse.
Higher clock speeds try to keep the absolute latency similar, but in practice it isn't.
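The arithmetic behind that: absolute latency in nanoseconds is timing cycles divided by the command clock, so if a newer generation doubles the clock but needs roughly twice the cycles, the nanoseconds barely move. A minimal sketch; the cycle counts and clocks below are made-up placeholders, since vendors don't publish per-grade GDDR timings:

```cuda
#include <cstdio>

// Latency in ns = timing in cycles / command clock in GHz.
static double cas_ns(double cl_cycles, double cmd_clock_ghz) {
    return cl_cycles / cmd_clock_ghz;
}

int main() {
    // Hypothetical older generation: 1.75 GHz command clock, CL 14.
    printf("older gen: %.1f ns\n", cas_ns(14.0, 1.75));  // 8.0 ns
    // Hypothetical newer generation: twice the clock, ~twice the cycles.
    printf("newer gen: %.1f ns\n", cas_ns(29.0, 3.50));  // 8.3 ns, barely moved
    return 0;
}
```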


ETH mining, for example, was faster on a 1070 Ti than on a 1080, because the 1070 Ti used GDDR5 versus the GDDR5X on the GTX 1080.

I can't find the stock values, as they aren't listed anywhere (they'd vary between cards), but there's an example here where the 1080 and 1080 Ti benefited from lowering the VRAM speeds to tighten their timings, while the 1070 cards didn't need this and out-mined them at stock:

Optimize Memory Timings on Nvidia GDDR5X GPUs With OhGodAnETHlargementPill | The Crypto Blog (medium.com)



There are no end-user VRAM latency benchmarks out there, especially these days, since we can no longer modify VRAM timings in custom BIOS files, but a few outlets have done testing on this over the years:

Measuring GPU Memory Latency – Chips and Cheese
GPU Memory Latency’s Impact, and Updated Test – Chips and Cheese

Simply changing the size of the tested region changes the latency, and various GPUs have different methods of accessing that VRAM, so you can see why this isn't a simple thing to test.
I don't fully understand the information below; I'm just posting it here as examples from the links.
[Attached charts from the Chips and Cheese articles: memory latency plotted against test region size for various GPUs and CPUs]
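For reference, tests like these are usually built on pointer chasing: each load's address depends on the previous load's result, so nothing can overlap and the elapsed time per step is the latency. A minimal CUDA sketch of the idea, my illustration rather than Chips and Cheese's actual harness:

```cuda
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>
#include <cuda_runtime.h>

__global__ void chase(const unsigned *next, unsigned steps, unsigned *sink) {
    unsigned idx = 0;
    for (unsigned i = 0; i < steps; ++i)
        idx = next[idx];        // each load depends on the previous result
    *sink = idx;                // stop the compiler from removing the loop
}

int main() {
    const unsigned N = 1u << 24;   // 64 MiB of 4-byte entries: larger than most
                                   // GPUs' L2; bump it for cards with huge caches
    const unsigned STEPS = 1u << 20;

    // Sattolo's algorithm: one full-length random cycle through all entries,
    // which defeats caching and prefetching along the chain.
    std::vector<unsigned> h(N);
    std::iota(h.begin(), h.end(), 0u);
    std::mt19937 rng(42);
    for (unsigned i = N - 1; i > 0; --i) {
        unsigned j = std::uniform_int_distribution<unsigned>(0, i - 1)(rng);
        std::swap(h[i], h[j]);
    }

    unsigned *d_next, *d_sink;
    cudaMalloc(&d_next, N * sizeof(unsigned));
    cudaMalloc(&d_sink, sizeof(unsigned));
    cudaMemcpy(d_next, h.data(), N * sizeof(unsigned), cudaMemcpyHostToDevice);

    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);
    cudaEventRecord(t0);
    chase<<<1, 1>>>(d_next, STEPS, d_sink);   // a single thread: pure latency
    cudaEventRecord(t1);
    cudaEventSynchronize(t1);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, t0, t1);
    printf("~%.1f ns per dependent load\n", ms * 1e6f / STEPS);

    cudaFree(d_next);
    cudaFree(d_sink);
    return 0;
}
```

Sweeping the region size from a few KiB upward is what produces those stepped charts: each plateau is a cache level, and the final plateau is VRAM. A real harness also has to pin clocks (a lone thread may not boost the GPU) and account for TLB behavior, which is part of why the methodology isn't simple.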



Hah, I love that the reason for the revised article was a comment from TPU specifying a better way to test this.
 
Thanks Mussels, that was really interesting.

Those GDDR latencies certainly do look bad relative to DDR4 system RAM latencies, but they're also accessing more data than a typical DDR4 access does (I believe on x86 an entire cache line is always read from memory, but that doesn't even amount to 1 KB of data). I think a cache line on x86 is 32 bytes.
 
I was always curious about comparing GDDR6 latencies to standard DDR4/5. All of the consoles seem to use G6 for system RAM, yet tight RAM timings provide a tangible benefit to PC gaming.
 
Thanks Mussels, that was really interesting.

Those GDDR latencies certainly do look bad relative to DDR4 system RAM latencies, but they're also accessing more data than a typical DDR4 access does (I believe on x86 an entire cache line is always read from memory, but that doesn't even amount to 1 KB of data). I think a cache line on x86 is 32 bytes.
It's 64 bytes. In DDRx, it moves over the 64-bit channel in a burst of 8 transfers (4 clock cycles). DDR5 can either do the same or, alternatively, use 32-bit subchannels, in which case it takes 16 transfers (8 cycles).
It's more complicated in GPUs, I think: a cache line is wider (128B?), but Nvidia can fetch smaller units too, and AMD probably has a similar feature.
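The burst arithmetic, for reference: bytes per burst = (channel width in bits / 8) × burst length, and both configurations land on the same 64-byte cache line. A trivial sketch:

```cuda
#include <cstdio>

// Bytes delivered per burst = (channel width in bits / 8) * burst length.
int main() {
    printf("DDR4, 64-bit channel, BL8   : %d bytes\n", 64 / 8 * 8);   // 64: one line
    printf("DDR5, 32-bit subchannel, BL16: %d bytes\n", 32 / 8 * 16); // also 64
    return 0;
}
```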

But what do those latency tests actually measure? Seeing those horribly large numbers for CPUs, I assume it's random-access latency, with no sequential access at all. So it should amount to approximately the sum of the four primary DDR timings, and it does.
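As a sanity check, here's that sum worked out for a common DDR4-3200 CL16-18-18-36 kit (example numbers of mine, not from the articles); cycles convert to nanoseconds at the 1600 MHz command clock:

```cuda
#include <cstdio>

int main() {
    // DDR4-3200: 3200 MT/s -> 1600 MHz command clock.
    const double clk_mhz = 1600.0;
    // Primary timings of a typical retail kit (illustrative values).
    const int cl = 16, trcd = 18, trp = 18, tras = 36;
    double ns = (cl + trcd + trp + tras) * 1000.0 / clk_mhz;
    printf("sum of primaries: %.1f ns\n", ns);  // 88 cycles -> 55 ns, in the
                                                // ballpark of measured results
    return 0;
}
```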
 
Thanks Mussels that was really interesting.

Those GDDR latencies certainly do look bad relative to what DDR4 system RAM latencies look like but they're also accessing more data than a typical DDR4 access does (I believe in x86 arch. an entire cache line is always read from memory but that doesn't even amount to 1 KB of data). I think a cache line in the x86 arch. is 32 bytes.
Cache line size is a property of the microarchitecture; the x86 architecture doesn't specify a line size. For x86 CPUs, the cache line sizes are (you can also query your own CPU's line size, as sketched after this list):
  1. Intel 486: 16 bytes
  2. Pentium to Pentium III, AMD K5 and K6, Cyrix 6x86, and Centaur's WinChip: 32 bytes
  3. Pentium 4: 64-byte L1, 128-byte sectored L2
  4. All AMD processors since the original Athlon and all Intel processors after the Pentium 4: 64 bytes
  5. Haswell and Broadwell eDRAM L4: 128 bytes
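On Linux/glibc you can confirm the line size at runtime with POSIX sysconf; a quick host-side check (platforms that don't expose these values will report nothing):

```cuda
#include <cstdio>
#include <unistd.h>  // sysconf; glibc exposes cache geometry through it

int main() {
    long line = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);
    if (line > 0)
        printf("L1D cache line: %ld bytes\n", line);  // 64 on current x86 CPUs
    else
        printf("line size not reported on this platform\n");
    return 0;
}
```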
 