
NVIDIA RTX 4070 Ti SUPER with 26 Gbps G6X Mod Beats RTX 4080

btarunr

Editor & Senior Moderator
Staff member
The graphics card modders from Brazil, TecLab and Paulo Gomes, are back with yet another audacious, competitive VGA mod. The duo has already built a reputation for video memory replacement mods that have significantly altered the performance profile of graphics cards, and their latest feat sees an NVIDIA GeForce RTX 4070 Ti SUPER graphics card get a bit of a memory upgrade. Out of the box, the RTX 4070 Ti SUPER comes with 16 GB of 21 Gbps GDDR6X memory across the 256-bit wide memory interface of the "AD103" GPU. The two tech-tubers each performed memory chip replacement mods, which, in combination with GPU overclocking, resulted in memory speeds ranging between 24 Gbps and 26 Gbps. The big story here is that with faster memory, the RTX 4070 Ti SUPER beats the RTX 4080, despite having 13% fewer shaders and other key components.

A stock GeForce RTX 4080 graphics card scores 8525 points in the Unigine Superposition 8K benchmark, and a stock-speed Manli RTX 4070 Ti SUPER does 7212 points, on account of fewer shaders (66 SM vs. 76 SM) and slower memory (21 Gbps vs. 22.4 Gbps). With a 24 Gbps memory speed mod and GPU overclocking, Paulo Gomes achieved 8870 points in the test for the RTX 4070 Ti SUPER, beating the RTX 4080 and overcoming the shader deficit. Meanwhile, over at TecLab, their Galax-branded RTX 4070 Ti SUPER yields 7028 points at stock speeds. The team pulled off an epic 26 Gbps memory speed (chip replacement + overclocking), which, combined with some GPU overclocking, yielded a staggering 9133 points. That inches close to what an RTX 4080 SUPER could produce with its 23 Gbps memory and maxed out "AD103" ASIC with all 80 SM on deck.
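To put the figures above in perspective, here is a minimal sketch that converts the quoted per-pin data rates into peak memory bandwidth on a 256-bit bus and works out the score gains. The function names are made up for illustration, and it assumes the RTX 4080 and RTX 4080 SUPER use the same 256-bit "AD103" interface as the RTX 4070 Ti SUPER; the scores and data rates are the ones quoted in the article.

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8


def uplift_pct(new_score: float, base_score: float) -> float:
    """Percentage gain of new_score over base_score."""
    return (new_score / base_score - 1) * 100


BUS_WIDTH_BITS = 256  # AD103 memory interface width, per the article

# Per-pin data rates quoted in the article (Gbps)
data_rates = {
    "RTX 4070 Ti SUPER (stock)": 21.0,
    "RTX 4080 (stock)": 22.4,
    "RTX 4080 SUPER (stock)": 23.0,
    "Paulo Gomes mod": 24.0,
    "TecLab mod": 26.0,
}

if __name__ == "__main__":
    for name, rate in data_rates.items():
        print(f"{name}: {bandwidth_gb_s(BUS_WIDTH_BITS, rate):.1f} GB/s")

    # Superposition 8K scores quoted in the article
    print(f"Gomes vs. stock 4080:  {uplift_pct(8870, 8525):+.1f}%")
    print(f"TecLab vs. stock 4080: {uplift_pct(9133, 8525):+.1f}%")
    print(f"Gomes vs. own stock:   {uplift_pct(8870, 7212):+.1f}%")
    print(f"TecLab vs. own stock:  {uplift_pct(9133, 7028):+.1f}%")
```

By this arithmetic the modded cards land roughly 4% and 7% ahead of a stock RTX 4080, while gaining roughly 23% and 30% over their own stock results.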



 
Error in Title.
In the article it is said that the 4070 Ti SUPER beats the 4080, but not the 4080 SUPER.

I hope Nvidia will use the faster 32 Gbps GDDR7 memory for the 5090. :D
 
Am I correct in saying that Nvidia GPUs are bandwidth starved but not AMD GPUs due to infinity cache?
 
Am I correct in saying that Nvidia GPUs are bandwidth starved but not AMD GPUs due to infinity cache?
Every GPU benefits from more memory bandwidth.
 
One thing to note: looking at hwBot, an overclocked 4080 can get over 10000 points, which means most of the performance increase came from the core OC and not from the memory replacement.
 
I mean, you can overclock the memory as well.
Just YMMV, since the silicon lottery™ will dictate how much you can OC the memory. If you replace those chips with better bins, of course you can also OC them more.
 
Every GPU benefits from more memory bandwidth.
That's not how I understand things, unless you mean compared within a certain architecture. For instance, the Radeon VII had a bandwidth of 1024 GB/s with 3840 cores. The Radeon RX 7800 XT has a bandwidth of 624 GB/s with 3840 cores. The 7800 XT is much, much faster than the Radeon VII. Giving the Radeon VII even more bandwidth probably would not have affected performance.
 
would be interesting to see 4090 fitted with 26gbps GDDR6X
 
That's not how I understand things unless you mean compared within a certain architecture.
No, I mean literally all GPU architectures. Memory bandwidth is always a constraint on performance; some architectures are just more efficient at utilizing the available memory bandwidth than others.

Giving the Radeon VII even more bandwidth probably would not have affected performance.
It probably would have. Think how much faster caches are than VRAM; that means there is always more performance to be had if you improve memory bandwidth. You get diminishing returns, but the % improvement would be non-zero.
 
One thing to note: looking at hwBot, an overclocked 4080 can get over 10000 points, which means most of the performance increase came from the core OC and not from the memory replacement.
Yeah memory OC is always shaky. At some point you do get the frequency but the error count kills performance regardless.

No, I mean literally all GPU architectures. Memory bandwidth is always a constraint on performance; some architectures are just more efficient at utilizing the available memory bandwidth than others.
That statement feels way too general to be true. It depends entirely on what you're feeding it.
 
Considering both graphics cards use the same processor, when one gets faster memory and is overclocked (yet uses stock cooling) I would expect it to be faster. Like someone wrote on videocardz.com, where I first read about this, if anything it proves nVidia deliberately held performance back.
 
That statement feels way too general to be true. It depends entirely on what you're feeding it.
To a point but I agree with him. Is it going to matter to legos adventures? Probably not. But if you are benchmarking or playing anything demanding it really does. Especially if you start using lots of vram.
 
The big story here is that with faster memory, the RTX 4070 Ti SUPER beats the RTX 4080, despite 13% fewer shaders and other key components.
Is it? The 3 GHz in the screenshot should be a ~10% faster clock than a 4080's, which would make up a very large part of the shader deficit. Add the headline memory bandwidth increase of 15-25% and I would be rather disappointed if it did not beat a stock 4080 :)
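A rough back-of-the-envelope version of that argument, as a sketch only: it assumes shader throughput scales linearly with SM count times clock (a simplification), takes the ~10% clock advantage from the post above as given, and uses the SM counts and memory data rates from the article. The function names are hypothetical.

```python
def relative_shader_throughput(sm_a: int, sm_b: int, clock_ratio: float) -> float:
    """Throughput of card A relative to card B, assuming it scales with SMs x clock.

    clock_ratio is card A's clock divided by card B's clock.
    """
    return (sm_a / sm_b) * clock_ratio


def pct_change(new: float, base: float) -> float:
    """Percentage change of new relative to base."""
    return (new / base - 1) * 100


SM_4070_TI_SUPER = 66  # from the article
SM_4080 = 76           # from the article
CLOCK_RATIO = 1.10     # assumption: the ~10% clock advantage estimated in the post

if __name__ == "__main__":
    # Shader deficit at equal clocks, then with the assumed overclock
    stock = relative_shader_throughput(SM_4070_TI_SUPER, SM_4080, 1.0)
    overclocked = relative_shader_throughput(SM_4070_TI_SUPER, SM_4080, CLOCK_RATIO)
    print(f"Equal clocks:   {stock:.3f}x of a 4080")        # about 0.868x
    print(f"With +10% clock: {overclocked:.3f}x of a 4080")  # about 0.955x

    # Memory data rate gains over the 4070 Ti SUPER's stock 21 Gbps
    for modded in (24.0, 26.0):
        print(f"{modded:.0f} Gbps vs. 21 Gbps: {pct_change(modded, 21.0):+.1f}%")
```

Under these assumptions the overclock alone closes the shader gap to within about 5% of a stock RTX 4080, and the memory mods add roughly 14% to 24% more bandwidth than the card's stock configuration.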

That's not how I understand things, unless you mean compared within a certain architecture. For instance, the Radeon VII had a bandwidth of 1024 GB/s with 3840 cores. The Radeon RX 7800 XT has a bandwidth of 624 GB/s with 3840 cores. The 7800 XT is much, much faster than the Radeon VII. Giving the Radeon VII even more bandwidth probably would not have affected performance.
That is not a fair comparison and will not always play out like this in the real world. The 7800 XT has a relatively huge 64 MB of cache that heavily augments its lacking memory bandwidth. If cards run out of cache, which they tend to do at some point, the difference in memory bandwidth comes back into play.

This was the latest big paradigm change: AMD started it with RDNA2 and Nvidia followed suit with Ada. Crappy for the consumer, but from a technical point of view a nice, huge efficiency boost.
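The cache argument can be illustrated with a deliberately simple model, offered as a sketch rather than a description of how either vendor's hardware behaves: if a fraction of memory requests hit in the on-die cache, only the misses reach VRAM, so the same VRAM bandwidth can serve more total traffic. The hit rates below are hypothetical placeholders, not measured values, and the model ignores the cache's own bandwidth limit and latency effects.

```python
def effective_bandwidth_gb_s(vram_bandwidth_gb_s: float, cache_hit_rate: float) -> float:
    """Total request bandwidth sustainable when only cache misses go to VRAM.

    Simplified model: VRAM traffic = (1 - hit_rate) x total traffic,
    so total traffic = VRAM bandwidth / (1 - hit_rate).
    """
    if not 0.0 <= cache_hit_rate < 1.0:
        raise ValueError("cache_hit_rate must be in [0, 1)")
    return vram_bandwidth_gb_s / (1.0 - cache_hit_rate)


RX_7800_XT_VRAM_GB_S = 624.0   # figure quoted in the thread
RADEON_VII_VRAM_GB_S = 1024.0  # figure quoted in the thread

if __name__ == "__main__":
    # Hypothetical hit rates; real ones vary with resolution and workload.
    for hit_rate in (0.0, 0.25, 0.5, 0.75):
        eff = effective_bandwidth_gb_s(RX_7800_XT_VRAM_GB_S, hit_rate)
        verdict = "above" if eff > RADEON_VII_VRAM_GB_S else "below"
        print(f"hit rate {hit_rate:.0%}: {eff:.0f} GB/s effective ({verdict} Radeon VII's raw figure)")
```

In this toy model the 7800 XT only overtakes the Radeon VII's raw bandwidth once the hit rate is high enough, and it falls back to its raw 624 GB/s when the working set spills out of the cache, which is the point made in the post above.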
 
Wish they would have posted the cost of the ram (labour excluded)

x$ for x%perf increase
 
Card beats the next tier with a good OC, this has never happened in history! /s
 
To a point but I agree with him. Is it going to matter to legos adventures? Probably not. But if you are benchmarking or playing anything demanding it really does. Especially if you start using lots of vram.
Benchmarking and playing something demanding are two different worlds. A lot of gpu benches are not very memory focused. They also work with a pretty limited set of assets. Heaven, Valley... Superposition. Non DX12 3DMark...
 
Considering both graphics cards use the same processor, when one gets faster memory and is overclocked (yet uses stock cooling) I would expect it to be faster. Like someone wrote on videocardz.com, where I first read about this, if anything it proves nVidia deliberately held performance back.
Or they didn't fancy adding another $300 on top of cards due to memory costs.
 
Benchmarking and playing something demanding are two different worlds. A lot of gpu benches are not very memory focused. They also work with a pretty limited set of assets. Heaven, Valley... Superposition. Non DX12 3DMark...
Benchmarks, like games, aren't created equal, but they clearly scored higher, so I will respectfully disagree, I guess.
 
That statement feels way too general to be true
It has to be true. Why do you think GPUs have caches? It's because memory is always a bottleneck; if it weren't, there would be no need for caches.
 
Good ol' HW-modding is always interesting. Would be interesting to see whether a 1080 Ti/2080 Ti could be boosted to a 12 GB/384-bit card by adding the missing memory chip etc.
 
Fun fact - RTX4080 was the first time in nvidia history that a next gen product had a memory bandwidth DECREASE from its predecessor. (except the GTX780 > GTX980)
I guess they claim that the extra cache makes up for it, similar to how they claimed the memory compression stuff made up for it with maxwell.
 
Fun fact - RTX4080 was the first time in nvidia history that a next gen product had a memory bandwidth DECREASE from its predecessor. (except the GTX780 > GTX980)
I guess they claim that the extra cache makes up for it, similar to how they claimed the memory compression stuff made up for it with maxwell.
Another fun fact - AMD did exactly the same thing a generation or two earlier with 6000 series.
 
Fun fact - RTX4080 was the first time in nvidia history that a next gen product had a memory bandwidth DECREASE from its predecessor. (except the GTX780 > GTX980)
I guess they claim that the extra cache makes up for it, similar to how they claimed the memory compression stuff made up for it with maxwell.
Which it did, and which it does, to a pretty reasonable extent.

I do agree Ada's bandwidth leaves a lot to be desired, but the cards seem balanced, if you don't look at the 12GB siblings apart from the x70.
 
A memory-swapped, heavily overclocked 4070 Ti SUPER barely beat a 4080. In other news, water is still wet.
 