
Could they not add a second bank of cheaper older memory?

A bit theoretical, but claims are made that memory (VRAM) is just expensive. I'm not buying that, but let's say it is.


Can't they make a card with, say, 6 GB of the latest, fastest GDDR6X (or heck, even HBM), and then a second layer of much cheaper (and yes, slower) GDDR5, but like 10 GB of it?

Just so that if the fast pool is filled, the card can shove data over to the slower memory and keep the bandwidth-hungry stuff in the fast pool for itself?


(Kind of like how RAM works with the CPU, versus the slower SSD/HDD.)
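To make the idea concrete, here's a toy sketch of that spill-over policy in Python. All the names, sizes, and the `bandwidth_hungry` flag are hypothetical; a real driver would manage placement per page with far smarter heuristics.

```python
# Toy sketch of the two-tier VRAM idea: a fast 6 GB pool that spills
# overflow allocations into a larger, slower 10 GB pool.

GB = 1024 ** 3

class TieredVram:
    def __init__(self, fast_bytes=6 * GB, slow_bytes=10 * GB):
        self.fast_free = fast_bytes
        self.slow_free = slow_bytes
        self.placement = {}  # buffer name -> "fast" or "slow"

    def alloc(self, name, size, bandwidth_hungry=True):
        # Prefer the fast pool for bandwidth-hungry buffers;
        # spill to the slow pool once the fast pool is full.
        if bandwidth_hungry and size <= self.fast_free:
            self.fast_free -= size
            self.placement[name] = "fast"
        elif size <= self.slow_free:
            self.slow_free -= size
            self.placement[name] = "slow"
        else:
            raise MemoryError(f"out of VRAM for {name}")
        return self.placement[name]

vram = TieredVram()
print(vram.alloc("framebuffer", 1 * GB))   # fast
print(vram.alloc("textures_hi", 5 * GB))   # fast (fills the pool)
print(vram.alloc("textures_lo", 4 * GB))   # spills to slow
```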
 
Can't they make a card with, say, 6 GB of the latest, fastest GDDR6X (or heck, even HBM), and then a second layer of much cheaper (and yes, slower) GDDR5, but like 10 GB of it?

I doubt this would be ideal. It would make the actual GPU die more complicated, since it would need both GDDR5 and GDDR6 memory controllers plus a weird bus layout, likely negating any savings from using the slower memory.
 
GTX 970 says hello :clap:

Although he's talking about using two different kinds of VRAM... The 970 had 4 GB of identical VRAM, but only 3.5 GB of it had access to the full bandwidth.
 
Although he's talking about using two different kinds of VRAM... The 970 had 4 GB of identical VRAM, but only 3.5 GB of it had access to the full bandwidth.
I knew it was slower than the main 3.5 GB; I assumed it was a different kind of VRAM, just slower. Not that I had one, or cared enough to look into it in detail, but it was big news (probably more so in the enthusiast community) at the time.
 
a second layer of much cheaper (and yes, slower) GDDR5, but like 10 GB of it
You mean like the (in)famous 970? They can't use the same bus width for that slower memory, so it wouldn't come cheap!
Also, IIRC, the reason they lost the case was the (reduced) cache.
 
I knew it was slower than the main 3.5 GB; I assumed it was a different kind of VRAM, just slower. Not that I had one, or cared enough to look into it in detail, but it was big news (probably more so in the enthusiast community) at the time.

Well, they ended up having to pay 970 owners about 30 USD per card after getting sued over it... which is actually a pretty large payout for a class-action suit.
 
VRAM costs basically nothing... the difference in manufacturing cost between the 8 GB and 16 GB models is probably 10-20 dollars for a company like NVIDIA.
 
They'd get sued again...:laugh:
This precisely.

I believe NVIDIA has officially said goodbye to asymmetrical bus nonsense, and using slower memory is really more of the same thing.


Here's the full 970 analysis. Really nice read.

If you consider this pic, it really puts things in place. The L2 cache is shared across not just part of the 3.5 GB... but ALSO with the 0.5 GB memory segment on its own; that one shares cache, so it impairs both segments/MCs. So basically, the way they wired this, it even cripples 0.5 GB of the 3.5 GB that 'has full bandwidth'. You could say this GPU starts losing after a 3 GB allocation.

[Attached image: GTX 970 memory subsystem diagram]
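For scale, a quick back-of-envelope from the published 970 specs (Python just as a calculator): eight 32-bit GDDR5 controllers at 7 Gbps, with the last 0.5 GB hanging off a single controller behind a shared crossbar port.

```python
# GTX 970 segment bandwidths from the public analysis.
gddr5_gbps_per_pin = 7            # effective GDDR5 data rate per pin
bits_per_controller = 32          # width of each memory controller

per_mc = gddr5_gbps_per_pin * bits_per_controller / 8  # GB/s per controller
print(per_mc)        # 28.0 GB/s for the lone 0.5 GB segment
print(per_mc * 7)    # 196.0 GB/s for the 3.5 GB segment
print(per_mc * 8)    # 224.0 GB/s advertised aggregate
```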
 
Can't they make a card with, say, 6 GB of the latest, fastest GDDR6X... and then a second layer of much cheaper (and yes, slower) GDDR5, but like 10 GB of it?
The memory controller won't like that (to put it mildly).
There's a reason you never see a motherboard that can operate two different memory standards (like DDR2 + DDR3, or DDR3 + DDR4) at the same time.

Making HBM + GDDR work like L2/L3 cache on a single PCB is a beyond-nightmare level of PCB trickery to make work (plus it would cost far too much to turn a profit with a wider audience).
 
Can't they make a card with, say, 6 GB of the latest, fastest GDDR6X... and then a second layer of much cheaper (and yes, slower) GDDR5, but like 10 GB of it?
They could, but I'll be blunt: I think NVIDIA has adopted a business plan of making their cards obsolete as fast as possible.

I still remember when they were interviewed by PCPer in the Maxwell era and asked how easy life must be with no decent competitor. Their response was that they are competing against themselves: it's a big job to convince owners of their older-gen products to upgrade. I think that reply was actually quite honest.

They could do modular GPUs, or add the ability to use the VRAM of a cheaper card in another PCIe slot, but they don't. Instead, if you want more VRAM, you have to buy an entire replacement GPU with more rendering performance than you need.

There is also the market segmentation issue: they have people desperate for VRAM (who also have deep pockets) buying the four-figure-priced SKUs right now. Think of incidents where, e.g., Amazon throws old stock away because they don't want to devalue a product's brand. It's kind of like that: if they stuck 16 GB on a 4070 at $600, it would decimate sales of the 4080.
 
We'll soon see something like that for system memory in servers: fast local memory in DIMM slots plus slow, high-latency, but expandable CXL memory over the PCIe bus. Of course, it won't "just work". Algorithms that determine what goes where will have to be very smart in order to utilise each type of memory to the best of its capabilities, and to prevent frequent movement of huge amounts of data between the two types.

Could a GPU's memory manager possibly be smart enough to handle a similar situation effectively?
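For a flavour of what those "what goes where" decisions look like, here's a minimal, purely hypothetical sketch of a hotness-based tiering policy. Real policies also have to weigh migration cost and add hysteresis so data doesn't ping-pong between tiers.

```python
# Track per-page access counts; keep the hottest pages in the fast
# tier and demote cold ones to the slow (e.g. CXL) tier.
from collections import Counter

FAST_CAPACITY_PAGES = 4  # tiny on purpose, for the demo

def rebalance(access_counts: Counter, fast_tier: set, slow_tier: set):
    """Move the hottest pages into the fast tier, coldest out."""
    ranked = [page for page, _ in access_counts.most_common()]
    want_fast = set(ranked[:FAST_CAPACITY_PAGES])
    promote = want_fast - fast_tier
    demote = fast_tier - want_fast
    fast_tier -= demote;  slow_tier |= demote
    fast_tier |= promote; slow_tier -= promote
    return promote, demote  # pages that had to migrate

counts = Counter({"pA": 90, "pB": 70, "pC": 60, "pD": 50, "pE": 5})
fast, slow = {"pA", "pE"}, {"pB", "pC", "pD"}
print(rebalance(counts, fast, slow))  # promotes pB/pC/pD, demotes pE
```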
 
Lotta 'if's, but there was once a chance of seeing NAND/PCM on GPUs, deployed *very* similarly to how the OP proposed.

If I had silly amounts of money to throw around, I'd have already imported a used Pro SSG from the UK and stuck 4 Optane drives in it, just to have an Artifact of What Could Have Been.
 
Lotta 'if's, but there was once a chance of seeing NAND/PCM on GPUs, deployed *very* similarly to how the OP proposed.
I've thought of that too. It's very much doable on a consumer card, on a more modest scale. AMD could integrate a 4-channel SMI SSD controller with custom firmware on the PCB, add 250 GB of TLC flash, and make it operate in permanent pseudo-SLC mode to get 80 GB of high-endurance memory, maybe getting close to 6-7 GB/s speeds.
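Rough arithmetic behind those figures (the per-channel throughput is an assumption for illustration; it depends on the NAND generation and controller):

```python
# Pseudo-SLC capacity and aggregate bandwidth, back-of-envelope.
tlc_capacity_gb = 250
pslc_capacity_gb = tlc_capacity_gb / 3   # pSLC stores 1 bit/cell vs 3
print(pslc_capacity_gb)                  # ~83 GB, close to the 80 GB figure

channels = 4
gb_per_s_per_channel = 1.6               # assumed per-channel NAND throughput
print(channels * gb_per_s_per_channel)   # ~6.4 GB/s aggregate
```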
 
I've thought of that too. It's very much doable on a consumer card, on a more modest scale. AMD could integrate a 4-channel SMI SSD controller with custom firmware on the PCB, add 250 GB of TLC flash, and make it operate in permanent pseudo-SLC mode to get 80 GB of high-endurance memory, maybe getting close to 6-7 GB/s speeds.
I believe AMD used one of their Xilinx FPGAs to accomplish something similar with 4 (striped) Samsung Gen3 NVMe M.2 modules.
(It appears common for companies to deploy developing technologies using off-the-shelf parts first, prior to integration.)

Normally, storage devices can't be used as "memory"; whatever magic programming they loaded into their FPGA allowed the NVMe drives to be addressed as 'extended VRAM'.

I think it's related IP to HBCC, but there were never any driver-side implementations beyond using system RAM as extended VRAM.

HBCC was almost what @ZoneDymo was getting at.
HBCC Memory Segment – The High Bandwidth Cache Controller (HBCC) Memory Segment allows allocation of system memory to the graphics card. This can be useful in applications that require more video memory than what is available on the graphics card.
Set HBCC Memory Segment to Enabled, then drag the HBCC Memory Size slider right or left to increase or decrease the total system memory allocated to the HBCC. Click OK to confirm.

To reset the system memory allocation back to default settings, click Perform Reset and OK to confirm.

Below is a screenshot example of these options:

[Screenshot: HBCC Memory Segment options in Radeon Settings]
I've never seen such a configuration, but: if someone could 'fake' a storage volume into 'extended system memory', you could (effectively) use HBCC to DIY a Pro SSG with an MI25, WX9100, Vega Frontier, etc.
(Edit: I suppose allocating an entire storage volume to the page file/virtual memory, then allocating the maximum (real) RAM to HBCC, would be semi-equivalent. However, I'd expect issues to arise, if only increased overall system latency.)



Edit:
Oh, and since I don't think it's been mentioned yet...
DirectStorage is almost a 'standardized software implementation' of the thread's topic.
Gen5 SSDs are going to be (at least in raw bandwidth) 'up there with' older DRAM and graphics DRAM.
 
DirectStorage is almost a 'standardized software implementation' of the thread's topic.
Yeah, especially as one of its features is decompressing from memory (even better *if* that also means from system RAM to VRAM).
NAND flash can't just serve as a "RAM extension". It may in cases where you have a lot of slow-changing data; I don't know if games qualify.
 
Sacrificing bandwidth for capacity in a bandwidth-sensitive application sounds counterproductive to me.
And I agree with what agent_x007 said. Hardware solutions that add unnecessary, and costly, complexity should generally be avoided.

They could do modular GPUs, or add the ability to use the VRAM of a cheaper card in another PCIe slot, but they don't.

Because it doesn't make sense.
PCIe bandwidth, at its best, is a fraction of even a midrange GPU's memory bandwidth. The yet-to-go-mainstream PCIe 6.0 offers about 121 GB/s at x16. An old RTX 2060 has 336 GB/s of memory bandwidth.
And don't forget that you'd need two slots, and it's even rarer to have both in x16 mode. The typical high-end "gaming" motherboard is probably still stuck with a PCIe 5.0 x8/x8 config for dual-slot usage. At those speeds, even using the system's memory may be bottlenecked.
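Putting that gap in numbers (approximate usable per-direction rates; the RTX 2060 figure is its 192-bit GDDR6 at 14 Gbps):

```python
# PCIe x16 link bandwidth vs. one older midrange GPU's local VRAM.
pcie_x16 = {"PCIe 4.0": 32, "PCIe 5.0": 64, "PCIe 6.0": 121}  # GB/s
rtx_2060_vram = 336  # GB/s

for gen, bw in pcie_x16.items():
    print(f"{gen} x16: {bw} GB/s -> {bw / rtx_2060_vram:.0%} of an RTX 2060's VRAM")
    # an x8 slot (the typical dual-GPU config) halves these figures again
```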
 