
The nVidia memory bandwidth myth explained.

newtekie1

Semi-Retired Folder
Joined
Nov 22, 2005
Messages
28,473 (4.01/day)
Location
Indiana, USA
Processor Intel Core i7 10850K@5.2GHz
Motherboard AsRock Z470 Taichi
Cooling Corsair H115i Pro w/ Noctua NF-A14 Fans
Memory 32GB DDR4-3600
Video Card(s) RTX 2070 Super
Storage 500GB SX8200 Pro + 8TB with 1TB SSD Cache
Display(s) Acer Nitro VG280K 4K 28"
Case Fractal Design Define S
Audio Device(s) Onboard is good enough for me
Power Supply eVGA SuperNOVA 1000w G3
Software Windows 10 Pro x64
Ok, I've seen a few people say that nVidia's cards have more memory bandwidth, and hence will perform better in applications that use a lot of memory bandwidth, the reasoning being the larger memory bus on the GTX400 series cards. Well, this isn't really true.

The easiest way to explain it is probably a pretty table:
Card   | Memory Bus | Memory Clock | Memory Bandwidth
GTX480 | 384-bit    | 924MHz       | 177.4GB/s
GTX470 | 320-bit    | 837MHz       | 133.9GB/s
HD5870 | 256-bit    | 1200MHz      | 153.6GB/s
HD5850 | 256-bit    | 1000MHz      | 128.0GB/s
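
The bandwidth column is just arithmetic: bus width in bytes times the effective data rate, and GDDR5 moves 4 bits per pin per memory clock. A quick sketch (Python used purely for illustration):

```python
def gddr5_bandwidth_gb_s(bus_bits, clock_mhz):
    # GDDR5 transfers 4 bits per pin per memory clock cycle, so
    # bandwidth = (bus width in bytes) * (memory clock * 4), here in GB/s
    return bus_bits / 8 * clock_mhz * 4 / 1000

print(gddr5_bandwidth_gb_s(384, 924))   # GTX480 -> 177.408
print(gddr5_bandwidth_gb_s(320, 837))   # GTX470 -> 133.92
print(gddr5_bandwidth_gb_s(256, 1200))  # HD5870 -> 153.6
print(gddr5_bandwidth_gb_s(256, 1000))  # HD5850 -> 128.0
```

Plugging in the table's bus widths and clocks reproduces every bandwidth figure above.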

Yes, the GTX480 is on top in memory bandwidth, but the lead is not as big as some would expect. The reason? nVidia is using slow GDDR5 memory, almost like running last generation's memory on this generation's cards, and they pushed the memory bus width up to make up for it. Of course, the larger memory bus means more memory chips, more power consumption, and more signals to keep stable, which naturally leads to lower clock speeds too. Higher-clocked memory is also more expensive, and their cards already cost too much to produce. On top of that, the memory on the ATi cards overclocks better, again because there are fewer chips and they are using higher-quality memory, which means even an HD5870 can be overclocked to surpass an overclocked GTX480 in memory bandwidth, and an HD5850 can be overclocked to surpass an overclocked GTX470.
 
Joined
Mar 2, 2009
Messages
5,061 (0.86/day)
Processor AMD Ryzen 5 7600
Motherboard Gigabyte B650 Aorus Elite AX
Cooling Thermalright Peerless Assassin 120 SE
Memory Kingston Fury Beast DDR5-5600 16GBx2
Video Card(s) Gigabyte Gaming OC AMD Radeon RX 7800 XT 16GB
Storage TEAMGROUP T-Force Z440 2TB, SPower A60 2TB, SPower A55 2TB, Seagate 4TBx2
Display(s) AOC 24G2 + Xitrix WFP-2415
Case Montech Air X
Audio Device(s) Realtek onboard
Power Supply Be Quiet! Pure Power 11 FM 750W 80+ Gold
Mouse Logitech G Pro X Superlight Wireless
Keyboard Royal Kludge RK-S98 Tri-Mode RGB Mechanical Keyboard
Software Windows 10
Would the difference in memory size matter as well or not? If it does, by how much?

GTX 480 - HD 5870
1536MB > 1024MB (there are 2048MB configurations, but they are not as common)

GTX 470 - HD 5850
1280MB > 1024MB (there are 2048MB configurations, but they are not as common)
 


W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
28,613 (3.74/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
the memory chips are actually the same on both ati and nvidia. it's the memory controller inside the gpu and the signal routing on the pcb that makes the difference in memory clock
 
the memory chips are actually the same on both ati and nvidia. it's the memory controller inside the gpu and the signal routing on the pcb that makes the difference in memory clock

So custom PCBs really make a difference besides the material(s) used and the size of the PCB?
 

W1zzard

Administrator
Staff member
So custom PCBs really make a difference besides the material(s) used and the size of the PCB?

yes of course, if there was no difference then the ref design would be the cheapest solution possible, leaving no reason for custom pcb designs
 
Joined
Mar 4, 2006
Messages
448 (0.06/day)
Would the difference in memory size matter as well or not? If it does, by how much?
Not entirely sure, but I think it goes like this: as long as you have more memory than you're using, the memory with the highest bandwidth has the advantage. Once you're using more memory than you have, data has to be swapped in and out of that memory, and the bandwidth takes an extra hit because of it. In the second case, more memory, not faster memory, has the advantage.

I honestly can't think of any game (except GTA IV perhaps?) that requires more than 1024MB of memory at reasonable resolutions, but that's mostly because I have no clue how much any game uses. Maybe there's a tool for that? Could be interesting for reviews.
 

newtekie1

Semi-Retired Folder
Would the difference in memory size matter as well or not? If it does, by how much?

GTX 480 - HD 5870
1536MB > 1024MB (there are 2048MB configurations, but they are not as common)

GTX 470 - HD 5850
1280MB > 1024MB (there are 2048MB configurations, but they are not as common)

I think from what we've seen with W1z's reviews of the 2GB ASUS HD 5870, anything over 1GB makes next to no difference in current games, even at 2560x1600. We might see games in the future that show a difference, but I'm not counting on anything coming down the line any time soon.

the memory chips are actually the same on both ati and nvidia. it's the memory controller inside the gpu and the signal routing on the pcb that makes the difference in memory clock

Maybe on the GTX480, as they do use 1250MHz chips, but the GTX470 uses 1000MHz chips while even the HD5850 uses 1250MHz chips.

The memory controller also plays a big part in the clock speeds, which I forgot to mention, and from my understanding a larger bus makes the memory controller(s) work harder as well, which leads to lower clock speeds. It is similar to how a motherboard in single channel can clock the memory higher than in dual channel, but in the end single channel still has less total memory bandwidth.
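
The single- vs dual-channel trade-off above is easy to put into numbers. The clocks below are made up purely for illustration; the point is that the wider configuration wins even at a lower clock:

```python
def ddr_bandwidth_gb_s(channels, bits_per_channel, clock_mhz):
    # DDR transfers 2 bits per pin per clock; result in GB/s
    return channels * bits_per_channel / 8 * clock_mhz * 2 / 1000

single = ddr_bandwidth_gb_s(1, 64, 1000)  # single channel at a higher clock
dual   = ddr_bandwidth_gb_s(2, 64, 900)   # dual channel at a lower clock
print(single, dual)  # 16.0 vs 28.8 -> dual channel still has far more bandwidth
```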
 
Joined
Apr 10, 2010
Messages
1,880 (0.34/day)
Location
London
System Name Jaspe
Processor Ryzen 1500X
Motherboard Asus ROG Strix X370-F Gaming
Cooling Stock
Memory 16Gb Corsair 3000mhz
Video Card(s) EVGA GTS 450
Storage Crucial M500
Display(s) Philips 1080 24'
Case NZXT
Audio Device(s) Onboard
Power Supply Enermax 425W
Software Windows 10 Pro
Not entirely sure, but I think it goes like this: as long as you have more memory than you're using, the memory with the highest bandwidth has the advantage. Once you're using more memory than you have, data has to be swapped in and out of that memory, and the bandwidth takes an extra hit because of it. In the second case, more memory, not faster memory, has the advantage.

I honestly can't think of any game (except GTA IV perhaps?) that requires more than 1024MB of memory at reasonable resolutions, but that's mostly because I have no clue how much any game uses. Maybe there's a tool for that? Could be interesting for reviews.

GPU-Z and MSI Afterburner show how much memory a video card is using; apply 4xAA or 8xAA on some games and you'll see. I've seen Stalker CoP using 1200MB and Flatout: Ultimate Carnage with 32xAA + 8x supersampling using 800MB :)
 
Joined
May 16, 2008
Messages
1,258 (0.20/day)
Location
North Carolina
yes of course, if there was no difference then the ref design would be the cheapest solution possible, leaving no reason for custom pcb designs
So the reference design isn't the cheapest? Does that mean that custom PCBs (which you usually pay more for) are cheaper than the reference design?

If the above is true, why not make the reference design better and/or cheaper? I'm confused


Unrelated question: if memory chips are spec'ed at a certain frequency, why run them slower than that?
 

Wrigleyvillain

PTFO or GTFO
Joined
Oct 13, 2007
Messages
7,702 (1.20/day)
Location
Chicago
System Name DarkStar
Processor i5 3570K 4.4Ghz
Motherboard Asrock Z77 Extreme 3
Cooling Apogee HD White/XSPC Razer blocks
Memory 8GB Samsung Green 1600
Video Card(s) 2 x GTX 670 4GB
Storage 2 x 120GB Samsung 830
Display(s) 27" QNIX
Case Enthoo Pro
Power Supply Seasonic Platinum 760
Mouse Steelseries Sensei
Keyboard Ducky Pro MX Black
Software Windows 8.1 x64
So the reference design isn't the cheapest? Does that mean that custom PCBs (which you usually pay more for) are cheaper than the reference design?

If the above is true, why not make the reference design better and/or cheaper? I'm confused


Unrelated question: if memory chips are spec'ed at a certain frequency, why run them slower than that?

Well, there may be other details, but in the case of the 5850s at least, the non-reference models have a cheaper voltage regulator that does not support software voltage adjustment (that's a feature, and features cost money). It may be cheaper for them to design and produce their own coolers as well, but I'm speculating there...
 

newtekie1

Semi-Retired Folder
Joined
Nov 22, 2005
Messages
28,473 (4.01/day)
Location
Indiana, USA
Processor Intel Core i7 10850K@5.2GHz
Motherboard AsRock Z470 Taichi
Cooling Corsair H115i Pro w/ Noctua NF-A14 Fans
Memory 32GB DDR4-3600
Video Card(s) RTX 2070 Super
Storage 500GB SX8200 Pro + 8TB with 1TB SSD Cache
Display(s) Acer Nitro VG280K 4K 28"
Case Fractal Design Define S
Audio Device(s) Onboard is good enough for me
Power Supply eVGA SuperNOVA 1000w G3
Software Windows 10 Pro x64
So the reference design isn't the cheapest? Does that mean that custom PCBs (which you usually pay more for) are cheaper than the reference design?

If the above is true, why not make the reference design better and/or cheaper? I'm confused

The way I see it, there are two reasons to use a custom PCB.

1.) Use cheaper parts and a cheaper PCB layout to reduce costs.
2.) Use beefier parts and a better PCB layout to make the card better.

A good example of option 1 is most of the PCBs used in HD5830 cards, where the manufacturers cheaped out to make cheaper cards and maximize profits. The result was cards that consumed more power, despite being slower, than reference HD5850 cards. Another example is the PowerColor PCS+ HD5850, where the components were cheaper, like a voltage controller that doesn't allow voltage control.

A good example of option 2 is the ASUS HD5870 Matrix Platinum, where everything on the PCB was beefed way the hell up.

Unrelated question: if memory chips are spec'ed at a certain frequency, why run them slower than that?

Well, one big reason is the limitations of the memory controller, as it can only handle such a big memory bus at certain speeds, as W1z mentioned. Then there is also the possibility that the memory is running at below-spec voltages to reduce power consumption (and the GTX400 cards need every bit of help they can get there). I'm sure there are other reasons; those are just two that come to mind.
 

Benetanegia

New Member
Joined
Sep 11, 2009
Messages
2,680 (0.47/day)
Location
Reaching your left retina.
GPU-Z and MSI Afterburner show how much memory a video card is using; apply 4xAA or 8xAA on some games and you'll see. I've seen Stalker CoP using 1200MB and Flatout: Ultimate Carnage with 32xAA + 8x supersampling using 800MB :)

Yeah, when there's enough memory, a game will take as much as it can, but that doesn't mean it needs all that much, not when we are talking about 1024 MB and above. Most of that comes from textures and game data, and not all of that data is recent or will need to be reused anytime soon. Sometimes freeing up the memory may actually be less efficient than just leaving it there.

Also, loading something (e.g. a texture) into memory takes orders of magnitude less time than the time the texture will be in use. So even if data must be loaded for almost every frame, performance will not be degraded by more than 1-5%. In the past the output buffers occupied a very big part of the VRAM, back in the days when we had 128-256 MB, and hence AA and high resolutions could make video cards crawl. With 512 MB the effect started to shrink, because the size of the output buffers has not changed that much, and with more than 1 GB it becomes almost negligible, especially considering that newer techniques allow for more efficient use of the memory.
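
To see why output buffers stopped dominating VRAM, the sizes are easy to estimate. A sketch assuming 4 bytes per sample for color and the same for depth/stencil (real GPU layouts vary, so treat these as ballpark figures):

```python
def buffer_mb(width, height, bytes_per_sample=4, samples=1):
    # footprint of one buffer in MB; MSAA stores one sample per subpixel
    return width * height * bytes_per_sample * samples / 2**20

# color + depth at 2560x1600 with 4xMSAA
color = buffer_mb(2560, 1600, samples=4)
depth = buffer_mb(2560, 1600, samples=4)
print(color + depth)  # 125.0 MB: huge next to 256 MB of VRAM, small next to 1 GB+
```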

In that regard I'm really curious about what's going to happen with Carmack's improved megatextures and the octree data representation he talks about (if he finally implements it in id Tech 5/6). They could potentially make even 1GB of VRAM overkill.

Well, one big reason is the limitations of the memory controller, as it can only handle such a big memory bus at certain speeds, as W1z mentioned. Then there is also the possibility that the memory is running at below-spec voltages to reduce power consumption (and the GTX400 cards need every bit of help they can get there). I'm sure there are other reasons; those are just two that come to mind.

ECC memory comes to mind too. Aside from the fact that adding ECC support probably made the MC slower to boot, ECC memory is usually much slower, and they might not want to see their consumer GPUs crushing their Tesla cards in the CUDA/OpenCL programs that both Tesla and GeForce cards can run. If memory bandwidth were so much greater on the GeForces (e.g. 4800 MHz vs 3200 MHz), some/many CUDA apps would certainly run much, much better on the GeForces than on the Tesla cards. AMD and Intel have always done the same with their professional-grade CPUs: leave the fastest dies for the Xeon and Opteron lines so they look superior and can charge more. At stock they usually come as slower SKUs, but they almost invariably OC further, and with less voltage, than their consumer counterparts.
 

wahdangun

Guest
Yeah, when there's enough memory, a game will take as much as it can, but that doesn't mean it needs all that much, not when we are talking about 1024 MB and above. Most of that comes from textures and game data, and not all of that data is recent or will need to be reused anytime soon. Sometimes freeing up the memory may actually be less efficient than just leaving it there.

Also, loading something (e.g. a texture) into memory takes orders of magnitude less time than the time the texture will be in use. So even if data must be loaded for almost every frame, performance will not be degraded by more than 1-5%. In the past the output buffers occupied a very big part of the VRAM, back in the days when we had 128-256 MB, and hence AA and high resolutions could make video cards crawl. With 512 MB the effect started to shrink, because the size of the output buffers has not changed that much, and with more than 1 GB it becomes almost negligible, especially considering that newer techniques allow for more efficient use of the memory.

In that regard I'm really curious about what's going to happen with Carmack's improved megatextures and the octree data representation he talks about (if he finally implements it in id Tech 5/6). They could potentially make even 1GB of VRAM overkill.



ECC memory comes to mind too. Aside from the fact that adding ECC support probably made the MC slower to boot, ECC memory is usually much slower, and they might not want to see their consumer GPUs crushing their Tesla cards in the CUDA/OpenCL programs that both Tesla and GeForce cards can run. If memory bandwidth were so much greater on the GeForces (e.g. 4800 MHz vs 3200 MHz), some/many CUDA apps would certainly run much, much better on the GeForces than on the Tesla cards. AMD and Intel have always done the same with their professional-grade CPUs: leave the fastest dies for the Xeon and Opteron lines so they look superior and can charge more. At stock they usually come as slower SKUs, but they almost invariably OC further, and with less voltage, than their consumer counterparts.


hmm, I think the GTX 4xx don't use ECC, because they said it would make GeForce slower,

and btw, has anyone seen GTX 465 benchmarks?

it has more memory bandwidth and more SPs than the GTX 275, but the performance was the same :ohwell:, does anyone know how that happens?
 


Benetanegia

New Member
hmm, I think the GTX 4xx don't use ECC, because they said it would make GeForce slower,

and btw, has anyone seen GTX 465 benchmarks?

it has more memory bandwidth and more SPs than the GTX 275, but the performance was the same :ohwell:, does anyone know how that happens?

It doesn't indeed, but the memory controller does have ECC, and ECC support is not a magical checkbox somewhere. It's actual transistors implemented in actual silicon, and it's impossible to include more transistors between point A and point B without adding latency or potentially making the whole thing slower, unstable, or more sensitive to clock changes.

But apart from that, I was speaking of them deliberately crippling memory bandwidth on desktop cards so that they are not much faster than Tesla cards in certain CUDA applications. It just wouldn't make for good marketing of the Teslas, and that's where they will be making most of their money from GF100. Later, GF104, 106 and 108, and maybe even a GF102, will be released without all the extra things that GF100 includes solely for CUDA/OpenCL. That would potentially eliminate all the obstacles they found with GF100 and allow much faster clocks.
 

W1zzard

Administrator
Staff member
it's impossible to include more transistors between point A and point B without adding latencies or potentially making the whole thing slower

actually you can do exactly that with more transistors, basically do things in parallel
 

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,657 (7.43/day)
Location
Dublin, Ireland
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard Gigabyte B550 AORUS Elite V2
Cooling DeepCool Gammax L240 V2
Memory 2x 16GB DDR4-3200
Video Card(s) Galax RTX 4070 Ti EX
Storage Samsung 990 1TB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
ECC memory comes to mind too. Aside from the fact that adding ECC support probably made the MC slower to boot, ECC memory is usually much slower and they might not want to see their consumer GPUs crushing their Tesla cards in those CUDA/OpenCL programs that both Tesla and GeForces will be able to run.

Afaik, ECC is enabled only on Tesla; not GeForce and Quadro.
 

WSP

New Member
Joined
Aug 26, 2009
Messages
101 (0.02/day)
Location
Indonesia
Processor AMD PII X4 950BE | E5300
Motherboard MSI 770-G45 | Asus P5N32-E SLI bios 1903
Cooling Thermalright Ultra 120 eXtreme (mod) | stock
Memory 2x1Gb V-GEN PC10600 | 2gb kit Mushkin EM silverline
Video Card(s) 9800GT+9600GT | GTX260+GTS250
Storage Hitachi 160GB+Seagate 320GB | WD 160GB+Hitachi 160GB RAID 0
Display(s) Acer X193HQ | LG 710S
Case Enlight | caseless
Audio Device(s) Audigy SB0090 w/ ATP3 | no sound
Power Supply Toughpower 650W | VenomRX Boomslang modular 700W
Software Win7 x64
Benchmark Scores 14001 3DMark06 score with Manli 9600GT OCed 775-1175
so, basically, all Radeon HD5000 and GeForce GTX400 cards have the IMC embedded in the GPU, just like today's CPUs?
if so, I see ATI going Intel-like and NVIDIA going AMD-like with their IMCs. Intel's IMC usually allows higher memory overclocks; we commonly see more than 2000MHz with Intel, whereas AMD can't compete in memory clock speed.

that is completely beyond me. nvidia had a long time to prepare GF100 and they only came out with a low-speed IMC. cmiiw
 

Benetanegia

New Member
actually you can do exactly that with more transistors, basically do things in parallel

You are suggesting implementing twice the number of MCs? One for ECC and one for non-ECC? I don't understand what you mean, unless you are talking in general and not about this particular case. If you're talking in general, I agree, to an extent, but if not, I can't agree or disagree until I understand what you mean. :)
If you are saying what I think you are saying, I'm not sure that adding more silicon would help make Fermi faster at all.

For clarification, maybe I worded it badly, but by A/B I meant the input and output of the machine, in this case the MC. Adding something in parallel is not exactly adding something between A and B in the context I was speaking of. It's introducing two machines, one from C to D and one from E to F, both of which go to a switch or something, and it's the switch that has access to A and B. I hope that's clear.


 
Last edited by a moderator:
Joined
Jan 14, 2009
Messages
2,644 (0.44/day)
Location
...
System Name MRCOMP!
Processor 5800X3D
Motherboard MSI Gaming Plus
Cooling Corsair 280 AIO
Memory 64GB 3600mhz
Video Card(s) GTX3060
Storage 1TB SSD
Display(s) Samsung Neo
Case No Case... just sitting on cardboard :D
Power Supply Antec 650w
the GeForce series doesn't have ECC... at least it's not turned on.



The latter is the most interesting, as under normal circumstances implementing ECC requires a wider bus and additional memory chips. The GTX 400 series will not be using ECC, but we went ahead and asked NVIDIA how ECC will work on Fermi products anyhow.



The short answer is that when NVIDIA wants to enable ECC they can just allocate RAM for the storage of ECC data. When ECC is enabled the available RAM will be reduced by 1/8th (to account for the 9th ECC bit) and then ECC data will be distributed among the RAM using that reserved space. This allows NVIDIA to implement ECC without the need for additional memory channels, at the cost of some RAM and some performance.


From WIKI
http://en.wikipedia.org/wiki/GeForce_400_Series

While the Fermi architecture includes support for the ECC feature on chip,[17] there is no option to enable ECC on GeForce GTX 470 and 480 cards.


yes, the MC would most likely have the ECC transistors inside it, but they're turned off and would have no effect on the bandwidth.
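
The 1/8th reservation quoted above is simple arithmetic. A sketch (the function name is mine, and the 3 GB figure is just an example of a Tesla-class board, not something from this thread):

```python
def usable_vram_mb(total_mb, ecc_enabled):
    # Fermi's software ECC carves out 1/8 of VRAM to hold the extra ECC bit,
    # leaving 7/8 usable; with ECC off, all of it is available
    return total_mb * 7 / 8 if ecc_enabled else total_mb

print(usable_vram_mb(1536, False))  # GTX480: 1536 (ECC can't be enabled anyway)
print(usable_vram_mb(3072, True))   # a 3 GB Tesla-class board: 2688.0 usable
```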
 

Benetanegia

New Member
You are not getting what I'm saying, guys. It's not bandwidth that suffers. It's the fact that adding more transistors and more traces, even if they are not active, makes the travel from A to B longer; that's unavoidable, and it can potentially make the whole machine sitting between A and B slower. Light travels fast, but not so fast when we are talking about such small distances and such high clock speeds. It's probably the most essential concern in chip design.


And next time don't use that tone with me. Reminding me of my position, etc.

First of all, I think you misinterpreted the tone.

You know who I am, and I'm sure you remember my story in the "what's wrong with our forums" thread. It was made very clear to me that mods are like any other members. Now, I've seen many mods, you included, verbally punishing people for doing exactly what you did: posting unnecessarily before reading.

If you want to punish me, do it, but please don't use the power of that position you don't want me to mention to threaten me.
 
i honestly doubt less than 1 mm will affect the latency enough to notice a change in performance.

it's likely a distance of a few extra nm.. sure it's better to be closer, but the difference will be less than a nanosecond.

take a look at RAM for instance: it's a good 10 cm away in terms of trace distance, and it reaches 40 ns latency.

also, electricity in a wire doesn't flow as fast as light; 95% of the speed of light would be a closer guess.
 

W1zzard

Administrator
Staff member
You are not getting what I'm saying guys. It's not bandwidth which is damaged. It's the fact that adding more transistors and more traces, even if they are not active, will make the travel from A to B longer, that's unavoidable, and that can potentially make the whole machine sitting between A and B slower. Light travels fast, but not at all when we are talking about such small distances and Mhz. It's probably the most essential concern in chip designing.

c = 3*10^5 km/s = 3*10^11 mm/s = 300 mm per nanosecond.

So to travel the 100 mm to the memory, the signal needs 0.3 ns. Where do you think the rest of the time is spent, if not in the memory controller?

even at 1 GHz clock speed the latency of a single request is not 1 ns, it's much, much more
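
The point is easy to check numerically: trace propagation time is a rounding error next to the latency of the memory transaction itself. This sketch just reuses the numbers already cited in this thread (100 mm trace, ~40 ns DRAM access):

```python
c_mm_per_ns = 300.0   # speed of light: ~300 mm per nanosecond
trace_mm = 100        # GPU-to-memory trace length from the example above
travel_ns = trace_mm / c_mm_per_ns
total_latency_ns = 40  # typical DRAM access latency cited above

print(travel_ns)                      # ~0.33 ns on the wire
print(travel_ns / total_latency_ns)   # under 1% -- the controller/DRAM dominates
```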
 
c = 3*10^5 km/s = 3*10^11 mm/s = 300 mm per nanosecond.

So to travel the 100 mm to the memory, the signal needs 0.3 ns. Where do you think the rest of the time is spent, if not in the memory controller?

Could you define what c is? :p

or is it just a letter representing the final number?
 