
Nvidia seems to always have their own memory standard, why not everyone else?

That's basically what I said.

The X variant is something different that entices people to want the product. If they just used basic GDDR, there would be less of a draw and, as such, lower prices.

The failed HBM experiment was a bad move on AMD's part trying to do the same, but luckily they reverted to basic GDDR afterwards.
It's not just different - the point I was making is that Nvidia gambled on NOT addressing the increased demand on their memory system with expensive HBM, which was also hard to source and made the GPUs using it harder to produce. Basic GDDR5 simply wasn't an option to cover the performance offered by anything as fast as or faster than Vega. Both AMD and Nvidia saw this. A big cache apparently wasn't an option either.
 
What's the difference between GDDR6 and GDDR6X?
- One letter more.
PAM4 signaling.
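For anyone curious what that actually means: NRZ sends 1 bit per symbol over two voltage levels, while PAM4 packs 2 bits per symbol over four levels, so the same pin at the same symbol rate carries twice the data. A toy sketch of just the bits-per-symbol difference (the real GDDR6X PHY does far more: training, equalization, error handling, and the level mapping here is arbitrary, not the actual coding):

```python
# Toy illustration only: NRZ (PAM2) vs PAM4 bits per symbol.
# The level mapping is arbitrary, not the actual GDDR6X line coding.

def nrz_encode(bits):
    # 1 bit per symbol, two levels (0 or 1)
    return list(bits)

def pam4_encode(bits):
    # 2 bits per symbol, four levels (0..3); assumes an even number of bits
    return [(bits[i] << 1) | bits[i + 1] for i in range(0, len(bits), 2)]

data = [1, 0, 1, 1, 0, 0, 1, 0]
print(nrz_encode(data))   # 8 symbols: [1, 0, 1, 1, 0, 0, 1, 0]
print(pam4_encode(data))  # 4 symbols: [2, 3, 0, 2]
```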

Let's compare the 7800 XT and the 4070 Ti. The former uses a 256-bit bus with 19.5 Gbps GDDR6, resulting in 624.1 GB/s total bandwidth. The latter uses a 192-bit bus with 21 Gbps GDDR6X, resulting in 504.2 GB/s. To achieve similar results with GDDR6 (non-X), Nvidia would have had to use a 256-bit bus, which means a larger GPU die and more VRAM chips on a more complex PCB. GDDR6X saves costs on these fronts.
You are right and you are making a really good point. For some reason I did not really look at the bandwidth similarity in this case.

As a side note, this only makes me want to see someone do a real head-to-head with 4070Ti and 7800XT. 60CU/SM, same memory bandwidth - this is as close a comparison of RDNA3 vs Ada as we can get. Forcing both to run at same/similar clock speed would give a cool comparison point. I bet RDNA3 and Ada are pretty even in performance architecturally.
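If anyone wants to sanity-check those bandwidth figures, it's just bus width times per-pin data rate. A quick sketch (numbers rounded; the 256-bit GDDR6X config at the end is hypothetical, not a real SKU):

```python
# Bandwidth (GB/s) = bus width in bits / 8 * per-pin data rate in Gbps
def bandwidth_gbs(bus_width_bits, data_rate_gbps):
    return bus_width_bits / 8 * data_rate_gbps

print(bandwidth_gbs(256, 19.5))  # RX 7800 XT, GDDR6           -> 624.0
print(bandwidth_gbs(192, 21.0))  # RTX 4070 Ti, GDDR6X         -> 504.0
print(bandwidth_gbs(256, 21.0))  # hypothetical 256-bit GDDR6X -> 672.0
```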
 
I wouldn't assume that GDDR6X chips are freely available to non-Nvidia corporations. They may have NV patents baked in, possibly related to PAM4 encoding, link training, etc.
 
Thanks, Raja, for that stunt and for his professional leadership of the development process. He had all the resources to foresee a bad product. He was hired for his ability to foresee it just by looking at the design drawings. It ain't the memory at fault here.

GDDR6, X or not... are GPUs that memory speed starved? Definitely not.

I don't agree with blaming Raja Koduri like a boogeyman. Reddit made this a trope among AMD fans, and they love to pin it all on him and pretend AMD has had no problems ever since he left. If there was one blameless fellow in their midst, it was Raja. With what he had to work with - GCN - it couldn't have panned out better than it did. Polaris represents AMD's most successful architecture in their history, and Fiji/Vega represent AMD's foray into high-performance GPU computing, the foundation of which is still used in CDNA 2 today. I consider his work at AMD a complete success.

The things that AMD did wrong, they still do wrong and they insist on doing wrong, with no outlook of improvement in sight.

HBM was developed in collaboration with AMD and was used first on Fiji. So NV simply couldn't get it before Fiji launched (IF they REALLY wanted it). Aside from that, in 2015 HBM 1.0 was probably too much hassle for NV to make work, on top of being very capacity limited AND not really needed at that point (since HBM didn't help AMD beat Maxwell 2.0).
Vega 10 was made with HBM2 in mind, and you can't simply switch memory tech midway because "it's too expensive to implement". Also, Vega 10 needed HBM2 to not blow past its power budget too fast (as every MHz was needed to counter Pascal's price/performance metric).

Titan V was released in December 2017, and it was at least 50% cheaper than the previous HBM2 GPU (Quadro GP100) :D (/s)
And it is faster than Vega 20, which launched a bit over a year later (even with one HBM2 stack disabled vs. Vega 20).

The initial version of HBM was indeed developed in a joint venture by AMD and Hynix, but that doesn't mean much, just like the development of the X variants of G5 and G6. The Quadro GP100 was released in October 2016, so a year after Fiji but a year before Volta, though I believe they were quite rare.
 
PAM4 signaling.

You are right and you are making a really good point. For some reason I did not really look at the bandwidth similarity in this case.

As a side note, this only makes me want to see someone do a real head-to-head with 4070Ti and 7800XT. 60CU/SM, same memory bandwidth - this is as close a comparison of RDNA3 vs Ada as we can get. Forcing both to run at same/similar clock speed would give a cool comparison point. I bet RDNA3 and Ada are pretty even in performance architecturally.
It's not exactly the same bandwidth: 624 vs 504 GB/s (even with GDDR6X), but I get your point.
 
I don't agree with blaming Raja Koduri like a boogeyman.
I mean, he was the guy in charge. GCN was clearly compute oriented and stayed that way throughout its lifespan, as it was used as the basis for CDNA, an architecture literally designed to be compute only. They could have made Vega much faster at raster by simply adding more ROPs and restructuring the CUs, for example; they chose not to. It's only natural to assume that at the very least most of those choices were his doing.
 
I mean, he was the guy in charge. GCN was clearly compute oriented and stayed that way throughout its lifespan, as it was used as the basis for CDNA, an architecture literally designed to be compute only. They could have made Vega much faster at raster by simply adding more ROPs and restructuring the CUs, for example; they chose not to. It's only natural to assume that at the very least most of those choices were his doing.

I don't think it's that simple, you can't just add more ROPs and expect that to fix the other issues of the architecture.
 
I don't think it's that simple, you can't just add more ROPs and expect that to fix the other issues of the architecture.

It actually is, as we can see from how cut-down GPUs sold as different SKUs perform. The tough part is making that into silicon, plus the R&D cost of such a specific design; i.e. it is planning. But the math remains simple.

After the Vega launch and the directors' meeting, he packed his stuff and went on sabbatical, you get the idea? He basically admitted it.

I don't agree that memory speed actually matters that much. There has always been an architectural sweet spot where further gains are minimal... basically, if you design with that in mind, the penalty of using slower but cheaper and more readily available memory is outweighed by the savings. It's all about balancing.

What you don't do is make a product with yet faster memory and claim it can cope with the fact that it ain't enough because of caching etc... that was a pretty low bar...
 
I don't think it's that simple, you can't just add more ROPs and expect that to fix the other issues of the architecture.
Vega clearly had the shading power; something else was the bottleneck in the pipeline, and that was likely the number of ROPs. Both the 1080 and Vega 64 had 64 ROPs, so it was clear Vega was never going to be faster than that.

RDNA1 also had 64 ROPs and, what do you know, it was still not much faster than Vega. Then RDNA2 came along with a CU not much different from RDNA1's but with double the ROPs, and boom, up to 2.5x faster than Vega.
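Rough back-of-the-envelope for the ROP argument: peak pixel fill rate is just ROPs times clock. The boost clocks below are approximate, so treat this as a ballpark sketch rather than spec-sheet numbers:

```python
# Peak pixel fill rate (Gpixel/s) = ROPs * clock (GHz)
# Boost clocks are approximate, for ballpark comparison only.
def fill_rate_gpix(rops, clock_ghz):
    return rops * clock_ghz

print(fill_rate_gpix(64, 1.55))   # Vega 64     ~ 99 Gpix/s
print(fill_rate_gpix(64, 1.73))   # GTX 1080    ~111 Gpix/s
print(fill_rate_gpix(64, 1.90))   # RX 5700 XT  ~122 Gpix/s
print(fill_rate_gpix(128, 2.25))  # RX 6900 XT  ~288 Gpix/s
```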
 
Vega clearly had the shading power; something else was the bottleneck in the pipeline, and that was likely the number of ROPs. Both the 1080 and Vega 64 had 64 ROPs, so it was clear Vega was never going to be faster than that.

RDNA1 also had 64 ROPs and, what do you know, it was still not much faster than Vega. Then RDNA2 came along with a CU not much different from RDNA1's but with double the ROPs, and boom, up to 2.5x faster than Vega.

Having strong raw compute doesn't always mean that a GPU will excel at graphics, though. I suppose the biggest example of that would be Kepler.

It actually is, as we can see from how cut-down GPUs sold as different SKUs perform. The tough part is making that into silicon, plus the R&D cost of such a specific design; i.e. it is planning. But the math remains simple.

After the Vega launch and the directors' meeting, he packed his stuff and went on sabbatical, you get the idea? He basically admitted it.

I don't agree that memory speed actually matters that much. There has always been an architectural sweet spot where further gains are minimal... basically, if you design with that in mind, the penalty of using slower but cheaper and more readily available memory is outweighed by the savings. It's all about balancing.

What you don't do is make a product with yet faster memory and claim it can cope with the fact that it ain't enough because of caching etc... that was a pretty low bar...

And what would be the implications of that for yield, die size, performance balance, etc.? There is a sweet spot for memory, but the issue wasn't the use of HBM (Vega 10 only had two stacks of HBM, and honestly 484 GB/s wasn't exactly groundbreaking even at the time; it was on the high end, sure, but not anything unheard of thanks to GP102).
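For reference, that 484 GB/s figure falls out of the same bandwidth math, just with HBM's very wide and relatively slow interface (a sketch with approximate Vega 10 numbers):

```python
# Vega 10 HBM2 (approximate): two stacks, 1024-bit each, ~1.89 Gbps per pin
stacks, bits_per_stack, data_rate_gbps = 2, 1024, 1.89
print(stacks * bits_per_stack / 8 * data_rate_gbps)  # ~483.8 GB/s
```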

In retrospect, Raja was poached at an early stage because Intel had dGPU plans. The two years or so before that happened were just enough time to get Alchemist off the floorplan, and in the time since, Intel has poached a lot of people in the graphics industry, including a large number of driver engineers and user experience folks from AMD.
 