AMD Big Navi GPU Features Infinity Cache?

Valantar · Oct 8, 2020

Vayra86 said:
It's just another update to GCN, a good one, I won't deny that... but it's no different from Maxwell > Pascal for example, and everyone agrees that is not a totally new arch either. They moved bits around, etc.

Unless you want to argue that this
View attachment 171199

Is radically different from this

View attachment 171200

I don't think "new architecture" necessarily has to mean "we invented new functional blocks" - if that's the requirement, there has barely been a new GPU architecture since the introduction of unified shaders...

If we're going by those block diagrams - ignoring the fact that block diagrams are themselves extremely simplified representations of something far more complex, and assuming that they accurately represent the silicon layout - we see quite a few changes. Starting from the right: the L1 cache is now an L0 vector cache (which raises the question of what is now L1, and where it is); the local data share has moved next to the texturing units rather than sitting between the SPs; SPs and vector registers are in groups twice as large; the scheduler is dramatically shrunk, split up and distributed closer to the banks of SPs; the number of scalar units and registers is doubled; there are two entirely new caches in between the banks of SPs, seemingly shared between the two CUs in the new Work Group Processor unit; and lastly there's no longer a branch & message unit in the diagram at all.

Sure, these look superficially similar, but expecting a complete ground-up redesign is unrealistic (there are only so many ways to make a GPU compatible with modern APIs, after all), and there are quite drastic changes even to the block layout here, let alone the actual makeup of the different parts of the diagram. These look the same only if you look from a distance and squint. Similar? Sure. But definitely not the same. I would think the change from Kepler to Maxwell is a much more fitting comparison than Maxwell to Pascal.

nguyen said:
I have to remind you that
Vega 64 vs GTX 1080
Vega10 vs GP104
495mm2 vs 314mm2
484GBps vs 320GBps
300W vs 180W TDP

And Vega64 still lost to GTX 1080. Yeah Pascal kinda devastated AMD for the past 4 years. 1080 Ti (and also Titan XP) still has no worthy competition from AMD. Ampere is here so that the massive number of Pascal owners can upgrade.

That's true. But then you have
RX 5700 XT vs RTX 2070
Navi 10 vs TU106
251mm² vs 445mm²
448GBps vs. 448GBps
225W vs. 175W TDP

Of course this generation AMD has a node advantage, and the 5700 XT still loses out significantly in terms of efficiency in this comparison (though not at all if looking at versions of the same chip clocked more conservatively, like the 5600 XT, which beats every single RTX 20xx GPU in perf/W).

Ampere represents a significant density improvement for Nvidia, but it's nowhere near bringing them back to the advantage they had with Pascal vs. Vega.

Zach_01 · Oct 8, 2020

Vayra86 said:
It's just another update to GCN, a good one, I won't deny that... but it's no different from Maxwell > Pascal for example, and everyone agrees that is not a totally new arch either. They moved bits around, etc.

Unless you want to argue that this
View attachment 171199

Is radically different from this

View attachment 171200

So basically a better than Turing to Ampere situation... with a Jensen x2 perf uplift, that is in reality x1.2

Ampere is looking good only because Turing was so bad, over Pascal. Perf and price wise.

nguyen · Oct 8, 2020

Valantar said:
That's true. But then you have
RX 5700 XT vs RTX 2070
Navi 10 vs TU106
251mm² vs 445mm²
448GBps vs. 448GBps
225W vs. 175W TDP

Of course this generation AMD has a node advantage, and the 5700 XT still loses out significantly in terms of efficiency in this comparison (though not at all if looking at versions of the same chip clocked more conservatively, like the 5600 XT, which beats every single RTX 20xx GPU in perf/W).

Ampere represents a significant density improvement for Nvidia, but it's nowhere near bringing them back to the advantage they had with Pascal vs. Vega.

The 5700 XT doesn't have any RT/Tensor cores, which makes a comparison between the 5700 XT and the 2070 a bit moot. The 2070 is like a car with a turbocharger that people disable because it would be unfair to the non-turbocharged cars.
Here is a fair comparison: Crysis Remastered with vendor-agnostic RT that can leverage the RT cores on Turing.

The 2070 Super has something like 3x the performance of the 5700 XT when there are a lot of RT effects (the 2070 and 2070 Super are 15% apart).

Right now against Ampere, the node advantage that Big Navi has is so tiny that it's not strange that Navi21 XT 530mm2 is competing against GA104 394mm2. Also Navi21 XT is a cut down version much like 3080. The full fat Navi21 XTX version will be reserved for the Pro version where AMD has better margin selling them.

Zach_01 · Oct 8, 2020

nguyen said:
Right now against Ampere, the node advantage that Big Navi has is so tiny that it's not strange that Navi21 XT 530mm2 is competing against GA104 394mm2.

Where is this info? Because I can say that Navi 21 536mm2 is competing against GA102 628mm2

Vayra86 · Oct 8, 2020

Zach_01 said:
So basically a better than Turing to Ampere situation... with a Jensen x2 perf uplift, that is in reality x1.2

Ampere is looking good only because Turing was so bad, over Pascal. Perf and price wise.

I agree on that completely actually

but I don't think that was the topic, was it?

The problem is, however, that AMD has yet to even reach Turing's peak performance level, and not just by a few % either. You can afford 'a Turing' when you're ahead of the game, otherwise it just sets you back further. Let's be real about it: RDNA2 as it is dreamt to be should've been here 1.5 years ago at the latest. The fact they're launching it now though is still good progress... like I said earlier. Time to market seems to have improved. If they can also get closer on absolute performance, I'll be cheering just as much as you.

The problem with Navi so far is that I have absolutely no reason for cautious optimism. AMD has been silent about it other than some vague percentages that really say as much as Ampere's very generously communicated 1.9x performance boost. As much as that number is far from credible.... why would this one suddenly be the truth? These claims have and will always be heavily inflated and best-case. Other than that, we do know AMD has severe limitations to work with, most notably on memory. Anyway... this has all been said before, but that's where I'm coming from here. Not an anti-AMD crusade... just realism and history of progress. I really want them to catch up, but I don't feel like the stars have aligned yet.

nguyen · Oct 8, 2020

Zach_01 said:
Where is this info? Because I can say that Navi 21 536mm2 is competing against GA102 628mm2

Yeah sure, maybe in AOTS benchmark.

Vayra86 · Oct 8, 2020

nguyen said:
The 5700 XT doesn't have any RT/Tensor cores, which makes a comparison between the 5700 XT and the 2070 a bit moot. The 2070 is like a car with a turbocharger that people disable because it would be unfair to the non-turbocharged cars.
Here is a fair comparison: Crysis Remastered with vendor-agnostic RT that can leverage the RT cores on Turing.

The 2070 Super has something like 3x the performance of the 5700 XT when there are a lot of RT effects (the 2070 and 2070 Super are 15% apart).

Right now against Ampere, the node advantage that Big Navi has is so tiny that it's not strange that Navi21 XT 530mm2 is competing against GA104 394mm2. Also Navi21 XT is a cut down version much like 3080. The full fat Navi21 XTX version will be reserved for the Pro version where AMD has better margin selling them.

Please remove that ugly fart of a benchmark video because Crysis Remastered runs like shit regardless of GPU. It's not doing you - or anyone else - any favors to use it as a comparison.

You're almost literally looking at a PS3 engine here. Single threaded.

Zach_01 · Oct 8, 2020

nguyen said:
Yeah sure, maybe in AOTS benchmark.

You still do not answer anything. Where exactly are you basing the poor assumption that RDNA2 NAVI will compete only with GA104? Based on the 256bit bus?

Vayra86 said:
I agree on that completely actually but I don't think that was the topic, was it?

The problem is, however, that AMD has yet to even reach Turing's peak performance level, and not just by a few % either. You can afford 'a Turing' when you're ahead of the game, otherwise it just sets you back further. Let's be real about it: RDNA2 as it is dreamt to be should've been here 1.5 years ago at the latest. The fact they're launching it now though is still good progress... like I said earlier. Time to market seems to have improved. If they can also get closer on absolute performance, I'll be cheering just as much as you.

The problem with Navi so far is that I have absolutely no reason for cautious optimism. AMD has been silent about it other than some vague percentages that really say as much as Ampere's very generously communicated 1.9x performance boost. As much as that number is far from credible.... why would this one suddenly be the truth? These claims have and will always be heavily inflated and best-case. Other than that, we do know AMD has severe limitations to work with, most notably on memory. Anyway... this has all been said before, but that's where I'm coming from here. Not an anti-AMD crusade... just realism and history of progress. I really want them to catch up, but I don't feel like the stars have aligned yet.

It's the overly pessimistic thoughts and the presented doomed future of AMD's graphics division that are making me take part in this discussion.
My thoughts are different. I can almost see a repeat of "history", but not the one that negative-to-RDNA people see.
This could be a new ZEN case, with RDNA1 being the ZEN and RDNA2 being ZEN2/3.

I guess we will see in 20 days

InVasMani · Oct 8, 2020

Here are some of my personal takes and thoughts surrounding RDNA2 from back on Wednesday, June 12th 2019.

"The increased R&D budget should help bolster AMD's graphics division for what comes after NAVI. The transition down to 7nm or 7nm+ for Nvidia will be a nice jump in performance though at the same time for them. What AMD has planned for what follows NAVI is somewhat critical. They can't let their foot off the gas and need to accelerate their plans a bit and be more aggressive.

AMD should probably aim for

3X more instruction rate over NAVI for its successor
3X to 4X further lossless compression
increase ROPs from 64 to 80
improve the texture filter units by 0.5X
improve texture mapping units by 0.5X to 1.5X (allowing for a better ratio of TFUs to TMUs)
3 CU resource pooling
7nm+ or node shrink
more GDDR capacity; hopefully by the time a successor arrives we could see more per-chip GDDR6 capacity or a price reduction
higher clocked GDDR

Bottom line, I think AMD should really try to be more aggressive and further optimize the efficiency of its design, and hopefully bump up frequency a bit as well. I don't think they need more stream processors right now, but rather need to improve overall efficiency to get more out of them. They should also aim to offer a few more GPU SKUs to consumers at different price targets. I tend to think that if they do that, they might even be able to cut down chips to offer some good 2X or even 3X dual/triple GPUs based on PCIe 4.0, which could be good. I think if they could make the ROPs scale from 44/64/80 it would work well for just that type of thing, allowing better yields and binning options for AMD to offer to consumers.

Those are my optimistic, aggressive expectations of what AMD should aim towards for NAVI's successor, if the R&D budget allows for it at least. They should really make some attempt to leapfrog ahead a bit further, especially as Nvidia will be shrinking down to a smaller node for whatever comes after Turing or "SUPER", since that sounds like more of a simple refresh and rebadge with a new, bigger high-end Super Titan SKU added, because what else would they name it instead, 2080 Ti Super, why!?!?"

"Nvidia's GPUs are in general more granular in terms of workload management, and thus power and efficiency. AMD needs to step it up more. It's not that AMD GPUs can't be efficient, but in order for a GPU like Vega 56/64 to compete with Nvidia's higher-end and more diverse offerings, they have to stray further from power and efficiency, so they end up looking less efficient and more power hungry than they could be under more ideal circumstances, with a better budget to design more complex and granular GPUs as Nvidia does. It boils down to price segments and where they are marketed by both companies, but it's a more uphill battle for AMD given the R&D budget. The transition to 7nm was a smart call for AMD at least, since it'll get cheaper over time along with yields and binning improvements. It should make for an easier transition to 7nm+ as well. Finer power gating would probably help AMD a fair amount as well in improving TDP for load and idle, and it will become more important anyway at smaller node sizes to reduce voltages and waste heat. Plus it's important for mobile, which is an area with big room for growth for the company."

It was mostly just goal posts to aim toward, but it should be intriguing to see where AMD went with the design. Obviously they had a bit of a basis with RDNA and, by extension, GCN and everything before them. Looking at it today, 64/80 ROPs cards seem plausible while 44 ROPs is highly doubtful. It was unlikely even then, even in an mGPU scenario, but who knows, maybe they'll surprise us. Though if they were doing an mGPU solution, I'd think they'd incorporate the Infinity Fabric bridge that the Radeon Pro workstation card came up with, as well as the rumored Infinity Cache; the combination would probably do a lot to reduce the latency and micro-stutter problem.

Looking at it now, I tend to think a 72 ROPs and possibly an 88 ROPs card is plausible to consider. If they can carve out a 64/72/80 ROPs segmentation, that would probably be decent, having 4 cards in total. Given the number of SKUs Nvidia often offers, that could be good for AMD. The interesting part about an 88 ROPs SKU is that they could bin those chips, set them aside, and hold out to use them with GDDR6X a little further down the road when pricing for that normalizes and/or capacity increases. If they do it that way with the ROPs, I could see 224-bit/256-bit/288-bit/320-bit memory bus options as plausible, with Infinity Cache balancing them out further.

To me it looks like they are sticking to a single-chip solution, so they probably bolstered the ROPs a fair bit along with some of those other areas mentioned. I think the highest-end SKU will end up with either 80 or 88 ROPs. It might initially be 80 ROPs, with some premium 88 ROPs SKUs being binned and tucked away for a rainy day, though it's tough to say. What rabbits will AMD pull out of its hat of mysteries!!? Who knows, but it's certainly fun to speculate. I do hope they took some of those bullet points into consideration.

Improving the compression would be a big deal; I think they've fallen behind in that area relative to Nvidia. Some of the other stuff I felt would offer some synchronizing benefits along with improved design performance and/or efficiency enhancements. I think I looked at VEGA and RDNA block diagrams to get a basic idea and figure out just how they might take things a step further for RDNA2, based on the changes made from VEGA to RDNA plus some of my own personal additions. To me it was quite obvious they were trailing Nvidia and needed to make a big push on RDNA2, given Nvidia was already going to get a nice performance boost from the die shrink.

I feel like over-engineering RDNA2 is the only practical way AMD can claw its way back ahead of Nvidia, especially this GPU refresh round, since they were already at 7nm. I don't know if RDNA2 will be on 7nm+ or not, which would help and naturally be welcome. Nvidia will likely have higher priority on GDDR6X for some time as well. In a sense, AMD probably knowing that Nvidia would end up with higher priority on that newer memory type lends some credibility to the possibility of an Infinity Cache to offset it, especially if they combine it with a slightly wider memory bus. To me a big key is how well they segment the different GPU SKUs. On the memory side there are different scenarios at play: do they have some SKUs with more ROPs and a wider memory bus with GDDR6X or HBM2!? Is the Infinity Cache scaled depending on SKU, how big is it, and is it available for all SKUs? Lots of possibilities. And what are they doing with compression? I sure hope they are making inroads to improve that.

Vayra86 · Oct 8, 2020

Zach_01 said:
You still do not answer anything. Where exactly are you basing the poor assumption that RDNA2 NAVI will compete only with GA104? Based on the 256bit bus?

It's the overly pessimistic thoughts and the presented doomed future of AMD's graphics division that are making me take part in this discussion.
My thoughts are different. I can almost see a repeat of "history", but not the one that negative-to-RDNA people see.
This could be a new ZEN case, with RDNA1 being the ZEN and RDNA2 being ZEN2/3.

I guess we will see in 20 days

We will indeed and yes... AMD's predicted doom is like the predicted demise of PC gaming. It never happens

nguyen · Oct 8, 2020

Zach_01 said:
You still do not answer anything. Where exactly are you basing the poor assumption that RDNA2 NAVI will compete only with GA104? Based on the 256bit bus?

For the past decade AMD has never once produced a card that has less memory bandwidth than Nvidia's counterpart but outperforms it:
HD 7970 vs GTX 680 (264GBps vs 192GBps)
R9 290X vs GTX 780 Ti (320GBps vs 336GBps)
Fury X vs 980 Ti (512GBps vs 336 GBps)
Vega64 vs 1080 Ti (484GBps vs 484GBps)
RadeonVII vs 2080Ti (1024GBps vs 616GBps)
5700XT vs 2080 (448GBps vs 448GBps)

And now you think AMD can just make a card with 448GBps bandwidth that can compete with a 760GBps card from Nvidia. Keep on dreaming buddy, or play AoTS.

AMD was really hoping Nvidia would name the GA104 as the 3080 just like they did the 2080, but nope Nvidia is serious about burying the next Gen consoles this time around.
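
For reference, the bandwidth figures in this argument all come out of the same arithmetic: bus width in bits times per-pin data rate, divided by 8 bits per byte. A rough Python sketch follows. The function name is just illustrative, and the 16 Gbps line is a purely hypothetical configuration, not a known Big Navi spec.

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8


# Shipping cards: RX 5700 XT uses 14 Gbps GDDR6 on a 256-bit bus,
# RTX 3080 uses 19 Gbps GDDR6X on a 320-bit bus.
rx_5700_xt = bandwidth_gb_s(256, 14)
rtx_3080 = bandwidth_gb_s(320, 19)

# Hypothetical (assumption): a 256-bit bus paired with 16 Gbps GDDR6.
hypothetical_256bit = bandwidth_gb_s(256, 16)

print(f"RX 5700 XT: {rx_5700_xt:.0f} GB/s")
print(f"RTX 3080: {rtx_3080:.0f} GB/s")
print(f"256-bit @ 16 Gbps (hypothetical): {hypothetical_256bit:.0f} GB/s")
```

So a 256-bit bus lands somewhere around 450-510 GB/s with ordinary GDDR6, which is where the gap to the 3080's 760 GB/s comes from.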

bug · Oct 8, 2020

Zach_01 said:
You still do not answer anything. Where exactly are you basing the poor assumption that RDNA2 NAVI will compete only with GA104? Based on the 256bit bus?

It's the overly pessimistic thoughts and the presented doomed future of AMD's graphics division that are making me take part in this discussion.
My thoughts are different. I can almost see a repeat of "history", but not the one that negative-to-RDNA people see.
This could be a new ZEN case, with RDNA1 being the ZEN and RDNA2 being ZEN2/3.

I guess we will see in 20 days

Well, if your best argument is Zen, I should point out Zen was a completely new design. RDNA2 is a refresh of something that was good, better than expected even, but ultimately fell short. And it fell short while the competition was severely overpriced.

gruffi · Oct 9, 2020

Vayra86 said:
They'll end up between 3070 and 3080, but they won't fight the 3080 with that bandwidth. Just not happening.

HD 7970: 384 bit, 264 GB/s
GTX 680: 256 bit, 192.3 GB/s

With ~27% less memory bandwidth GTX 680 could compete with HD 7970.

R9 390X: 512 bit, 384.0 GB/s
GTX 980: 256 bit, 224.4 GB/s

With ~42% less memory bandwidth GTX 980 could more than just compete with R9 390X.

RX Vega 64: 2048 bit, 483.8 GB/s
GTX 1080: 256 bit, 320.3 GB/s

With ~34% less memory bandwidth GTX 1080 could compete with RX Vega 64.

You see, bandwidth alone doesn't say much. You have to get the whole picture. RTX 3080 has a bandwidth of 760.3 GB/s. Big Navi is expected to have a bandwidth of >500 GB/s. Which might be something like 30-35% less bandwidth than 3080. But as you can see, you can compete with even such a deficit if your architecture is well optimized.
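
If anyone wants to check the percentages above, here is a quick Python sketch of the arithmetic. The helper name is made up for illustration, and the last entry uses 500 GB/s only as a stand-in for the rumored ">500 GB/s", so treat that one as an assumption rather than a spec.

```python
def percent_less(lower: float, higher: float) -> float:
    """How much smaller `lower` is than `higher`, in percent."""
    return (1 - lower / higher) * 100


# (lower-bandwidth card, higher-bandwidth card) in GB/s, as listed above
pairs = {
    "GTX 680 vs HD 7970": (192.3, 264.0),
    "GTX 980 vs R9 390X": (224.4, 384.0),
    "GTX 1080 vs RX Vega 64": (320.3, 483.8),
    # Assumption: 500 GB/s stands in for the rumored Big Navi figure
    "500 GB/s card vs RTX 3080": (500.0, 760.3),
}

deficits = {name: percent_less(low, high) for name, (low, high) in pairs.items()}

for name, deficit in deficits.items():
    print(f"{name}: ~{deficit:.0f}% less bandwidth")
```

Anything faster than 500 GB/s on the Big Navi side only shrinks that last deficit.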

Zach_01 · Oct 9, 2020

bug said:
Well, if your best argument is Zen, I should point out Zen was a completely new design. RDNA2 is a refresh of something that was good, better than expected even, but ultimately fell short. And it fell short while the competition was severely overpriced.

So now you know what kind of architecture they cooked up this round... and that is your argument. Based on what? ..on the bad choices that AMD made in the past?
Well, tell us more and give us the spoils! This is indeed entertaining.

Vayra86 · Oct 9, 2020

gruffi said:
HD 7970: 384 bit, 264 GB/s
GTX 680: 256 bit, 192.3 GB/s

With ~27% less memory bandwidth GTX 680 could compete with HD 7970.

R9 390X: 512 bit, 384.0 GB/s
GTX 980: 256 bit, 224.4 GB/s

With ~42% less memory bandwidth GTX 980 could more than just compete with R9 390X.

RX Vega 64: 2048 bit, 483.8 GB/s
GTX 1080: 256 bit, 320.3 GB/s

With ~34% less memory bandwidth GTX 1080 could compete with RX Vega 64.

You see, bandwidth alone doesn't say much. You have to get the whole picture. RTX 3080 has a bandwidth of 760.3 GB/s. Big Navi is expected to have a bandwidth of >500 GB/s. Which might be something like 30-35% less bandwidth than 3080. But as you can see, you can compete with even such a deficit if your architecture is well optimized.

Did you read the news yet?

Besides, what you're saying is true but AMD doesn't have the Nvidia head start of better delta compression (980, 1080) at any point in time. It's not optimization, it's feature set that made that possible.

Also, the 7970 did age a whole lot better, as in several years of actual practical use out of it, than the 680. Not directly due to bandwidth, but capacity. In none of the three examples is bandwidth the true factor making the difference, really. Nvidia just had a much stronger architecture across the board from Maxwell onwards.

gruffi · Oct 10, 2020

Vayra86 said:
Besides, what you're saying is true but AMD doesn't have the Nvidia head start of better delta compression (980, 1080) at any point in time. It's not optimization, it's feature set that made that possible.

And you know the feature set of RDNA 2? Do you? No, you don't. That's why it's pointless to say it's impossible.

Vayra86 said:
In none of the three examples is bandwidth the true factor making the difference, really. Nvidia just had a much stronger architecture across the board from Maxwell onwards.

No. It wasn't "a much stronger architecture". In fact AMD had a much stronger architecture until Pascal. At least if we talk about raw performance. Nvidia's architecture was just more optimized for gaming. But it's funny. You say in the examples bandwidth doesn't make a difference. But you already know it will make a difference with RDNA 2. Makes sense. :rolleyes:

Valantar · Oct 10, 2020

gruffi said:
And you know the feature set of RDNA 2? Do you? No, you don't. That's why it's pointless to say it's impossible.

No. It wasn't "a much stronger architecture". In fact AMD had a much stronger architecture until Pascal. At least if we talk about raw performance. Nvidia's architecture was just more optimized for gaming. But it's funny. You say in the examples bandwidth doesn't make a difference. But you already know it will make a difference with RDNA 2. Makes sense.

Sorry, but you're quite off here. Yes, GCN (up to and including Vega) was very strong for pure compute workloads, but it was not very good at translating that into gaming performance. If your main use for a GPU is compute, then that's great, though RDNA will obviously disappoint you as the focus there is on improving gaming performance rather than compute (that's what CDNA is for). You can't both argue that Nvidia didn't have a huge advantage because compute is as important as gaming, and then argue that RDNA improving its gaming performance means it's now better than Nvidia. That's what we call a double standard. Besides, efficiency is also a (major!) factor in the quality of an architecture, and with Maxwell Nvidia took a major step in front of AMD there - and has held that position since. RDNA 1 in combination with TSMC 7nm brought AMD back to rough parity, so it'll be very interesting to see how improved 7nm RDNA 2 vs. 8nm Ampere plays out.

As to your first point, AMD might very well have improved their delta color compression so much that it beats Nvidia's, but if so that wouldn't negate the fact that Nvidia has had the advantage there for four+ generations. That would of course make overtaking them all the more impressive, but your argument has fundamental logical flaws.

Vayra86 · Oct 10, 2020

gruffi said:
And you know the feature set of RDNA 2? Do you? No, you don't. That's why it's pointless to say it's impossible.

No. It wasn't "a much stronger architecture". In fact AMD had a much stronger architecture until Pascal. At least if we talk about raw performance. Nvidia's architecture was just more optimized for gaming. But it's funny. You say in the examples bandwidth doesn't make a difference. But you already know it will make a difference with RDNA 2. Makes sense.

Nah, reading comprehension buddy, try again. I'm being very specific in my response to your examples; no need to pull it out of context. We speak of gaming performance here.

Raw compute is pretty pointless when a competitor dominates the market with optimized CUDA workloads anyway. So even outside of gaming, wtf are you even on about. Spec sheets don't get work done last I checked.

gruffi · Oct 10, 2020

Valantar said:
Sorry, but you're quite off here. Yes, GCN (up to and including Vega) was very strong for pure compute workloads, but it was not very good at translating that into gaming performance.

Isn't that exactly what I said?

Valantar said:
If your main use for a GPU is compute, then that's great, though RDNA will obviously disappoint you as the focus there is on improving gaming performance rather than compute (that's what CDNA is for). You can't both argue that Nvidia didn't have a huge advantage because compute is as important as gaming, and then argue that RDNA improving its gaming performance means it's now better than Nvidia.

I never argued anything like that.

Vayra86 said:
I'm being very specific in my response to your examples; no need to pull it out of context. We speak of gaming performance here.

I was speaking about gaming performance too. You just weren't very specific. Being "strong" can mean anything. I was specific and clarified what was strong and what was not.

Okay, then how about some facts. You said RDNA 2 won't fight the 3080 with that bandwidth. Give us some facts about RDNA 2 why it won't happen. No opinions, no referring to old GCN stuff, just hard facts about RDNA.

Valantar · Oct 10, 2020

gruffi said:
Isn't that exactly what I said?

No. What you said was

gruffi said:
In fact AMD had a much stronger architecture until Pascal. At least if we talk about raw performance. Nvidia's architecture was just more optimized for gaming.

That is, quite literally, turning what I said on its head. This is a forum for computer enthusiasts. While there are of course quite a few enthusiasts who have a lot of use for pure compute in what they use their computers for, the vast majority need their GPUs for gaming. Consumer/enthusiast GPUs are also explicitly designed around gaming, not compute. As such, saying that AMD had the better architecture because they delivered more FP32 even if they were lagging in gaming performance is turning things very much on their head. It might be that this doesn't apply to you, but from what I've seen you haven't stated as much, so I have to base my interpretations on what is generally true on forums like these.

Besides that, you're still ignoring efficiency. Let's start back in 2013:
Radeon 290X: $550, 5.6Tflops, 438mm² (12.8Gflops/mm²), 290W (19.3Gflops/W), 100% gaming performance
Geforce GTX 780 Ti: $699, 5.3Tflops, 561mm² (9.4Gflops/mm²), 250W (21.2Gflops/W), 104% gaming performance

Radeon Fury X: $649, 8.6Tflops, 596mm² (14.4Gflops/mm²), 275W (31.3Gflops/W), 131% gaming performance
Geforce GTX 980 Ti: $649, 6.1Tflops, 601mm² (10.1Gflops/mm²), 250W (24.4Gflops/W), 133% gaming performance

Radeon Vega 64: $499, 12.7Tflops, 495mm² (25.7Gflops/mm²), 295W (43.1Gflops/W), 173% gaming performance
Geforce GTX 1080 Ti: $699, 11.3Tflops, 471mm² (24Gflops/mm²), 250W (45.2Gflops/W), 223% gaming performance
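
The per-area and per-watt numbers in parentheses are just the FP32 figure divided by die size and board power. A minimal Python sketch for anyone who wants to reproduce them; the names are only illustrative, and the inputs are the figures quoted in the list above, not independently measured values.

```python
# (name, FP32 Tflops, die area in mm², board power in W), as quoted above
CARDS = [
    ("Radeon 290X", 5.6, 438, 290),
    ("Geforce GTX 780 Ti", 5.3, 561, 250),
    ("Radeon Fury X", 8.6, 596, 275),
    ("Geforce GTX 980 Ti", 6.1, 601, 250),
    ("Radeon Vega 64", 12.7, 495, 295),
    ("Geforce GTX 1080 Ti", 11.3, 471, 250),
]


def gflops_per_mm2(tflops: float, area_mm2: float) -> float:
    """Compute density: Gflops per square millimetre of die."""
    return tflops * 1000 / area_mm2


def gflops_per_watt(tflops: float, watts: float) -> float:
    """Compute efficiency: Gflops per watt of board power."""
    return tflops * 1000 / watts


for name, tflops, area, watts in CARDS:
    density = gflops_per_mm2(tflops, area)
    efficiency = gflops_per_watt(tflops, watts)
    print(f"{name}: {density:.1f} Gflops/mm², {efficiency:.1f} Gflops/W")
```

Note that neither metric says anything about gaming performance, which is the whole point of listing it separately.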

So, what was AMD good at? Delivering FP32 compute for cheap (compared to Nvidia). For some generations they kept pace in terms of gaming performance too, but always at the cost of higher power, and in the Fury X (still using mine!) and onwards that's partially thanks to exotic and expensive memory that's dramatically more efficient than GDDR. They also delivered quite good compute per die area. In gaming they kept up at best, lagged behind dramatically at worst (though then also at a lower price).

What can we extrapolate from this? That GCN was a good architecture for compute. It was very clearly a worse architecture than what Nvidia had to offer overall, as compute is not the major relevant use case for any consumer GPU. So, in any perspective other than that of someone running a render farm, AMD's architecture was clearly worse than Nvidia's.

This is very clearly demonstrated by RDNA: The 5700 XT matches the Radeon VII in gaming performance despite a significant drop in compute performance. It also dramatically increases gaming performance/W, though compute/W is down from the VII.

gruffi said:
I never argued anything like that.

But you did. You said AMD had a "much stronger architecture" until Pascal. Which means that you're arguing that compute is more important than gaming performance, as that is the only metric in which they were better. Yet you're in a discussion about whether RDNA 2 can match or beat Ampere in gaming performance based on rumored memory bandwidths, arguing against someone skeptical of this. While we know that RDNA is a worse architecture for compute than GCN, watt for watt on the same node, and you're arguing for RDNA 2 likely being very good - which implies that more gaming performance = better. So whether you meant there to be or not, there is a distinct reference point shift between those two parts of your arguments.

Nkd · Oct 11, 2020

ShurikN said:
Either Big Navi is not high end (hence 256-bit bus), and was never meant to compete with GA102,
OR
it is high end and has some sort of hidden mumbo-jumbo, in this case Infinity Cache (aka very large cache) to offset the bandwidth.

Do you ppl really think AMD (its engineers) went and made a 3080 competitor and then one day sat at a table and went "You know what this bad boy needs, a crippled memory bus. Let us go fuck this chip up so much that no one will ever buy it". And then everyone clapped and popped champagne bottles and ate caviar, confetti was flying, strippers came and everything.

Well, the rumor is they did test Big Navi both with a 384-bit bus and with cache plus a 256-bit bus. Looks like the difference wasn't enough to justify a 384-bit bus, which would add to the process and make it more expensive. So what they have must be sufficient.

gruffi · Oct 11, 2020

Valantar said:
No. What you said was

Which in fact is absolutely the same statement. I really don't know what you are reading here.

Valantar said:
As such, saying that AMD had the better architecture because they delivered more FP32 ...

Again, I never said anything like that. Please read what I said and don't make things up. Where did I write something about "better architecture"? The topic was "stronger architecture". And in terms of raw performance AMD had a stronger architecture until Pascal. That's what I said. You just repeated it. I never said AMD's architecture was stronger (or better) at gaming.

Valantar said:
Besides that, you're still ignoring efficiency.

No, I'm not ignoring it. It just wasn't the topic.

Valantar said:
You said AMD had a "much stronger architecture" until Pascal.

Yes. But you should read my whole statement and not just one sentence. I said "if we talk about raw performance". And that's true. I never said AMD had a "much stronger architecture for gaming". That's just what you read. But I didn't say it. So please accept your mistake and don't make up even more.

Let me sum it up for you again. That's what I said:

AMD had a much stronger architecture ... if we talk about raw performance ... Nvidia's architecture was just more optimized for gaming

And that's what you said

GCN ... was very strong for pure compute workloads, but it was not very good at translating that into gaming performance.

Which is the very same statement. Just expressed in other words.

Valantar · Oct 11, 2020

gruffi said:
Which in fact is absolutely the same statement. I really don't know what you are reading here.

Again, I never said anything like that. Please read what I said and don't make things up. Where did I write something about "better architecture"? The topic was "stronger architecture". And in terms of raw performance AMD had a stronger architecture until Pascal. That's what I said. You just repeated it. I never said AMD's architecture was stronger (or better) at gaming.

No, I'm not ignoring it. It just wasn't the topic.

Yes. But you should read my whole statement and not just one sentence. I said "if we talk about raw performance". And that's true. I never said AMD had a "much stronger architecture for gaming". That's just what you read. But I didn't say it. So please accept your mistake and don't make up even more.

Let me sum it up for you again. That's what I said:

And that's what you said

Which is the very same statement. Just expressed in other words.

This is getting repetitive, but again: no.

You are presenting an argument from a point of view where the "strength" of a GPU architecture is apparently only a product of its FP32 compute prowess. I am presenting a counterargument saying that this is a meaningless measure for home/enthusiast uses, both due to your argument ignoring efficiency (which is always relevant when discussing an architecture, as better efficiency = more performance in a given power envelope) and due to FP32 compute being of relatively low importance to this user group. You are also for some reason equating FP32 compute to "raw performance", which is a stretch given the many tasks a GPU can perform. FP32 is of course one of the more important ones, but it alone is a poor measure of the performance of a GPU, particularly outside of enterprise use cases.

Put more simply: you are effectively saying "GCN was a good architecture, but bad at gaming" while I am saying "GCN was a mediocre architecture, but good at compute." The point of reference and meaning put into what amounts to a good architecture in those two statements are dramatically different. As for saying "strong" rather than "good" or whatever else: these are generic terms without specific meanings in this context. Trying to add a post-hoc definition doesn't make the argument any more convincing.

bug · Oct 11, 2020

Zach_01 said:
So now you know what kind of architecture they cooked up this round... and that is your argument. Based on what? ..on the bad choices that AMD made in the past?
Well, tell us more and give us the spoils! This is indeed entertaining.

Based on the fact that no architecture is built for a single generation. And it's in the name RDNA2.

Vayra86 · Oct 11, 2020

gruffi said:
Isn't that exactly what I said?

I never argued anything like that.

I was speaking about gaming performance too. You just weren't very specific. Being "strong" can mean anything. I was specific and clarified what was strong and what was not.

Okay, then how about some facts. You said RDNA 2 won't fight the 3080 with that bandwidth. Give us some facts about RDNA 2 why it won't happen. No opinions, no referring to old GCN stuff, just hard facts about RDNA.

Read back here or in other topics; I've been over this at length already.
