Monday, May 27th 2019

AMD Announces Radeon RX 5700 Based on Navi: RDNA, 7nm, PCIe Gen4, GDDR6

AMD at its 2019 Computex keynote today unveiled the Radeon RX 5000 family of graphics cards, which leverage its new Navi graphics architecture and 7 nm silicon fabrication process. Navi isn't just an incremental upgrade over Vega with a handful of new technologies, but the biggest overhaul to AMD's GPU SIMD design since Graphics Core Next, circa 2011. Called RDNA, or Radeon DNA, the new compute unit by AMD is a clean-slate SIMD design with a 1.25X IPC uplift over Vega, an overhauled on-chip cache hierarchy, and a more streamlined graphics pipeline.

In addition, the architecture is designed to increase performance-per-Watt by 50 percent over Vega. The first part to leverage Navi is the Radeon RX 5700. AMD ran a side-by-side demo of the RX 5700 versus the GeForce RTX 2070 in "Strange Brigade," where NVIDIA's $500 card was beaten. "Strange Brigade" is one game where AMD generally fares well, as it is heavily optimized for asynchronous compute. Navi also ticks two big technology check-boxes: PCI-Express gen 4.0 and GDDR6 memory. AMD has planned a July availability for the RX 5700, and did not disclose pricing.

202 Comments on AMD Announces Radeon RX 5700 Based on Navi: RDNA, 7nm, PCIe Gen4, GDDR6

#52
Aldain
xkm1948, post: 4054730, member: 50521"
no real time ray tracing tho
Truly..who cares??
#53
bug
Valantar, post: 4055006, member: 171585"
My numbers were from the 2070 review. The 1660 is an odd comparison for a card that's meant to compete with much more powerful cards.
I wasn't looking at a specific card, just at numbers put out by Nvidia vs AMD.
#54
Valantar
Vayra86, post: 4054982, member: 152404"
25% IPC and 50% perf/watt is probably in the best-case Strange Brigade scenario versus the worst-case Vega scenario.
That sentence makes no sense unless you're implying that they're comparing numbers from different benchmarks, which ... well, would be bonkers. Vega (up until now) is no worst-case scenario for efficiency for AMD - it's entirely on par with Polaris if not a tad better.

Vayra86, post: 4054982, member: 152404"
Also, the other twist here is the shader itself. Sure, it may get a lot faster, but if you get a lower count of them, all you really have is some reshuffling that leads to no performance gain. Turing is a good example of that. Perf per shader is up, but you get fewer shaders and the end result is that, for example, a TU106 with 2304 shaders ends up alongside a GP104 that rocks 2560 shaders. It gets better: if you then defend your perf/watt figure by saying 'perf/watt per shader', it's not all too hard after all.
But you're ignoring market segmentation and product pricing here. Fewer shaders with more performance/W/shader means cheaper dies and cheaper cards at lower power and equivalent performance, or higher performance at equivalent power. Overall Turing gives you a significant increase in shaders per product segment - they just cranked up the pricing to 11 to match, sadly.
#55
jabbadap
steen, post: 4054990, member: 149653"
Didn't you know? That's now called "async compute". ;)

TU concurrent int & fp is more flexible than just 32bit data types. Half floats & lower precision int ops can also be packed. Conceptually works well with VRS.
Well, kind of true: async compute is the capability of using the graphics queue and the compute queue at the same time. It really does not matter what precision we are talking about.
#56
londiste
Vayra86, post: 4054982, member: 152404"
25% IPC and 50% perf/watt is probably in the best-case Strange Brigade scenario versus the worst-case Vega scenario.
Perf/clock is 30 games at 4K Ultra settings with 4xAA (geomean?).
Perf/watt is Division 2 at 1440p Ultra settings.
https://www.amd.com/en/press-releases/2019-05-26-amd-announces-next-generation-leadership-products-computex-2019-keynote
AMD unveiled RDNA, the next foundational gaming architecture that was designed to drive the future of PC gaming, console, and cloud for years to come. With a new compute unit [10] design, RDNA is expected to deliver incredible performance, power and memory efficiency in a smaller package compared to the previous generation Graphics Core Next (GCN) architecture. It is projected to provide up to 1.25X higher performance-per-clock [11] and up to 1.5X higher performance-per-watt over GCN[12], enabling better gaming performance at lower power and reduced latency.
...
10. AMD APUs and GPUs based on the Graphics Core Next and RDNA architectures contain GPU Cores comprised of compute units, which are defined as 64 shaders (or stream processors) working together. GD-142
11. Testing done by AMD performance labs 5/23/19, showing a geomean of 1.25x per/clock across 30 different games @ 4K Ultra, 4xAA settings. Performance may vary based on use of latest drivers. RX-327
12. Testing done by AMD performance labs 5/23/19, using the Division 2 @ 25x14 Ultra settings. Performance may vary based on use of latest drivers. RX-325
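For readers unsure what the "geomean" in footnote 11 means, here is a minimal sketch of a geometric mean over per-game perf/clock ratios. The ratios below are made-up placeholders, not AMD's data, and the helper name `geomean` is only for illustration.

```python
import math

def geomean(ratios):
    """Geometric mean of positive ratios, computed via logs."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical per-game perf/clock ratios (new architecture / old architecture).
# These values are placeholders for illustration only.
sample_ratios = [1.10, 1.25, 1.42]
print(f"geomean uplift: {geomean(sample_ratios):.3f}x")
```

A geometric mean is the usual choice for averaging ratios because a single outlier game pulls it less than it would pull an arithmetic mean.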
#57
Valantar
bug, post: 4055008, member: 157434"
I wasn't looking at a specific card, just at numbers put out by Nvidia vs AMD.
If so, then my numbers are just as valid as yours. That's the danger of dealing with relative percentages - you can get big changes when the underlying numbers change just a little. I have no doubt AMD wants to present themselves in as positive a light as possible, but you seem to be going the diametrically opposite route.
#59
medi01
bug, post: 4055001, member: 157434"
+50% will not close that gap.
Simply undervolting VII without losing any perf beats a bunch of NVDA cards, including 2080:



there is a gap, but it's smaller than one thinks (especially when checking for it on sites favoring green games, like TPU does).

EarthDog, post: 4054997, member: 79836"
That time has gone...there was even a thread on it too a week or so back from wizard.
[MEDIA=giphy]9LZpYKd2CmV6ICxs17[/MEDIA]
#60
xkm1948
Aldain, post: 4055007, member: 170164"
Truly..who cares??
A lot, well except AMD fanboiz.

When you are late to the party, better bring more stuff. If the RX 5700 matches the RTX line in performance it had better be priced well, otherwise the lack of features will hurt them in the eyes of the general public.
#61
londiste
jabbadap, post: 4055027, member: 148195"
4K Ultra with 4xAA 14nm vega class? gpu, wonder what kind of FPS numbers are they getting...
"Previous generation GCN" might not even be Vega considering this will be successor to Polaris :)
#62
M2B
The Radeon VII uses HBM2, which is much more efficient than the GDDR6 memory on Nvidia cards (it uses around 30-35 W less power, if I'm not mistaken).
You're comparing graphics cards to graphics cards, not one GPU to another.
#63
londiste
medi01, post: 4055029, member: 158537"
Simply undervolting VII without losing any perf beats a bunch of NVDA cards, including 2080:
*YMMV
Computerbase got one of the good ones, it would seem. There have been far worse examples in both review sites and retail.
#64
bug
Valantar, post: 4055018, member: 171585"
If so, then my numbers are just as valid as yours. That's the danger of dealing with relative percentages - you can get big changes when the underlying numbers change just a little. I have no doubt AMD wants to present themselves in as positive a light as possible, but you seem to be going the diametrically opposite route.
I'm not sure how you read that graph, but this is how I do it:
1. Half of Nvidia's cards are in the 90-100% relative efficiency range.
2. AMD cards are generally at 50% or less relative efficiency. Vega 56 does better, at 60%. Radeon VII does even better at 68%, but that's already on 7nm.

If I take the best case scenario, Vega 56 and add 50% to that, it still puts AMD at 90% of the most efficient Nvidia card. And Nvidia is still on 12nm.
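The arithmetic in that last step can be checked with a few lines. This is only a sketch of the poster's reasoning, using the relative-efficiency figures quoted in the post (most efficient Nvidia card = 100%) and AMD's claimed +50% perf/watt; the function name is made up for illustration.

```python
def after_uplift(relative_efficiency, uplift):
    """Scale a relative-efficiency score by a fractional uplift (0.50 = +50%)."""
    return relative_efficiency * (1.0 + uplift)

# Figures as quoted in the post, relative to the most efficient Nvidia card.
vega56 = 0.60
typical_amd = 0.50

for name, score in [("Vega 56", vega56), ("typical AMD card", typical_amd)]:
    print(f"{name}: {score:.0%} -> {after_uplift(score, 0.50):.0%}")
```

Note that this applies the +50% as a flat multiplier to a chart position, which is the post's assumption rather than something AMD stated.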
#65
Darmok N Jalad
I wonder how much PCIe 4.0 is at play here, and is the RX 5700 the best they have, or is it the most efficient? It seems like there could be a 5800, but then why wouldn’t they lead off with that?
#66
InVasMani
londiste, post: 4054805, member: 169790"
Strange Brigade for comparison is meaningless. The game is known to lean anywhere from 10-20% towards AMD GPUs. We will have to wait for July to see how they really stack up.
Both 1.25x "IPC" as well as 1.5x power efficiency sound really good, that should bring Navi up to par with Turing, hopefully a little ahead considering it is on 7nm.
Well, if it's comparable to the RTX 2070 at a bit lower price point, that's not bad. The real question is how Navi/RDNA does in setups with Zen 2/X570 and CrossFire. If a more cut-down, cheaper version of the RX 5700 in CrossFire is a lot more cost effective than an RTX 2080, for example, that would shake things up. I'd like to hope that most of the negative aspects of CrossFire are eliminated with PCIe 4.0 for a two-card or even three-card setup, but who knows. I'd certainly hope so. Time will tell how these things pan out.

Darmok N Jalad, post: 4055051, member: 170588"
I wonder how much PCIe 4.0 is at play here, and is the RX 5700 the best they have, or is it the most efficient? It seems like there could be a 5800, but then why wouldn’t they lead off with that?
Perhaps it needs more binning; 7nm is still relatively new, so give it some time. As for why they wouldn't lead with it, perhaps the TDP gets a bit steep once you push the frequency higher than they've already set it.
#67
londiste
We do not know the price point. Leaks/rumors put it at $499.

Why would Crossfire suddenly be better than it has been so far? Bandwidth is not the main problem, and even then the increase from PCI-e 3.0 to 4.0 would not alleviate the need for communication that much. On the other side, bidirectional 100GB/s really did not make that noticeable a difference either.
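To put the PCI-e 3.0 to 4.0 step in numbers, here is a rough sketch of the theoretical per-direction bandwidth of an x16 link. It uses the published per-lane rates (8 GT/s for 3.0, 16 GT/s for 4.0) and the 128b/130b line encoding both generations share, and it ignores protocol overhead, so real-world throughput will be lower. The function name is made up for illustration.

```python
def pcie_per_direction_gb_s(transfer_rate_gt_s, lanes=16):
    """Theoretical per-direction bandwidth in GB/s for a PCIe 3.0/4.0 link.

    Accounts only for 128b/130b line encoding, not packet/protocol overhead.
    """
    encoding_efficiency = 128 / 130
    bits_per_byte = 8
    return transfer_rate_gt_s * encoding_efficiency * lanes / bits_per_byte

gen3 = pcie_per_direction_gb_s(8.0)    # PCIe 3.0 x16
gen4 = pcie_per_direction_gb_s(16.0)   # PCIe 4.0 x16
print(f"PCIe 3.0 x16: {gen3:.2f} GB/s per direction")
print(f"PCIe 4.0 x16: {gen4:.2f} GB/s per direction")
print(f"Share of a 100 GB/s bidirectional link: {2 * gen4 / 100:.0%}")
```

Even after doubling, an x16 PCIe 4.0 slot stays well below the 100GB/s bidirectional figure mentioned above, which is consistent with the point that bandwidth alone is unlikely to be the fix.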
#68
steen
medi01, post: 4054995, member: 158537"
@btarunr
May I ask something about the choice of games by TPU?
So I check "average gaming" diff between VII and 2080 on TPU and computerbase.
TPU states nearly 20% diff, computerbase states it's half of that.
Oh well, I think, different games, different results.

But then somebody does 35 games comparison:
It's a simple hierarchy. Top dozen or so tend to favor AMD, bottom dozen favor Nvidia. Pick the games to get the result you want. Test setup/procedure/settings/areas tested can make a difference. Of course, TU104 tends to be more effective than Vega20 in the chart below.
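A toy sketch of that "pick the games" effect: the same set of per-game results gives a different headline number depending on which subset is averaged. The game names and ratios are invented placeholders, not figures from TPU, Computerbase, or any real review.

```python
from statistics import geometric_mean  # available in Python 3.8+

# Hypothetical card A / card B performance ratios per game (placeholders).
results = {
    "game_a": 1.10, "game_b": 1.05, "game_c": 1.00,
    "game_d": 0.95, "game_e": 0.85, "game_f": 0.80,
}

ranked = sorted(results.values(), reverse=True)
top_half = ranked[: len(ranked) // 2]      # titles that favor card A
bottom_half = ranked[len(ranked) // 2 :]   # titles that favor card B

print(f"all games:   {geometric_mean(ranked):.3f}")
print(f"top half:    {geometric_mean(top_half):.3f}")
print(f"bottom half: {geometric_mean(bottom_half):.3f}")
```

With a real review suite the spread depends entirely on the titles and settings chosen, which is why two sites can report different average gaps for the same pair of cards.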




jabbadap, post: 4055013, member: 148195"
Well, kind of true: async compute is the capability of using the graphics queue and the compute queue at the same time.
You're being generous. :) Your definition is fine ofc. (Or multiple queues). Not really directed at you anyway. I kept seeing it in other threads where concurrent int/fp=async compute.
It really does not matter what precision we are talking about.
Exactly correct, nor is it defined by the ability to pack int/fp in the graphics pipeline.

There's another interesting "fine wine" effect for Vega. With Win10 (1803 IIRC) MS started promoting DX 11.0 games on GCN to DX12 feature level 11.1, which enabled the HW schedulers, so it should result in better perf than at release under Win7/8.
#69
medi01
steen, post: 4055064, member: 149653"
It's a simple hierarchy. Top dozen or so tend to favor AMD, bottom dozen favor Nvidia. Pick the games to get the result you want.
Thanks for linking a chart showing perf difference TWO TIMES SMALLER than TPU.
Somehow computerbase managed to pick a more balanced, smaller set of games that matches the 35-ish game test results.
#70
Vayra86
Valantar, post: 4055011, member: 171585"
But you're ignoring market segmentation and product pricing here. Fewer shaders with more performance/W/shader means cheaper dies and cheaper cards at lower power and equivalent performance, or higher performance at equivalent power. Overall Turing gives you a significant increase in shaders per product segment - they just cranked up the pricing to 11 to match, sadly.
Yes... and AMD is going to follow suit, so the net gain is zero for a consumer.

londiste, post: 4055017, member: 169790"
Perf/clock is 30 games at 4K Ultra settings with 4xAA (geomean?).
Perf/watt is Division 2 at 1440p Ultra settings.
https://www.amd.com/en/press-releases/2019-05-26-amd-announces-next-generation-leadership-products-computex-2019-keynote
That's nice but this is still AMD's little black box we're looking at, and based on history I'm using truckloads of salt with that. Especially when it comes to their GPUs. Still... there is hope, then, I guess :)

medi01, post: 4055069, member: 158537"
Thanks for linking a chart showing perf difference TWO TIMES SMALLER than TPU.
Somehow computerbase managed to pick a more balanced, smaller set of games that matches the 35-ish game test results.
The relative number of games optimized towards Nvidia cards is way higher, so any 'representative' benchmark suite (as in, representative wrt the engines and games on the marketplace) is always going to favor Nvidia. But that still provides the most informative review/result, because gamers don't buy games based on the brand of their GPU.

What it really means and what you're actually saying is: AMD should be optimizing a far wider range of games instead of focusing on the handful that they get to run well. That is why AMD lost the DX11 race as well - too much looking at the horizon and how new APIs would save their ass, while Nvidia fine tuned around DX11.
#71
InVasMani
Latency decreases since you can push twice as much bandwidth in each direction. AMD said it themselves: reduced latency, higher bandwidth, lower power. Literally all of those things would benefit Crossfire, and a cut-down version might improve further if they can raise overall efficiency while salvaging imperfect dies by disabling parts of them. I don't know why Crossfire wouldn't be improved a bit, but how much of an improvement is tough to say definitively. I would think micro stutter would be lessened quite a bit for a two-card setup, and for a three-card setup too, though less dramatically, while a quad-card setup on PCIe 4.0 would "in theory" be identical to a two-card one.
#72
medi01
xkm1948, post: 4055038, member: 50521"
...lacking of feature set will hurt them...
Such as G-Sync
Oh, hold on...

Nobody cares about yet another NVDA "only me" solution; it needs major support across the board to amount to anything but gimmicks developed in a handful of games just because NVDA paid for them.

At this point it is obvious whose chips are going to rock the next gen of major consoles (historically "it's not about graphics" Nintendo opting for NVDA's dead mobile platform chip is almost an insult in this context, with even multiplat games mostly avoiding porting to it).
#73
londiste
@InVasMani latency and bandwidth are not necessarily tied together.
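A simple way to see this is the first-order model where transfer time = fixed latency + size / bandwidth. The sketch below uses a made-up 1 µs link latency and approximate x16 bandwidth figures, purely for illustration; none of these numbers are measurements.

```python
def transfer_time_us(size_bytes, bandwidth_gb_s, latency_us):
    """First-order model: fixed latency plus serialization time, in microseconds."""
    return latency_us + size_bytes / (bandwidth_gb_s * 1e9) * 1e6

LATENCY_US = 1.0                 # hypothetical fixed link latency, not a measurement
GEN3_GB_S, GEN4_GB_S = 15.75, 31.5   # approximate per-direction x16 bandwidth

for label, size in [("64 B sync message", 64), ("64 MiB frame data", 64 * 1024 * 1024)]:
    t3 = transfer_time_us(size, GEN3_GB_S, LATENCY_US)
    t4 = transfer_time_us(size, GEN4_GB_S, LATENCY_US)
    print(f"{label}: {t3:.3f} us -> {t4:.3f} us ({t3 / t4:.2f}x faster)")
```

In this model doubling the bandwidth roughly halves the time for large transfers but barely changes small, latency-bound ones, which is the sense in which the two are not necessarily tied together.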

medi01, post: 4055078, member: 158537"
Nobody cares about yet another NVDA "only me" solution; it needs major support across the board to amount to anything but gimmicks developed in a handful of games just because NVDA paid for them.
You mean something standard like, say... DXR?
#74
medi01
londiste, post: 4055080, member: 169790"
You mean something standard like, say... DXR?
I remember that; my point still stands. (Remind me, why is it a proprietary vendor extension in Vulkan?)
NVDA was cooking something for years, found a time when competition was absent in the highest end, and spilled the beans.
Intel/AMD would need to agree that the DXR approach is at all viable, or the best from their POV.

Crytek has shown one doesn't even need dedicated (20-24% of Turing die) HW to do the RT gimmick:

#75
steen
medi01, post: 4055069, member: 158537"
Somehow computerbase managed to pick a more balanced, smaller set of games that matches the 35-ish game test results.
Did you read the qualifier:
steen, post: 4055064, member: 149653"
Top dozen or so tend to favor AMD, bottom dozen favor Nvidia. Pick the games to get the result you want. Test setup/procedure/settings/areas tested can make a difference.
Do you really want sites to pick "balanced" games only for testing? Think carefully.

InVasMani, post: 4055073, member: 163695"
Latency decreases since you can push twice as much bandwidth in each direction. AMD said it themselves: reduced latency, higher bandwidth, lower power. Literally all of those things would benefit Crossfire
With the advent of modern game code/shading/post processing techniques, classic SLI/Xfire has to be built into the engines from the ground up. It's just a coding/profiling nightmare. DX12 mGPU is theoretically doable but tends to have performance regression & very little scales well.