
Why AMD will perform better than NVIDIA in DirectX 12


nem

OK people, what do you think of this explanation of why AMD should perform better than NVIDIA under DirectX 12 thanks to its superior support for asynchronous shaders? Note that this is not my own argument, but it seems well argued.

First, the source: http://www.overclock.net/t/1569897/...singularity-dx12-benchmarks/400#post_24321843

Well, I figured I'd create an account in order to explain what you're all seeing in the Ashes of the Singularity DX12 benchmarks. I won't divulge too much of my background, but suffice to say that I'm an old veteran who used to go by the handle ElMoIsEviL.


First off, nVIDIA is posting its true DirectX 12 performance figures in these tests. Ashes of the Singularity is all about parallelism, and although Maxwell 2 does better in that area than previous nVIDIA architectures, it is still inferior there compared to AMD's GCN 1.1/1.2 architectures. Here's why...


Maxwell's Asynchronous Thread Warp can queue up 31 compute tasks and 1 graphics task. Now compare this with AMD's GCN 1.1/1.2, which is composed of 8 Asynchronous Compute Engines (ACEs), each able to queue 8 compute tasks, for a total of 64, coupled with 1 graphics task handled by the Graphics Command Processor. See below:

[Image: diagram of asynchronous compute queues, Maxwell vs. GCN ACEs]
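To make the queue model concrete, here is a minimal D3D12 sketch (my own illustration assuming the standard D3D12 API, not code from the game) of how an engine exposes work to those hardware queues: one direct queue for graphics plus a separate compute-only queue.

```cpp
// Minimal sketch: one graphics (direct) queue plus one compute queue.
// How many hardware queues back the compute type (31 on Maxwell 2,
// 64 across 8 ACEs on GCN 1.1/1.2, per the diagram above) is
// invisible at the API level. Error handling omitted for brevity.
#include <d3d12.h>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void CreateQueues(ID3D12Device* device,
                  ComPtr<ID3D12CommandQueue>& gfxQueue,
                  ComPtr<ID3D12CommandQueue>& computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};

    // DIRECT queue: accepts graphics, compute and copy work
    // (the "1 graphics task" in the diagram).
    desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&gfxQueue));

    // COMPUTE queue: compute/copy only; work submitted here may run
    // concurrently with work on the direct queue.
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(&computeQueue));
}
```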



Each ACE can also apply certain post-processing effects without incurring much of a performance penalty. This capability is used heavily for lighting in Ashes of the Singularity. Think of all the simultaneous light sources firing off as each unit in the game shoots, or the various explosions which ensue, as examples.

[Image: AMD LiquidVR slide on asynchronous shaders performance]
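As a follow-up to the sketch above, here is a hedged illustration (again my own, not the game's code) of the overlap pattern that slide describes: post-processing kicked off on the compute queue while the graphics queue keeps working, with a GPU-side fence so later draws wait only where they consume the result. The command lists are assumed to be recorded already.

```cpp
#include <d3d12.h>

// Submit post-processing (e.g. lighting) on the compute queue so it
// can overlap earlier graphics work; the graphics queue only stalls
// at the fence, right before it consumes the compute output.
void OverlapPostProcessing(ID3D12CommandQueue* gfxQueue,
                           ID3D12CommandQueue* computeQueue,
                           ID3D12GraphicsCommandList* computeWork,
                           ID3D12GraphicsCommandList* dependentDraws,
                           ID3D12Fence* fence, UINT64 fenceValue)
{
    ID3D12CommandList* c[] = { computeWork };
    computeQueue->ExecuteCommandLists(1, c);
    computeQueue->Signal(fence, fenceValue);   // mark completion

    gfxQueue->Wait(fence, fenceValue);         // GPU-side wait, not CPU
    ID3D12CommandList* g[] = { dependentDraws };
    gfxQueue->ExecuteCommandLists(1, g);
}
```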



This means that AMD's GCN 1.1/1.2 is better adapted to handling the increase in draw calls now being made by multi-core CPUs under DirectX 12.


Therefore, in game titles which rely heavily on parallelism, likely most DirectX 12 titles, AMD's GCN 1.1/1.2 should do very well, provided they do not hit a geometry or raster operator (ROP) bottleneck before nVIDIA hits its draw call/parallelism bottleneck. The picture below highlights the draw call/parallelism superiority of GCN 1.1/1.2 over Maxwell 2:

[Image: draw call throughput graph, GCN 1.1/1.2 vs. Maxwell 2]



More efficient queueing of workloads, through better thread parallelism, also enables the R9 290X to come closer to its theoretical compute figures, which happen to be just shy of those of the GTX 980 Ti (5.8 TFLOPS vs. 6.1 TFLOPS respectively), as seen below:

[Image: compute throughput graph, R9 290X vs. GTX 980 Ti]
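For reference, those theoretical figures fall out of the standard formula: peak FP32 FLOPS = shader ALU count x 2 operations per clock (one fused multiply-add) x clock speed. Both cards happen to have 2816 ALUs, so the quoted gap is purely clock speed; the exact clocks below are my assumption, chosen to reproduce the figures quoted above:

```cpp
// Back-of-envelope check of the quoted theoretical compute figures.
// Peak FP32 FLOPS = ALU count * 2 (FMA = 2 ops/clock) * clock (Hz).
#include <cstdio>

int main() {
    double r9_290x  = 2816 * 2 * 1.030e9;  // ~1030 MHz -> ~5.8 TFLOPS
    double gtx980ti = 2816 * 2 * 1.075e9;  // ~1075 MHz boost -> ~6.1 TFLOPS
    std::printf("R9 290X   : %.1f TFLOPS\n", r9_290x  / 1e12);
    std::printf("GTX 980 Ti: %.1f TFLOPS\n", gtx980ti / 1e12);
    return 0;
}
```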



What you will also notice is that Ashes of the Singularity is quite hard on the raster operators (ROPs), highlighting a rather peculiar behavior: an R9 290X, with its 64 ROPs, ends up performing nearly the same as a Fury X, also with 64 ROPs. A great way of picturing this in action is the graph below (courtesy of Beyond3D):

[Image: ROP-bound performance graph, courtesy of Beyond3D]
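That convergence is what you would expect from the fill rate math: peak pixel fill rate is ROP count times clock speed, and with both cards at 64 ROPs the Fury X's only edge is its slightly higher clock. A quick check using the commonly quoted reference clocks:

```cpp
// Peak pixel fill rate = ROPs * clock (one pixel per ROP per clock).
#include <cstdio>

int main() {
    double r9_290x = 64 * 1.000e9;  // 64 ROPs @ ~1000 MHz -> 64.0 Gpix/s
    double fury_x  = 64 * 1.050e9;  // 64 ROPs @ 1050 MHz  -> 67.2 Gpix/s
    std::printf("R9 290X: %.1f Gpixels/s\n", r9_290x / 1e9);
    std::printf("Fury X : %.1f Gpixels/s\n", fury_x  / 1e9);
    return 0;
}
```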



As for the folks claiming a conspiracy: not in the least. The reason AMD's DX11 performance is so poor under Ashes of the Singularity is that AMD did literally zero optimization for that path. AMD is clearly looking to sell asynchronous shading as a feature to developers, because its architecture is well suited to the task. It doesn't hurt that this also costs less in driver research and development: asynchronous shading lets GCN approach full efficiency without requiring any driver work at all.


nVIDIA, on the other hand, does much better at serial scheduling of workloads (consider that anything prior to Maxwell 2 is limited to serial rather than parallel scheduling). DirectX 11 is suited to serial scheduling, so nVIDIA naturally has an advantage under DirectX 11. In this graph, provided by AnandTech, the figures for nVIDIA's architectures (from Kepler to Maxwell 2) are correct, but the figures for GCN are not (they did not multiply the number of Asynchronous Compute Engines by 8):

[Image: AnandTech table of asynchronous compute queue counts]



People wonder why nVIDIA is doing a bit better in DX11 than in DX12. That's because nVIDIA optimized the DX11 path in its drivers for Ashes of the Singularity. Under DX12 there are no tangible driver optimizations to make, because the game engine speaks almost directly to the graphics hardware, so none were made. nVIDIA is at the mercy of the programmers' talents, and of its own Maxwell architecture's thread parallelism performance under DX12.

The developers programmed Ashes of the Singularity for thread parallelism in order to better draw all those objects on the screen. Therefore, what we're seeing in the nVIDIA numbers is nVIDIA's draw call bottleneck showing up under DX12. nVIDIA works around this under DX11 with its own optimizations, by prioritizing workloads and replacing shaders. Yes, the nVIDIA driver contains a compiler which recompiles and replaces shaders that are not fine-tuned for its architecture, on a per-game basis. nVIDIA's driver is also multi-threaded, making use of idling CPU cores to recompile and replace shaders. The work nVIDIA does in software under DX11 is the work AMD does in hardware under DX12, with its Asynchronous Compute Engines.
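To illustrate what that thin-driver model looks like from the engine's side, here is a hedged D3D12 sketch (illustrative structure only, not Oxide's code): under DX12 it is the game, not the driver, that spreads command recording across CPU cores, giving each thread its own allocator and command list and submitting everything in one call.

```cpp
// Sketch: multi-threaded command list recording, the DX12 pattern that
// replaces the driver-side threading nVIDIA does under DX11.
#include <d3d12.h>
#include <thread>
#include <vector>
#include <wrl/client.h>
using Microsoft::WRL::ComPtr;

void RecordAndSubmitFrame(ID3D12Device* device, ID3D12CommandQueue* queue,
                          unsigned numThreads)
{
    std::vector<ComPtr<ID3D12CommandAllocator>>    allocs(numThreads);
    std::vector<ComPtr<ID3D12GraphicsCommandList>> lists(numThreads);
    std::vector<std::thread> workers;

    for (unsigned i = 0; i < numThreads; ++i) {
        device->CreateCommandAllocator(D3D12_COMMAND_LIST_TYPE_DIRECT,
                                       IID_PPV_ARGS(&allocs[i]));
        device->CreateCommandList(0, D3D12_COMMAND_LIST_TYPE_DIRECT,
                                  allocs[i].Get(), nullptr,
                                  IID_PPV_ARGS(&lists[i]));
        workers.emplace_back([&, i] {
            // Each thread records its slice of the frame's draw calls
            // here (SetPipelineState, DrawIndexedInstanced, ...).
            lists[i]->Close();
        });
    }
    for (auto& w : workers) w.join();

    // One submission of everything that was recorded in parallel.
    std::vector<ID3D12CommandList*> raw;
    for (auto& l : lists) raw.push_back(l.Get());
    queue->ExecuteCommandLists(static_cast<UINT>(raw.size()), raw.data());
}
```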


But what about AMD's poor DX11 performance? Simple. AMD's GCN 1.1/1.2 architecture is geared towards parallelism, and it needs the CPU to feed the graphics card work. This creates a CPU bottleneck on AMD hardware under DX11 at low resolutions (say 1080p, and even 1600p for the Fury X), because DX11 limits the graphics pipeline to 1-2 CPU cores (which also need to take care of AI, physics, etc.). Replacing or recompiling shaders is not a solution for GCN 1.1/1.2, because AMD's Asynchronous Compute Engines are built to break complex workloads down into smaller, easier-to-process ones. The only way around this, if you want to maximize the use of all available compute resources under GCN 1.1/1.2, is to feed the GPU in parallel... in come Mantle, Vulkan and DirectX 12.


People wondering why the Fury X did so poorly at 1080p in DirectX 11 titles? That's your answer.



A video which talks about Ashes of the Singularity in depth: [embedded video]

PS. Don't count on better DirectX 12 drivers from nVIDIA. DirectX 12 is closer to the metal, and it's all on the developer to make efficient use of both nVIDIA's and AMD's architectures.

 
interesting read... thx!
 
How about GCN 1.0? :D
 
The title should perhaps be "why DX12 favours GCN 1.1 more than DX11 did".

It would be 'misguided' to assume DX12 will hand AMD the lead. Given that Maxwell 2 is supposed to perform better under DX12 than under DX11 (just not by as much as AMD), the game code can swing it, bare metal or not. The games in development for DX12 still require coding (as the reviews mention). Now, a game that pushes the limit on ROPs will hamstring AMD, pushing 290X cards to the same level as Fury cards. Similarly, if the ROPs aren't flooded, AMD will get the headroom over Nvidia.
I'm not a fan of 'GameWorks', but it should be considered that game code can easily be written to hamper a competitor's performance (intentionally or not).
Any game Nvidia gets involved with for DX12 simply needs to focus on Maxwell's hardware DX12 strengths.

As has been said before, this single benchmark is not a good indicator of what will be.
And again, DX12 needs some time to come to fruition in a gaming sense. On an even playing field AMD mostly has the edge, but let's not forget: it's not an even playing field.
I know I'll get called a fanboy by some folks, but I'm just shining a light into the dark here. AoS isn't brand-agnostic. Its developer was the first to use Mantle, so it's naive to think this is an unbiased benchmarking opportunity. Again, if a game leans hard on the ROPs, AMD doesn't do so well when the load is too high.
When UE4, Unity and other DX12-relevant software appears, we can revisit the argument, but until then one bench isn't enough.
AMD has excellent DX12 hardware, but the game code may not always favour it.

FTR, I have no problem buying AMD cards. I waited for the Fury X to release but decided against it. Maybe it's too far ahead of its time. I'll buy one next year, maybe!
 
Powerpoint slides don't mean jack shit. Rumours and hype, on the other hand...
 
Powerpoint slides don't mean jack shit. Rumours and hype, on the other hand...

That's not the point of the OP. Numerous independent sites show AoS doing very well on AMD GCN 1.1 hardware; it's clear to see.
The issue to me is that a game will use various DX12 elements, and the engine merely has to tip the balance one way or the other (ROPs versus pure draw calls, for example).
These slides are genuine, not rumour, but the expectation is perhaps being hyped too much, with (I believe) AoS being the absolute best-case scenario for AMD (not because of its Mantle pedigree, but because it was created by Stardock, who did the Mantle demo for AMD).
 
AMD fans really like to dream the pipe dream. How's DX12_1 on AMD cards coming along?

How many DX12_1 games are there?

None? Oh, OK, there's at least a tech demo, right?

No? Ouch.
 
How about some moron (not aimed at you, RTB!) doesn't post another reply with no substance? It would be nice if a constructive thread developed, with views on the upcoming limitations or possibilities of the hardware.
I'd like to know, for example, which upcoming titles will be using DX12, which company 'helps out the developers', and what the release dates are.
BF SW, Fallout 4, The Division, etc. are all AAA DX11 games. What's the first major title (I know about Fable) that will use DX12?
 
(quoting the OP's post above in full)
http://www.3dmark.com/3dm/8070955?
I got 17,501,040 draw calls with a Titan X.
 
http://www.3dmark.com/3dm/8070955?
I got 17,501,040 draw calls with a Titan X.
http://www.3dmark.com/aot/50003
Almost 18 million on a GTX 980, using factory GPU clocks and +300 MHz on the memory.
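For context, the 3DMark API Overhead feature test behind these scores essentially measures how many trivial draw calls per second the CPU side can submit before the frame rate collapses. The basic shape is roughly this (a sketch of my own, not the test's actual code):

```cpp
#include <d3d12.h>

// Record a large number of deliberately tiny draws so the bottleneck
// is CPU-side submission overhead, not GPU shading work.
void RecordManyDraws(ID3D12GraphicsCommandList* list, unsigned numDraws)
{
    for (unsigned i = 0; i < numDraws; ++i) {
        list->DrawInstanced(/*VertexCountPerInstance*/ 3,
                            /*InstanceCount*/ 1,
                            /*StartVertexLocation*/ 0,
                            /*StartInstanceLocation*/ 0);
    }
}
```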

That graph he posted has been debunked as incorrect.
I will say this once: just because that one game is better on AMD right now doesn't mean it will stay that way. It is one game, in an alpha build. It's also a game that has had Mantle since the start, so AMD had a bit of a head start with the higher draw calls. So it's very possible that performance on Nvidia cards could improve over the coming months.
 
I wonder how much NVIDIA has actually invested in DX12 on the software end (drivers), considering there aren't any DX12 games yet and AMD has been bragging about a single DX12 game that has basically been developed with AMD on board since day one...

I mean, does anyone remember the massive DX11 jump NVIDIA made some time ago? It was quite significant, and it made them the leader in DX11. It's not so much that I'm defending NVIDIA because I own one now; I just can't judge something based on one game and synthetic tests. We know how that went with the GeForce FX: great on paper and in a lot of synthetic tests, but rubbish in actual games... Or with AMD, which looked great but then had huge problems with tessellation (which also makes me wonder why they used factor 32 instead of 64)...
 
You can say that about this entire subject, because the DX12 games aren't out yet.

Ashes of the Singularity, and a fair number of tech demos.
 
I don't see what the big deal is. Most of us are not going to want to use our current cards in a year anyway, regardless of whether it's NVIDIA or AMD. So in a year, when there are a lot more DX12 games out, if AMD is in fact faster, we will all just buy AMD at that time. I don't understand this brand loyalty stuff unless you are employed by one of the companies. NVIDIA is better NOW.
 
Nvidia did indeed optimize for DX11, both software- and hardware-wise. But is that a bad thing? This way Nvidia was able to sell its units at a much higher margin than AMD while providing better perf/watt at the same time, thanks to that optimization. And by the time DX12 sees wide adoption, Pascal will be out. So things have worked out rather nicely for Nvidia, I'd say.

But it can only be healthy for us gamers that AMD is able to regain lost ground thanks to DX12.
 
Oh yay, more charts, slides and graphs predicting Nvidia's doom.

Even if all this were true, it wouldn't mean much to me. I game on a 4K TV where HDMI 2.0 is a must, which AMD for some reason decided to skip. The fabled DP-to-HDMI 4K 60 Hz adapters haven't materialized yet.

If this turns out to be true, my next cards will be Arctic Islands GPUs. But I've seen AMD over-promise and under-deliver enough times to be skeptical of their PowerPoints.
 
Oh yay, more charts, slides and graphs predicting Nvidia's doom.

More like calling them out on their typical crap.

I wouldn't really call it "predicting their doom." No one said this is going to sink the company; no one here is that stupid (hopefully).
 
Once we have games people actually want to play that are 100% DirectX 12, we will be onto completely new GPU generations from both AMD and Nvidia. All this garbage over a single game that has effectively blown up every geek forum into AMD vs. NVIDIA jargon will be moot.

More like calling them out on their typical crap.

I wouldn't really call it "predicting their doom." No one said this is going to sink the company; no one here is that stupid (hopefully).

Well, there is one guy. I won't name names, but I'll give you a hint: starts with Sony.
 
Nvidia's current hardware focused on DX11.
AMD planned ahead (even with the console hardware) and aimed at what they wanted to become Mantle/DX12.
With *CURRENT HARDWARE*, AMD gets an advantage in DX12.

The technical reasons vary, with async shaders being a large part of it, but this weird black-and-white view of a complex issue is just bizarre... does AMD having a temporary lead make Nvidia cards' e-peens shrink or something?
 
Nvidia's current hardware focused on DX11.
AMD planned ahead (even with the console hardware) and aimed at what they wanted to become Mantle/DX12.
With *CURRENT HARDWARE*, AMD gets an advantage in DX12.

The technical reasons vary, with async shaders being a large part of it, but this weird black-and-white view of a complex issue is just bizarre... does AMD having a temporary lead make Nvidia cards' e-peens shrink or something?

That planning ahead (Hawaii) meant the cards that were firmly bound to DX11 failed to overturn Nvidia's GPU lead. That planning ahead has brought the company to its knees. I have no doubt at all that AoS is far superior on AMD hardware, but it almost seems like it has become a massive PR push for things that have no current relevance.
It does seem Nvidia has a significant potential disadvantage, and perhaps they are bluffing their way through on this generation, but I don't for a second think AMD has played the game well over the past two years.
Even now there is no reason they can't allow AIBs to sell air-cooled Fury X cards, which would have more appeal than an AIO water cooler. AMD are still being arses. They may have planned ahead, but very little they have done is bringing financial success.
New management, a funding injection and broader R&D would benefit the company and the consumer far more.
 
That planning ahead (Hawaii) meant the cards that were firmly bound to DX11 failed to overturn Nvidia's GPU lead. That planning ahead has brought the company to its knees. I have no doubt at all that AoS is far superior on AMD hardware, but it almost seems like it has become a massive PR push for things that have no current relevance.
It does seem Nvidia has a significant potential disadvantage, and perhaps they are bluffing their way through on this generation, but I don't for a second think AMD has played the game well over the past two years.
Even now there is no reason they can't allow AIBs to sell air-cooled Fury X cards, which would have more appeal than an AIO water cooler. AMD are still being arses. They may have planned ahead, but very little they have done is bringing financial success.
New management, a funding injection and broader R&D would benefit the company and the consumer far more.

Time will tell on that one, because they can sell old stock of older, tried-and-tested products for DX12 gaming, whereas Nvidia needs that R&D budget to catch up. AMD's profits could jump ahead because of this.
 
Well, there is one guy. I won't name names, but I'll give you a hint: starts with Sony.

Ah, come on. I'm not that mean.

...not that I disagree, but...

...OK, maybe I am.
 
Time will tell on that one, because they can sell old stock of older, tried-and-tested products for DX12 gaming, whereas Nvidia needs that R&D budget to catch up. AMD's profits could jump ahead because of this.

In that position they'll have a harder time selling Fiji. If Hawaii performs as well as the 980 Ti in DX12, Fiji becomes a white elephant, bearing in mind that in some of the AoS benches Hawaii matches Fiji.

But given Nvidia's product development deviations, if DX12 does hurt them, expect an earlier version of Pascal on HBM1 if HBM2 looks too distant. Nvidia rarely manages to stick to any projected plan (and usually it's because of problems).
But hey, if my card falls behind in the next year, I'll happily trade it for a 390X, though then I'd be humped in DX11 games. It's true that time will tell, but frankly it's too early.
Nvidia zealots come out with conspiracy nonsense and AMD flag-wearers think the battle's won.
The fact is, AMD lost that battle a long time ago; the next one doesn't start until 2016 and the MASS adoption of DX12.
Nobody can afford to be smug.
 
...Even now there is no reason they can't allow AIBs to sell air-cooled Fury X cards...
Limited core and/or HBM supplies. The foundry struggling to catch up is probably the reason for the six-month delay on Fury and all those paper launches.
 