Monday, August 31st 2015

Lack of Async Compute on Maxwell Makes AMD GCN Better Prepared for DirectX 12

It turns out that NVIDIA's "Maxwell" architecture has an Achilles' heel after all, one that tilts the scales in favor of AMD's competing Graphics CoreNext (GCN) architecture as the one better prepared for DirectX 12. "Maxwell" lacks support for async compute, one of the three highlight features of Direct3D 12, even as the GeForce driver "exposes" the feature's presence to apps. This came to light when game developer Oxide Games alleged that it was pressured by NVIDIA's marketing department to remove certain features from its "Ashes of the Singularity" DirectX 12 benchmark.

Async Compute is a standardized API-level feature added to Direct3D by Microsoft, which allows an app to better exploit the number-crunching resources of a GPU by submitting compute workloads on a separate queue that can execute concurrently with its graphics rendering tasks. Since NVIDIA's driver tells apps that "Maxwell" GPUs support it, Oxide Games simply built its benchmark with async compute support, but when it attempted to use the feature on Maxwell, the result was an "unmitigated disaster." During the course of its developer correspondence with NVIDIA to try and fix the issue, Oxide learned that "Maxwell" doesn't really support async compute at the bare-metal level, and that NVIDIA's driver bluffs support to apps. NVIDIA instead started pressuring Oxide to remove the parts of its code that use async compute altogether, the developer alleges.
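The performance stakes can be illustrated with a toy model (this is a deliberately simplified sketch, not a description of real driver or GPU scheduling): when async compute genuinely works, compute work overlaps graphics work within a frame; when a driver advertises the feature but silently serializes the queues, the "free" compute work costs a full extra time slice.

```python
# Toy model of why async compute matters. Illustrative only; real GPU
# scheduling is far more complex than this sketch assumes.
def frame_time_ms(graphics_ms: float, compute_ms: float,
                  async_compute: bool) -> float:
    if async_compute:
        # Compute work fills idle execution units while graphics runs,
        # so the frame is bounded by the longer of the two workloads.
        return max(graphics_ms, compute_ms)
    # Without bare-metal async support, the two queues serialize
    # and the compute workload adds directly to the frame time.
    return graphics_ms + compute_ms

# The same workload yields very different frame times:
print(frame_time_ms(12.0, 5.0, async_compute=True))   # 12.0
print(frame_time_ms(12.0, 5.0, async_compute=False))  # 17.0
```

This is why a driver that reports the capability without backing it in hardware can perform worse with async code paths enabled than with them shut off.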
"Personally, I think one could just as easily make the claim that we were biased toward NVIDIA as the only "vendor" specific-code is for NVIDIA where we had to shutdown async compute. By vendor specific, I mean a case where we look at the Vendor ID and make changes to our rendering path. Curiously, their driver reported this feature was functional but attempting to use it was an unmitigated disaster in terms of performance and conformance so we shut it down on their hardware. As far as I know, Maxwell doesn't really have Async Compute so I don't know why their driver was trying to expose that. The only other thing that is different between them is that NVIDIA does fall into Tier 2 class binding hardware instead of Tier 3 like AMD which requires a little bit more CPU overhead in D3D12, but I don't think it ended up being very significant. This isn't a vendor specific path, as it's responding to capabilities the driver reports," writes Oxide, in a statement disputing NVIDIA's "misinformation" about the "Ashes of Singularity" benchmark in its press communications (presumably to VGA reviewers).

Given its growing market share, NVIDIA could use similar tactics to keep game developers away from industry-standard API features that it doesn't support and that rival AMD does. NVIDIA drivers tell Windows that its GPUs support DirectX 12 feature-level 12_1. We wonder how much of that support is faked at the driver level, like async compute. The company is already drawing flak for borderline anti-competitive practices with GameWorks, which effectively creates a walled garden of visual effects that only users of NVIDIA hardware can experience for the same $59 everyone spends on a particular game. Sources: DSOGaming, WCCFTech

196 Comments on Lack of Async Compute on Maxwell Makes AMD GCN Better Prepared for DirectX 12

#101
rtwjunkie
PC Gaming Enthusiast
buggalugs
But the thing is, DX12 is going to take off faster than any other iteration. WIth the big performance gains of DX12, and free windows 10, DX 12 games are going to be everywhere soon. Pascal is going to be at least 6 months away, after Christmas when no doubt some big DX12 games will be released. .......and AMD has priority for HBM2 so this is going to hurt nvidia.

Those people that spent top dollar on highend Nvidia cards recently are going to be disappointed over the coming months as new games are released.

I dont understand why some people are still defending Nvidia, just like they did with the 3.5 GB debacle, nvidia has been dishonest here, if the game developer has gone public, Nvidia must have been assholes by trying to force him to disable the function. They wanted to keep the consumer in the dark.

No wonder Nvidia keeps on pulling stuff like this, their fanboys will always defend them, or maybe Nvidia pays a bunch of people to troll the forums and defend them, it wouldnt surprise me. haha
What big games will I be missing out on in the next 6 months? All the big titles announced so far into next year are DX11. I really don't see the crazy fast DX 12 adoption rate yet.
Posted on Reply
#103
Xzibit
rvalencia
Against that small amateur Beyond3D latency benchmark, refer to https://www.reddit.com/r/nvidia/comments/3i6dks/maxwell_cant_do_vr_well_an_issue_of_latency/ for latency numbers done by professionals.
Not only that, the person who put the story up isn't convinced.
Edit - Some additional info
This program is created by an amateur developer (this is literally his first DX12 program) and there is no consensus in the thread. In fact, a post points out that due to the workload (1 large enqueue operation) the GCN benches are actually running "serial" too (which could explain the strange ~40-50ms overhead on GCN for pure compute). So who knows if v2 of this test is really a good async compute test?
Posted on Reply
#104
EarthDog
buggalugs
But the thing is, DX12 is going to take off faster than any other iteration. WIth the big performance gains of DX12, and free windows 10, DX 12 games are going to be everywhere soon. Pascal is going to be at least 6 months away, after Christmas when no doubt some big DX12 games will be released. .......and AMD has priority for HBM2 so this is going to hurt nvidia.

Those people that spent top dollar on highend Nvidia cards recently are going to be disappointed over the coming months as new games are released.

I dont understand why some people are still defending Nvidia, just like they did with the 3.5 GB debacle, nvidia has been dishonest here, if the game developer has gone public, Nvidia must have been assholes by trying to force him to disable the function. They wanted to keep the consumer in the dark.

No wonder Nvidia keeps on pulling stuff like this, their fanboys will always defend them, or maybe Nvidia pays a bunch of people to troll the forums and defend them, it wouldnt surprise me. haha
You have seen one title perform well in DX12. Outside from that, what other DX12 games are said to be here in the next few months? Perhaps I missed something?

Also, can you link to something that states AMD has priority for HBM2?
Posted on Reply
#105
HumanSmoke
buggalugs
But the thing is, DX12 is going to take off faster than any other iteration. WIth the big performance gains of DX12, and free windows 10, DX 12 games are going to be everywhere soon.
Doubtful. Game developers aren't that energetic. The announced list of DX12 games is actually pretty short...and not every DX12 game uses the same resources - which should be fairly obvious, any more than every DX11 game is identical in its feature set. I'm betting Gears of War Ultimate won't be another AotS, or Fable Legends for that matter
buggalugs
Pascal is going to be at least 6 months away, after Christmas when no doubt some big DX12 games will be released. .......and AMD has priority for HBM2 so this is going to hurt nvidia.
So AMD are going to buy up all the HBM to piss on Nvidia's chips? So, after buying up all SK Hynix's HBM production, how much are they going to borrow to buy up all of Samsung's HBM ? AMD to do a full Nelson Bunker Hunt (substitute memory IC's for silver of course) ! buggalugs for AMD CFO.
Posted on Reply
#106
64K
EarthDog
You have seen one title perform well in DX12. Outside from that, what other DX12 games are said to be here in the next few months? Perhaps I missed something?

Also, can you link to something that states AMD has priority for HBM2?
Around mid July a lot of tech sites started reporting that AMD was rumored to have priority access to HBM2. There's a lot of articles spreading this rumor. wccftech seems to be the origin of the rumor so there's that to consider.
Posted on Reply
#107
HumanSmoke
64K
wccftech seems to be the origin of the rumor so there's that to consider.
I try not to consider WTFtech in any way, shape, or form. Seems like one of those clickbait sites whose links always start with "YOU WILL NEVER BELIEVE..." The comments section seems to be where mental retardation goes to get refresher courses.
Posted on Reply
#108
rvalencia
HumanSmoke
Doubtful. Game developers aren't that energetic. The announced list of DX12 games is actually pretty short...and not every DX12 game uses the same resources - which should be fairly obvious, any more than every DX11 game is identical in its feature set. I'm betting Gears of War Ultimate won't be another AotS, or Fable Legends for that matter
Gears of War Ultimate is a remaster from existing games while Fable Legends DX12 is new game and it was stated to use Async shaders. With DX12, it depends how many independent objects they throw on the screen e.g wide scale destructive physics with it's own individual light source would be similar.

HumanSmoke
I try not to consider WTFtech in any way, shape, or form. Seems like one of those clickbait sites whose links always start with "YOU WILL NEVER BELIEVE..." The comments section seems to be where mental retardation goes to get refresher courses.
The only mental retardation is you. On behalf of posters in the comments section who can't reply against you. I'll take you on. You started the personality based attacks, I'll gladly continue it.

As posted earlier in this thread, the original post was from Oxide i.e. read the full post from
http://www.overclock.net/t/1569897/various-ashes-of-the-singularity-dx12-benchmarks/1200#post_24356995
Posted on Reply
#110
the54thvoid
rvalencia
Gears of War Ultimate is a remaster from existing games while Fable Legends DX12 is new game and it was stated to use Async shaders. With DX12, it depends how many independent objects they throw on the screen e.g wide scale destructive physics with it's own individual light source would be similar.


The only mental retardation is you. On behalf of posters in the comments section who can't reply against you. I'll take you on. You started the personality based attacks, I'll gladly continue it.

As posted earlier in this thread, the original post was from Oxide i.e. read the full post from
http://www.overclock.net/t/1569897/various-ashes-of-the-singularity-dx12-benchmarks/1200#post_24356995
Nah, WCCFTECH comments section is pretty bestial, sorry dude. Makes the worst of TPU look civilised in comparison.
Posted on Reply
#111
cyneater
Meh who cares?

Typical hype bleeding edge crap anyway...

No one seems to remember the GeForce FX and how those were not fully DX9 cards...
By the time games started using DX9 everyone had a GeForce 6xxx or 7xxx.

DX11 came out in 2010?
How many games now use DX 11? A few more but it took a few years....

On top of that ...
Hopefully Steam pulls their head out of their arse and makes SteamOS decent.... And developers start developing for Linux.

As I don't like some of the features in windows 10....
Posted on Reply
#112
Ikaruga
rvalencia
XBO is the baseline DirectX12 GPU and it has two ACE units with 8 queues per unit as per Radeon HD 7790 (GCN 1.1).

The older GCN 1.0 still has two ACE units with 2 queues per unit but it's less capable than GCN 1.1.

GCN 1.0 such as 7970/R9-280X is still better than Fermi and Kepler in the concurrent Async+Render category.

I did not contradict anything in your post addressed to me, but hey, it's a great subject:

This is not how it works, we are talking about performance in games, and not features or tech-demo-like engines abusing a single API feature. Think about the performance impact and usage of DX11.1 or 11.2 in games, because I can't recall it ever making a huge impact in any game, and I read everything from dev documents through Beyond3D to reddit or this forum. There are features which Intel does the best, and one could write a program which would abuse such a feature (let's say ROV) to beat both NV and AMD, yet nobody in their right mind would think that Intel has a chance in games against the big boys. On the subject, Kepler and Fermi might be slower indeed, but don't forget that they had a wider bus which could come in handy to keep up a bit with the younger chips; I think they won't be that bad with DX12 titles, but we will see.
Posted on Reply
#113
Sony Xperia S
cyneater
DX11 came out in 2010?
How many games now use DX 11? A few more but it took a few years....
Yeah, sure, I guess nvidia is one of the main reasons and causes for this stupid inconvenience.

AMD came with DX10.1 back then and what's happened ?!
Our "pigs in the mud" nvidia damaged its progress as well.

When they throw so much money so some titles run better on their hardware, why didn't they even a single time think of the option to make some progress, not to spoil all customers' experience ?!
Posted on Reply
#114
64K
Sony Xperia S
AMD came with DX10.1 back then and what's happened ?!
Our "pigs in the mud" nvidia damaged its progress as well.
Is this what you are referring to?

http://www.anandtech.com/show/2549/7
Posted on Reply
#115
Ikaruga
Sony Xperia S
Yeah, sure, I guess nvidia is one of the main reasons and causes for this stupid inconvenience.

AMD came with DX10.1 back then and what's happened ?!
Our "pigs in the mud" nvidia damaged its progress as well.

When they throw so much money so some titles run better on their hardware, why didn't they even a single time think of the option to make some progress, not to spoil all customers' experience ?!
No offense but do you even realize that all you do is whine... Evil Nvidia did this, evil Nvidia did that... Man the *!#& up!, and stop blaming Nvidia about consoles ruining PC gaming for almost a decade now. Nvidia might be a business monster indeed, I can agree with that, but at least they provide great progression, and they didn't win their market-share lead or their money on the lottery; they actually provided great products to their customers. I'm one of those customers and my only complaint to Nvidia is the high price they ask for their products; other than that, I'm very satisfied. I wish all the best to AMD, if they have better products I will buy those, but I'm pretty happy with my Maxwell atm, it runs everything smooth and fluid thanks.
Posted on Reply
#116
FordGT90Concept
"I go fast!1!11!1!"
HumanSmoke
So AMD are going to buy up all the HBM to piss on Nvidia's chips? So, after buying up all SK Hynix's HBM production, how much are they going to borrow to buy up all of Samsung's HBM ? AMD to do a full Nelson Bunker Hunt (substitute memory IC's for silver of course) ! buggalugs for AMD CFO.

I don't know about that, but it does seem to confirm the theory that Fiji is AWOL because HBM chips are in very short supply until next year. AMD may have gotten async compute right, but their decision to adopt HBM now...doesn't seem like it was a good one. They should have pulled a Skylake, putting a GDDR5 + HBM memory controller on Fiji so they could sell bulk orders of Fiji with GDDR5 and sell HBM models as available. They really shot themselves in the foot by not leaving GDDR5 as an option.


As for HBM-2, I have a sneaking suspicion there will be no mass production of HBM, only mass production of HBM-2. AMD is the only client for HBM, no? Unless SK Hynix can get more buyers, ramping up production doesn't make much sense. The bulk of memory orders are still DDR3, DDR3L, DDR4, and GDDR5. I wonder what SK Hynix said to AMD to get them to sign up. HBM-2 may be amazing but HBM seems like more trouble than it's worth.
Posted on Reply
#117
Sony Xperia S
Ikaruga
I wish all the best to AMD, if they will have better products I will buy those
AMD have always had THE BETTER products (even though that stupid metric FPS might show different) but you are blind to appreciate.

I am going to buy for my friends the R9 280 for 168 euros now.
Posted on Reply
#118
Ikaruga
Sony Xperia S
AMD have always had THE BETTER products (even though that stupid metric FPS might show different) but you are blind to appreciate.

I am going to buy for my friends the R9 280 for 168 euros now.
OK! I'm happy for you, enjoy your new card!
Posted on Reply
#119
Sony Xperia S
Ikaruga
OK! I'm happy for you, enjoy your new card!
I don't need that you be happy for me. Just be honest in front of yourself and the justice in this world.

And it won't be my new card - I will recommend and buy several R9 280 because there is nothing better, for my friends. ;)
Posted on Reply
#120
Ikaruga
Sony Xperia S
I don't need that you be happy for me. Just be honest in front of yourself and the justice in this world.

And it won't be my new card - I will recommend and buy several R9 280 because there is nothing better, for my friends. ;)
I'm always honest. Nvidia is the Apple of GPUs, they are evil, they are greedy, there is almost nothing you could like about them, but they make very good stuff, which works well and also performs well, so they win. If they sucked, nobody would buy their products for decades; the GPU market is not like the music industry, only a very small tech-savvy percentage of the population buys dedicated GPUs, no Justin Biebers can keep themselves on the surface for a long time without actually delivering good stuff.
Posted on Reply
#121
EarthDog
Sony Xperia S
AMD have always had THE BETTER products (even though that stupid metric FPS might show different) but you are blind to appreciate.

I am going to buy for my friends the R9 280 for 168 euros now.
I really hate to feed the nonsensical but, I wonder how you define better...

It can't be in performance /watt...
It can't be in frame time in CFx v SLI...
It can't be in highest FPS/performance...

Bang for your buck? CHECK.
Utilizing technology to get (TOO FAR) ahead of the curve? CHECK.

I'm spent.
Posted on Reply
#122
FordGT90Concept
"I go fast!1!11!1!"
Sony Xperia S
I am going to buy for my friends the R9 280 for 168 euros now.
Beware, I'm hearing about problems with R9 280(X) from all over the place. Specifically, Gigabyte and XFX come up.
Posted on Reply
#123
HumanSmoke
FordGT90Concept
As for HBM-2, I have a sneaking suspicion there will be no mass production of HBM and will only be mass production of HBM-2.
Well the Samsung HBM is second generation. Note the density and bandwidth in the slide I posted earlier.
FordGT90Concept
AMD is the only client for HBM, no?
That seems to be the case. Everyone else seems to be waiting for the technology to become more viable. Higher density of HBM2 means fewer stacks per given capacity >>> smaller interposer required >>> lower production cost and defect rate with lower pin-out. Waiting for HBM2 also means that the memory vendors get more experience with the process, and interposer packaging becomes a more mature process - so a higher production ramp. With SK Hynix the only source initially, I doubt many would have jumped on board. Most IHVs and AIB/AICs would wait for a second source of supply to maintain supply in the eventuality that Hynix couldn't or wasn't able to meet orders.
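The stack arithmetic behind that point can be sketched as follows. The per-stack capacities used here are assumptions based on commonly cited figures (roughly 1 GB per first-generation HBM stack, as on Fiji, versus around 4 GB per early HBM2 stack), so treat the exact numbers as illustrative:

```python
import math

# Fewer, denser stacks -> smaller interposer, fewer interposer defects.
# Per-stack capacities below are assumptions, not vendor-confirmed specs.
def stacks_needed(capacity_gb: int, gb_per_stack: int) -> int:
    return math.ceil(capacity_gb / gb_per_stack)

print(stacks_needed(4, 1))  # HBM1-era Fiji: four 1 GB stacks -> 4
print(stacks_needed(8, 4))  # assumed 4 GB HBM2 stacks: 8 GB in 2 stacks
```

Halving or quartering the stack count is what shrinks the interposer and its pin-out, which is the cost argument for waiting.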
FordGT90Concept
Unless SK Hynix can get more buyers, ramping up production doesn't make much sense.
I get the distinct impression that HBM1 was a proof of concept exercise.
FordGT90Concept
The bulk of memory orders are still DDR3, DDR3L, DDR4, and GDDR5. I wonder what SK Hynix said to AMD to get them to sign up. HBM-2 may be amazing but HBM seems like more trouble than it's worth.
Well, someone had to get the ball rolling and take a hit in the short term to ensure a long term future. Waiting, waiting, waiting for HBM2 to launch before products ship may have bigger implications for SK Hynix ( proof of concept with HBM1 might have been required to get other vendors onboard with HBM2). HBM1 is in all likelihood a small risk for Hynix given the volumes of the other memory you quoted - all the major risk would be assumed by AMD, since without HBM and Fiji having no GDDR5 memory controllers, Fiji would be stillborn.
rvalencia
Gears of War Ultimate is a remaster from existing games while Fable Legends DX12 is new game and it was stated to use Async shaders. With DX12, it depends how many independent objects they throw on the screen
HumanSmoke
The announced list of DX12 games is actually pretty short...and not every DX12 game uses the same resources
So, basically what I just said.
rvalencia
On behalf of posters in the comments section who can't reply against you. I'll take you on. You started the personality based attacks, I'll gladly continue it.
Don't kid yourself, the only reason they aren't here is because they're too busy eating their crayons.
bon appétit
the54thvoid
Nah, WCCFTECH comments section is pretty bestial, sorry dude. Makes the worst of TPU look civilised in comparison.
I think rvalencia is attempting to bridge that divide.
Posted on Reply
#124
FordGT90Concept
"I go fast!1!11!1!"
HumanSmoke
Well, someone had to get the ball rolling and take a hit in the short term to ensure a long term future. Waiting, waiting, waiting for HBM2 to launch before products ship may have bigger implications for SK Hynix ( proof of concept with HBM1 might have been required to get other vendors onboard with HBM2). HBM1 is in all likelihood a small risk for Hynix given the volumes of the other memory you quoted - all the major risk would be assumed by AMD, since without HBM and Fiji having no GDDR5 memory controllers, Fiji would be stillborn.
Indeed, but AMD is the worst company in the world to be taking a gamble like that. I think the console market had more to do with AMD's decision than discrete GPUs. Fiji is just their viability test platform. Maybe AMD expects to ship second-generation APUs for Xbox One and PlayStation 4 with a die shrink and HBM, and expects to be able to pocket the savings instead of Sony and Microsoft.
Posted on Reply
#125
HumanSmoke
FordGT90Concept
Indeed, but AMD is the worst company in the world to be taking a gamble like that. I think the console market had more to do with AMD's decision than discrete GPUs.
Possible, but the console APUs are Sony/MS turf. AMD is the designer. Any deviation in development ultimately is Sony/MS's decision
FordGT90Concept
Fiji is just their viability test platform.
A test platform that represents AMD's only new GPU in the last year (and for the next six months at least). Without Fiji, AMD's lineup is straight up rebrands with some mildly warmed over SKUs added into the mix, whose top model is the 390X (and presumably without Fiji, there would be a 395X2). Maybe not a huge gulf in outright performance, but from a marketing angle AMD would get skinned alive. Their market share without Fiji was in a nose dive.
FordGT90Concept
Maybe AMD expects to ship second-generation APUs for Xbox One and PlayStation 4 with a die shrink and HBM, and expects to be able to pocket the savings instead of Sony and Microsoft.
Devinder Kumar intimated that the APU die shrink would mean AMD's net profit would rise, so it is a fair assumption that any saving in manufacturing cost aids AMD; but even with the APU die and packaging shrink, Kumar expected gross margins to break $20/unit from the $17-18 they presently reside at. Console APUs are still a volume commodity product, and I doubt that Sony/MS would tolerate any delivery slippage due to process/package deviation unless the processes involved were rock solid - especially if the monetary savings are going into AMD's pocket rather than the risk/reward being shared.
Posted on Reply