
Lack of Async Compute on Maxwell Makes AMD GCN Better Prepared for DirectX 12

What are you talking about?? What did I remove, and what am I trying to change? :(

This is the line which introduced Apple:
He said nVidia is "the Apple of...". He wasn't talking about Apple itself. He went on to say (talking about nVidia):
they are evil, they are greedy, there is almost nothing you could like about them, but they make very good stuff, which works well and also performs well, so they win. If they sucked, nobody would buy their products for decades; the GPU market is not like the music industry, only a very small tech-savvy percentage of the population buys dedicated GPUs
You're only digging yourself a deeper hole...
 
He said nVidia is "the Apple of...". He wasn't talking about Apple itself. He went on to say (talking about nVidia)

Really, I never knew, and actually don't want to know, that this fruit, the apple, is so divine. :laugh:

Seriously, how would I have known that, when this is the first time I've heard someone speak like that?
 
Really, I never knew, and actually don't want to know, that this fruit, the apple, is so divine. :laugh:

Seriously, how would I have known that, when this is the first time I've heard someone speak like that?
Then maybe you should learn to read before making assumptions about what people are saying. Considering this is not a thread about Apple, you should have been able to put one and one together to make two.

Remember how I said:
You're only digging yourself a deeper hole...
This was the nice way of me telling you to shut up and stop posting bullshit, but it appears that I needed to spell that out for you.

Well, that all still stands and is only even more relevant now.
 
I am not in a hole, and I don't understand what exactly you are talking about, or how in hell you know what that person meant.

Are you threatening me or what?

My reading skills are ok. I am reading.
What I kindly want to ask you is to leave me alone, stop analysing my posts in such a negative way all the time, and instead try to respect my opinion.
 
I am not in a hole, and I don't understand what exactly you are talking about, or how in hell you know what that person meant.
I can read. It doesn't take a rocket scientist to figure out what he was saying.
My reading skills are ok. I am reading.
Then there is nothing further to discuss because I know English and I understood him just fine.
Are you threatening me or what?
No, just pointing out that you've been pulling the thread off topic because you didn't understand what someone else posted.
What I kindly want to ask you is to leave me alone, stop analysing my posts in such a negative way all the time, and instead try to respect my opinion.
Then maybe you should stay on topic like I said in the first place. If you want to be left alone, a public forum is not the place to be. Calling you out on BS is not persecution, it's called accountability.
 
I think NVIDIA just couldn't be bothered with the driver implementation till now because, frankly, async compute units weren't really needed till now (or shall I say till DX12 games are here). Maybe the drivers "bluff" the support just to prevent crashing if someone happens to try to use it now, and they'll implement it properly at a later time. Until NVIDIA confirms that the GTX 900 series has no async units, I call it BS.

Yeah and how long did it take them to admit the 970 has 3.5GB of VRAM? Heck they still haven't fully fessed up to it.
 
Uh, FYI guys, there is a thread on reddit with an apparent AMD guy saying that NO GPU ON THE MARKET TODAY is fully DX12 compliant. So...

What AMD does, NVidia doesn't. Also, vice versa.

Now, about that rumour that you could use NVidia and AMD GPUs together in the same system... would that somehow overcome these "issues"?
 
It has 4GB of VRAM though... it's just that the last 0.5GB is much slower. ;)
 
Anyways, you guys are so mean. I can't comprehend how it's even possible that such people exist.



Yes, and Image quality CHECK. ;)
Mean? How are we mean when we (I) shower you with facts? I like how you cherry-pick the two good things I mentioned (it was one, actually), but completely disregard the rest yet still think it's better.

Image quality? You need to prove that Sony...

You have your head shoved so far up AMD's ass you are crapping AMD BS human caterpillar style (THAT was the first mean thing I have said) and you don't even know it. Since TPU doesn't seem to want to perma ban this clown, I'm just going to put him on ignore. Have fun with this guy people. I can't take the nonsense anymore and risk getting in trouble myself.
 
Well, from what I can see so far, NVIDIA is capable of doing async compute, just more limited by the queue scheduler. Still need to read further...
 
That's called brainwashing. I have never seen any technological competitive advantages in Apple's products compared to the competition. Actually, the opposite - they break like shit.
For the record, I just installed my PowerColor PCS+ 290X yesterday and first impressions are excellent. FurMark (100% load) only took it to 64C.
 
Well, from what I can see so far, NVIDIA is capable of doing async compute, just more limited by the queue scheduler. Still need to read further...
Maxwell v2 is not capable of concurrent async compute + rendering without incurring context-switch penalties, and it's in this context that Oxide made its remarks.



Uh, FYI guys, there is a thread on reddit with an apparent AMD guy saying that NO GPU ON THE MARKET TODAY is fully DX12 compliant. So...
What AMD does, NVidia doesn't. Also, vice versa.

Now, about that rumour that you could use NVidia and AMD GPUs together in the same system... would that somehow overcome these "issues"?


An Intel Xeon with 18 CPU cores per socket running the DirectX 12 reference driver is the full DirectX 12 renderer. ;)
 
Direct3D has feature levels. 12.0 is baseline DirectX 12 support, which AMD GCN, Intel's iGPUs, and NVIDIA all provide. Maxwell officially supports 12.1, meaning the cards won't freak out if they see 12.1 features, but all NVIDIA cards that support 12.0 take a performance penalty when software uses async compute. They support it, but they do a really bad job of supporting it.

I'm curious if Intel's iGPU takes a performance penalty when using async compute too.
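
For reference, here's a minimal sketch of how an app can ask the D3D12 runtime which feature level a card actually supports, via ID3D12Device::CheckFeatureSupport. It assumes an already-created ID3D12Device; the helper name and the list of requested levels are just for illustration, and error handling is omitted.

[CODE]
#include <windows.h>
#include <d3d12.h>

// Hypothetical helper: ask the driver which Direct3D feature level the
// adapter really supports, independent of what the box advertises.
D3D_FEATURE_LEVEL QueryMaxFeatureLevel(ID3D12Device* device)
{
    const D3D_FEATURE_LEVEL requested[] = {
        D3D_FEATURE_LEVEL_11_0,
        D3D_FEATURE_LEVEL_11_1,
        D3D_FEATURE_LEVEL_12_0,
        D3D_FEATURE_LEVEL_12_1,
    };

    D3D12_FEATURE_DATA_FEATURE_LEVELS levels = {};
    levels.NumFeatureLevels        = static_cast<UINT>(sizeof(requested) / sizeof(requested[0]));
    levels.pFeatureLevelsRequested = requested;

    // The runtime fills MaxSupportedFeatureLevel with the highest level
    // from the requested list that the device can actually run.
    device->CheckFeatureSupport(D3D12_FEATURE_FEATURE_LEVELS,
                                &levels, sizeof(levels));
    return levels.MaxSupportedFeatureLevel;
}
[/CODE]

Note that the feature level only tells you which features are exposed, not how well they perform, which is the whole async compute argument here.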
 
For the record, I just installed my PowerColor PCS+ 290X yesterday and first impressions are excellent. FurMark (100% load) only took it to 64C.

Excellent news! It is great to hear you have a new card.

Why do you stress it with FurMark?
 
To make sure it is stable and the temperatures are reasonable. I'm only keeping it installed for about a week, then I'm going back to the 5870 until I can get my hands on a 6700K. I need to make sure I don't have to RMA it.
 
I can read. It doesn't take a rocket scientist to figure out what he was saying.

Then there is nothing further to discuss because I know English and I understood him just fine.

No, just pointing out that you've been pulling the thread off topic because you didn't understand what someone else posted.

Then maybe you should stay on topic like I said in the first place. If you want to be left alone, a public forum is not the place to be. Calling you out on BS is not persecution, it's called accountability.

lol, I thought "why the hell is @Aquinus triple posting and wtf are these people talking about" then realised - ah it's him. I can't see their posts - still blocked to me - thankfully it seems. I agree with @EarthDog though - should be banned - simple as that.
 
Maxwell v2 is not capable of concurrent async compute + rendering without incurring context-switch penalties, and it's in this context that Oxide made its remarks.
That is a claim presented at the beginning of the article. If you read through to the end, the benchmark shows it is not true (number of queues on the horizontal axis, time spent computing on the vertical axis - lower is better):
[chart: ac_980ti_vs_fury_x.png]

Maxwell is faster than GCN up to 32 queues and only evens out with GCN towards 128 queues, while GCN keeps roughly the same speed all the way up to 128 queues.
It's also shown that with async shaders it's extremely important how they are compiled for each architecture.
Good find @RejZoR
 
Fermi and newer apparently can handle 31 async commands (the time jumps up at 32, 64, 96, and 128) before the scheduler freaks out. GCN can handle 64, at which point it starts straining. Either way, GCN can handle far more async commands than Fermi and newer.

The question is how does this translate to the real world? How many async commands is your average game going to use? 31 or less? 1000s?
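
For context, "async commands" here means work submitted to separate compute queues. Here's a minimal sketch of the D3D12 side of it (device assumed already created; error handling and COM smart pointers omitted; the function name is hypothetical): compute work goes to a COMPUTE-type queue that runs alongside the DIRECT (graphics) queue, and how concurrently the two actually execute is up to the hardware scheduler, which is exactly where Maxwell and GCN appear to differ.

[CODE]
#include <windows.h>
#include <d3d12.h>

// Create one graphics queue and one compute queue on the same device.
void CreateQueues(ID3D12Device* device,
                  ID3D12CommandQueue** graphicsQueue,
                  ID3D12CommandQueue** computeQueue)
{
    D3D12_COMMAND_QUEUE_DESC desc = {};

    // DIRECT queues accept graphics, compute and copy commands.
    desc.Type = D3D12_COMMAND_LIST_TYPE_DIRECT;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(graphicsQueue));

    // COMPUTE queues accept compute and copy commands only; this is the kind
    // of queue the async compute benchmark scales up to 128 of.
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    device->CreateCommandQueue(&desc, IID_PPV_ARGS(computeQueue));
}
[/CODE]

The API lets you create as many compute queues as you like; how gracefully the hardware juggles them is what the 31 vs 64 numbers above are measuring.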
 
Uh, FYI guys, there is a thread on reddit with an apparent AMD guy saying that NO GPU ON THE MARKET TODAY is fully DX12 compliant. So...

What AMD does, NVidia doesn't. Also, vice versa.
This is what I said in this thread a day earlier than that reddit post, but some people are in write-only mode and don't actually read what others are saying.
There is no misinformation at all; most of the DX12 features will be supported in software on most of the cards, and there is no GPU on the market with 100% top-tier DX12 support (and I'm not sure the next generation will be one either, but maybe). This is nothing but a very well-directed marketing campaign to level the field, but I expected more insight into this from some of the TPU vets tbh (I don't mind it btw, AMD needs all the help it can get anyway).
 
The question is how does this translate to the real world? How many async commands is your average game going to use? 31 or less? 1000s?
The answer to that question is the same as the answer to this one: how many different kinds of parallel tasks besides graphics can you imagine in a game? Let's say you don't want to animate the leaves in the forest using only geometry shaders but want a real global wind simulation, so you use a compute shader for that. Next you want wind on the water geometry - do you go with a new async compute shader or append to the existing one? As you can see, the real-world number of simultaneous async compute shaders is the number of different kinds of simulations we are going to use: hair, fluids, rigid bodies, custom GPU-accelerated AI... all of those would benefit from each being in its own async shader, rather than one huge shader with a bunch of branching (there is no branch prediction in GPU cores; even worse, GPU cores often end up executing both if-else paths).
All in all, I'd say 32 is more than enough for gaming... there might be a benefit to more in pure compute.
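
A minimal sketch of that idea (the pipeline states, thread-group counts and the pre-created command list are all hypothetical, and root signature / resource binding is omitted): each simulation kind gets its own small compute shader and its own Dispatch, instead of one huge branching uber-shader.

[CODE]
#include <windows.h>
#include <d3d12.h>

// Record one dispatch per simulation kind on a compute command list.
void RecordSimulations(ID3D12GraphicsCommandList* computeList,
                       ID3D12PipelineState* windPSO,
                       ID3D12PipelineState* hairPSO,
                       ID3D12PipelineState* fluidPSO)
{
    computeList->SetPipelineState(windPSO);
    computeList->Dispatch(64, 1, 1);    // global wind simulation

    computeList->SetPipelineState(hairPSO);
    computeList->Dispatch(32, 1, 1);    // hair simulation

    computeList->SetPipelineState(fluidPSO);
    computeList->Dispatch(128, 1, 1);   // fluid simulation

    computeList->Close();
    // The closed list is then executed on a COMPUTE queue while the DIRECT
    // queue keeps rendering the frame.
}
[/CODE]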
most of the DX12 features will be supported in software on most of the cards, and there is no GPU on the market with 100% top-tier DX12 support (and I'm not sure the next generation will be one either, but maybe)
The point is good and all, but let's not forget that we here are (mostly) very well used to the difference between marketing badges on colorful boxes and spotty support for a new API. Major game engine developers will find a well-supported feature subset on both architectures and use it... hopefully every major engine will have optimized code paths for each architecture and an automatic fallback to DX11. Let's keep our fingers crossed for a couple of years.
[chart: dx12.png]

In August, adoption of Win10 among DX12-capable GPU owners went from 0% to 16.32%... hmm... games using full-blown DX12 features - maybe in a year.
 
The info in that reddit thread is not really the truth.

The main goal of having async compute units is not major parallelization of the workload, but having the GPU compute said workload while still performing rendering tasks, which NVIDIA hardware can't do (all the news floating around seems to indicate so, and the company hasn't addressed the issue in any way, which is pretty much admitting fault).
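
To make the "compute while still rendering" part concrete, here's a minimal sketch of how that overlap is expressed in D3D12 (queues, command lists and fence assumed already created; the function name is hypothetical; error handling omitted). Whether the two queues then genuinely run concurrently or get serialized with context switches is exactly the hardware question being argued about here.

[CODE]
#include <windows.h>
#include <d3d12.h>

// Submit simulation work on the compute queue and rendering on the graphics
// queue, with a GPU-side fence so the CPU never blocks.
void SubmitFrame(ID3D12CommandQueue* graphicsQueue,
                 ID3D12CommandQueue* computeQueue,
                 ID3D12CommandList*  renderList,
                 ID3D12CommandList*  simulationList,
                 ID3D12Fence*        fence,
                 UINT64              fenceValue)
{
    // Kick off the async compute batch.
    computeQueue->ExecuteCommandLists(1, &simulationList);
    computeQueue->Signal(fence, fenceValue);

    // GPU-side wait: for simplicity the whole graphics batch waits on the
    // compute results here; a real engine would split rendering so only the
    // passes that consume those results have to wait.
    graphicsQueue->Wait(fence, fenceValue);
    graphicsQueue->ExecuteCommandLists(1, &renderList);
}
[/CODE]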

Leave that reddit guy with his C source file full of CUDA preprocessor tags alone; it's going nowhere.
 
The point is good and all, but let's not forget that we here are (mostly) very well used to the difference between marketing badges on colorful boxes and spotty support for a new API. Major game engine developers will find a well-supported feature subset on both architectures and use it... hopefully every major engine will have optimized code paths for each architecture and an automatic fallback to DX11. Let's keep our fingers crossed for a couple of years.

In August, adoption of Win10 among DX12-capable GPU owners went from 0% to 16.32%... hmm... games using full-blown DX12 features - maybe in a year.
I agree and did not forget at all, but my conclusions were a little different, as I wrote somewhere here earlier. Anyway, most multiplatform games will most likely pick the features which are fast and available on the major consoles, but this is not the end of the world; you will still be able to play those games, though perhaps you will need to set 1-2 sliders to High instead of Ultra in the options to get optimal performance. Other titles might use GameWorks or an even more direct approach exclusive to the PC, and those will run better on NVIDIA or on AMD depending on the path they take.
The info in that reddit thread is not really the truth.

The main goal of having async compute units is not major parallelization of the workload, but having the GPU compute said workload while still performing rendering tasks, which NVIDIA hardware can't do (all the news floating around seems to indicate so, and the company hasn't addressed the issue in any way, which is pretty much admitting fault).
I don't think that's correct. NVIDIA does have a disadvantage with async compute at the hardware level, but we don't know the performance impact if that gets properly corrected with the help of the driver/CPU (and "properly" means well optimized here), and there are other features the NVIDIA architecture does faster; engines using those might easily gain back what they lost on the async compute part.

We just don't know yet.
 
NV has been pulling cons for so long.
 