
AnandTech Review Sums Up Pascal's "Async Compute" Truth

RX 480 gaining what, 30% in Doom (Vulkan). I wouldn't call that "small".

Well, that gain is not all because of async compute (the gain from toggling async on/off under Vulkan is about 5-15%, depending on the system and the place in the game). Doom is the first game to take advantage of GCN shader intrinsics (close to metal = console-like), which really gives it a performance boost. Nvidia has no alternative, as it uses GLSL shaders for Vulkan.
 
The boost in Vulkan comes from AMD's long frame render time using OpenGL. Vulkan brings AMD cards up where they should be when running OpenGL (we're talking something massive like 20-30% here). OpenGL to Vulkan should only get about a 5-10% performance boost. Async compute is also 5-10% on AMD cards depending on how much of the GPU is idle.

Vulkan fixes problems AMD never bothered to (or couldn't) address in OpenGL.
 
http://www.hardocp.com/article/2016...x_1060_founders_edition_review/5#.V5D1JDW1UhQ

DX12 and Vulkan make an RX 480 perform like a 980 Ti, which is essentially a GTX 1070.

We know that.

But that's not the point of the thread. Still, this thread is going to degenerate into another 'my band's bigger than yours' thread.

The OP was talking about why Nvidia abandoned compute-style hardware and AMD didn't. It's not about which is better - it's about why. And you're long enough in the tooth to know how much we AMD (sorry, ATI) owners slagged off Fermi for its power consumption back in the day. Nvidia reacted to that, dropped a lot of compute and DP, and went for frequency and efficiency over sheer hardware. AMD plugged away at hardware (look at the transistor and shader counts), and you have a situation where, yes, the RX 480 does well under Vulkan, but it consumes nearly the same power as a GTX 1070.
Moreover - that's one title. ONE title. Nvidia gambled on DX11 being around for a while longer.

All DX12 is doing is levelling the current hardware playing field (which is great). Choose AMD for DX12 specialisation (developer dependent) or choose Nvidia for the bulk of current DX11 (and future DX11 games). Hell, Pascal is fast enough that it still works with DX12 (just a fraction under the RX 480 in Hitman - an AMD async-heavy game).

So please, let's not do a 'this card is better than that card' debate. It'd be great if it got more technical - like why @theoneandonlymrk should be folding on 390X's.

[attached folding performance chart: 82896.png]
 
You've piqued my interest.

Why is Fiji so bad at folding compared with Hawaii and Tahiti?
 
No I shouldn't, and I would have if 700 watts applied to folding was what I was after. My rig is made for 3 GPUs, and so it will be one day, and it will use less than 700 watts for 3 cards doing more for less.
The 480 is on par in actual folding performance with a 390X while using half the power, though I'd agree the 390X looks good on paper - Folding@home isn't solely double-precision intense, it seems.
Off topic, I apologise, but I was merely answering your question. It's clear, though, that my card's lack of ACEs has a cost.
That cost applies to Nvidia more, as they have no ACE equivalent, and their PolyMorph engine is clearly optimized purely for graphics work. Nvidia stripped all the allegedly unnecessary bits out of its low- to high-end GPUs to fit fewer but more robust shader arrays in place, then clocked the snot out of them. Fair enough, and for five years it hasn't hurt them, but times change and the next five will see a different approach from them, I'm sure.
No dev in their right mind fully ignores compliant hardware across multiple formats, so DX12 and Vulkan will gain support relatively quickly compared to DX11, IMHO, as DX11 was single-platform based yet did get used eventually.
As for some comments regarding Nvidia catching up to AMD's tessellation performance, that depends on your outlook, I suppose, because to catch up to AMD's 4th-gen arch tessellation performance Nvidia had to over-egg it - hence some of the GTX 480-580 heat. That is, Nvidia uses much more silicon real estate to get the job done than AMD ever will, and the same will apply to async compute. Nvidia's next new architecture is going to knock some sideways with its lack of efficiency gains, IMHO.
 

It's a double precision test. Fiji's FP32:FP64 ratio is 16:1, Hawaii's is 8:1 and Tahiti's is 4:1 (for Radeons, not FirePros, obviously). Thus the double precision numbers for the cards are Fiji: 2*1.05*4096/16 = 537.6 GFLOPS, Hawaii: 2*1*2816/8 = 704 GFLOPS and Tahiti: 2*0.925*2048/4 = 947.2 GFLOPS. So looking at pure shader power, Tahiti should be on top in that test, but peak compute power is always hard to reach in real-world applications.
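The peak-FP64 arithmetic above can be reproduced with a short script (clock speeds in GHz and shader counts as quoted in the post; the factor of 2 is the two FLOPs a fused multiply-add counts per ALU per clock):

```python
# Peak FP64 throughput = 2 FLOPs/clock (FMA) * clock (GHz) * shaders / FP64 ratio
def peak_fp64_gflops(clock_ghz, shaders, fp64_ratio):
    return 2 * clock_ghz * shaders / fp64_ratio

cards = {
    "Fiji":   (1.05,  4096, 16),  # Fury X: 16:1 FP32:FP64 ratio
    "Hawaii": (1.0,   2816, 8),   # 390X:   8:1
    "Tahiti": (0.925, 2048, 4),   # 7970:   4:1
}
for name, (clk, sp, ratio) in cards.items():
    print(f"{name}: {peak_fp64_gflops(clk, sp, ratio):.1f} GFLOPS")
# Fiji: 537.6 GFLOPS, Hawaii: 704.0 GFLOPS, Tahiti: 947.2 GFLOPS
```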
 

Well, yeah, of course you're right about that.

But Vulkan is a low-level API: you can do a simple port that works on everything and get the lower CPU overhead benefits (or, in bad ports, not). The thing is, you can still get much more out of it using vendor-specific extensions (AMD: VK_AMD_rasterization_order, VK_AMD_shader_trinary_minmax, VK_AMD_shader_explicit_vertex_parameter, VK_AMD_gcn_shader; Nvidia: VK_NV_glsl_shader, VK_NV_dedicated_allocation). Right now AMD gives you low-level extensions for its own architecture, while Nvidia just gives you its higher-level OpenGL abstraction. Until Nvidia opens up its own architecture, I would say AMD will have the upper hand on Vulkan (if the game developer uses AMD's extensions).
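A rough sketch of how an engine might pick a shader path from the extensions the driver reports: the extension names are the real Vulkan extensions listed above, but the `select_shader_path` helper and the hard-coded extension sets are hypothetical illustrations, not a real Vulkan binding.

```python
# Toy dispatch: prefer a GCN-intrinsics shader path when AMD's low-level
# extensions are all exposed; fall back to a portable path otherwise.
AMD_FAST_PATH_EXTS = {"VK_AMD_gcn_shader", "VK_AMD_shader_trinary_minmax"}

def select_shader_path(device_extensions):
    """device_extensions: set of extension name strings the driver reports."""
    if AMD_FAST_PATH_EXTS <= set(device_extensions):
        return "gcn_intrinsics"   # close-to-metal path, AMD-only
    return "portable"             # works on every conformant driver

print(select_shader_path({"VK_AMD_gcn_shader",
                          "VK_AMD_shader_trinary_minmax",
                          "VK_AMD_rasterization_order"}))  # gcn_intrinsics
print(select_shader_path({"VK_NV_glsl_shader"}))           # portable
```

This is the pattern the post describes: the portable path runs everywhere, while the AMD-extension path is an opt-in fast path the developer has to write separately.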

id Software will reveal more about Doom's Vulkan renderer and its use of GCN-specific extensions at this year's SIGGRAPH 2016, which starts next Sunday:
The devil is in the details: idTech 666



Abstract: A behind-the-scenes look into the latest renderer technology powering the critically acclaimed DOOM. The lecture will cover how technology was designed for balancing a good visual quality and performance ratio. Numerous topics will be covered, among them details about the lighting solution, techniques for decoupling costs frequency and GCN specific approaches.

Presenters:
Tiago Sousa (id Software), Jean Geffroy (id Software)
 
Reference from Original Review article:
http://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/9

Regarding Maxwell 2 gen cards (980/970)


Regarding Pascal's improvement in terms of Async Compute



So yes, Nvidia did try to improve async compute from Maxwell 2's non-existent state to something better. However, under careful analysis it is predicted that such a small change won't provide much help in the DX12/Vulkan age.

Regarding AMD:




In the end it boils down to how soon Nvidia chooses to make the shift from the old ways to unified ALUs.


Then what is pre-emption?




Nvidia tried to cater to current-gen and older games by implementing a more traditional GPU design (that is, one suitable for DX11) to ensure maximum performance gain with minimum cost of R&D as well as production. At the same time, Nvidia introduced some new workarounds for DX12/Vulkan titles that are compute heavy. In this way you get happy customers and a great return on revenue for the company. People will buy new cards down the road for DX12/Vulkan titles anyway. If I were working for Nvidia, this would be a safe and sound strategy for GPU development for sure.

The generation that comes after Pascal will probably see drastic design changes compared to Maxwell-Maxwell 2-Pascal. Nvidia will shift heavily towards hardware-level async compute and come out with a super robust design. And maybe by that time the entire tessellation situation will repeat itself.

AMD, on the other hand, has been shifting towards async compute for a long time. I strongly suspect this is related to console development. After all, what Sony or MS want is hardware they can milk for a long time without worrying about the graphics/visuals lagging too far behind PC. DX12/Vulkan helps unlock the maximum potential of the hardware, which is great for console development.

My verdict:
Nvidia focuses more on the PC consumer experience, which always aims for the best performance during the GPU's useful life span.

AMD focuses more on consoles; PC GPU design may just be tagging along. Sometimes their assumed "futuristic design" may help in new games. However, as soon as Nvidia shifts in the same direction, AMD's design becomes obsolete.
Completely agree, it's Nvidia's focus vs AMD's focus, and quite frankly I do like the idea of the "future" more at times because I don't replace my cards every year. Unfortunately, one is very obviously winning, even if there are big benefits to each approach. But that idea of always focusing on the "future" is not the biggest problem this round.

The problem this round still comes down to one major thing: without a card in the higher areas of the market, it does not matter where their focus is, async compute or not, because you're only pandering to the middle ground and lower (granted, that's a huge majority area). Right now, since there is nothing competing against the 1070/80 and probably won't be for the foreseeable future (at least before December, unless something drastic changes), there is nothing that is going to help their reputation and ever-shrinking market share, especially because the GTX 1060 competes very well in most areas against the RX 480 (or better). Even with async compute and Vulkan getting big gains for AMD - even if, say, that put the card right at a GTX 1070 - that's still not going to be every game running like that, which still puts it too far down the list to be something truly special.

When it comes down to it, all that has happened is a waiting game to see what else can happen. Frankly, with this situation, I don't have high hopes anymore for anything changing unless either a miracle occurs for AMD or Radeon graphics gets bought. This is unfortunate, because it makes things boring since we already know what our choices are going to be... at least on the high end.
 
I still think most of the problems lie in game development, not on the hardware end. Right now most developers are getting endorsement from Nvidia, and that means they probably also have better insight into what hardware features are going to be needed in the near future. DX12 so far has been an insignificant factor, as much so as 6- or 8-core CPUs.
 

With the way GCN is designed, there is no way for Vega to compete against Pascal in current-gen games. AMD can stuff up to 8192 ALUs into Vega and make it run at 2 GHz. However, without developer support to properly utilize the massive number of ALUs with the async engines, AMD is still going to fail.
 

I dealt with other software, not games, but if a core is dedicated to the OS, another to the player and a third to the enemies, that gives 3 cores for parallel processing!
 
It only matters what IS being done in game development, not what CAN be done. Looking at the current state of things, it's questionable whether PC gaming will EVER live up to the capability of the hardware. Sadly it's a very political thing, controlled mostly by console manufacturers.
 
Take home message of my entire discussion:

If I were to run nvidia, I would totally NOT implement async even for Pascal.
 
Pascal can't because the hardware can't handle it. They need a new architecture with something like GCN's Asynchronous Compute Engine: something that finds the idle shaders and puts them to work on something else.
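The "find idle shaders and put them to work" idea can be illustrated with a toy model (purely a sketch of the concept, not how any real GPU scheduler works; the unit counts and per-cycle loads are made up): when graphics work leaves some units idle each cycle, a second compute queue backfills the gap and overall utilization rises.

```python
# Toy model: a GPU with `units` execution units. Each "cycle", graphics
# occupies some of them; an async compute queue backfills whatever is idle.
def utilization(graphics_load, units=64, async_on=True):
    """graphics_load: list of units busy with graphics on each cycle."""
    busy = 0
    for g in graphics_load:
        g = min(g, units)                          # can't exceed the GPU
        busy += g + ((units - g) if async_on else 0)
    return busy / (units * len(graphics_load))

load = [40, 64, 20, 55, 30]                 # graphics alone leaves idle units
print(utilization(load, async_on=False))    # graphics-only utilization (~0.65)
print(utilization(load, async_on=True))     # 1.0 -- idle units backfilled
```

The point of the toy: async compute doesn't make graphics faster, it just stops the idle units from being wasted, which is why the gain depends on how much of the GPU would otherwise sit idle.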
 