Friday, January 11th 2019

AMD Radeon VII Detailed Some More: Die-size, Secret-sauce, Ray-tracing, and More

Jan 11th, 2019 00:39 Discuss (154 Comments)

AMD pulled off a surprise at its CES 2019 keynote address, with the announcement of the Radeon VII client-segment graphics card targeted at gamers. We went hands-on with the card earlier this week. The company revealed a few more technical details of the card in its press-deck. To begin with, the company talks about the immediate dividends of switching from 14 nm to 7 nm, with a reduction in die-size from 495 mm² on the "Vega 10" silicon to 331 mm² on the new "Vega 20" silicon. The company has reworked the die to feature a 4096-bit wide HBM2 memory interface, and the "Vega 20" MCM now features four 32 Gbit HBM2 memory stacks, which make up the card's 16 GB of memory. The memory clock has been dialed up to 1000 MHz from 945 MHz on the RX Vega 64, which, when coupled with the doubled bus-width, works out to a phenomenal 1 TB/s of memory bandwidth.
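The memory and die-size figures above follow from simple arithmetic. Here is a minimal sketch, assuming HBM2 transfers data on both clock edges (two transfers per clock) and that the RX Vega 64 uses a 2048-bit bus, as the "doubled bus-width" implies. The function name is ours, for illustration only.

```python
def hbm2_bandwidth_gbs(bus_width_bits: int, memory_clock_mhz: float) -> float:
    """Peak bandwidth in GB/s, assuming double data rate (2 transfers per clock)."""
    transfers_per_second = memory_clock_mhz * 1e6 * 2
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# Radeon VII: 4096-bit bus at 1000 MHz
radeon_vii_bw = hbm2_bandwidth_gbs(4096, 1000)

# RX Vega 64: assumed 2048-bit bus at 945 MHz
rx_vega_64_bw = hbm2_bandwidth_gbs(2048, 945)

# Four 32 Gbit stacks, converted from gigabits to gigabytes
capacity_gb = 4 * 32 / 8

# Die-area reduction going from 495 mm² ("Vega 10") to 331 mm² ("Vega 20")
die_shrink_pct = (495 - 331) / 495 * 100

print(f"Radeon VII:  {radeon_vii_bw:.0f} GB/s")
print(f"RX Vega 64:  {rx_vega_64_bw:.2f} GB/s")
print(f"Capacity:    {capacity_gb:.0f} GB")
print(f"Die shrink:  {die_shrink_pct:.1f} percent")
```

Under these assumptions the Radeon VII lands at 1024 GB/s, which is where the rounded 1 TB/s figure comes from, and the die shrinks by roughly a third.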

We know from AMD's late-2018 announcement of the Radeon Instinct MI60 machine-learning accelerator based on the same silicon that "Vega 20" features a total of 64 NGCUs (next-generation compute units). To carve out the Radeon VII, AMD disabled 4 of these, resulting in an NGCU count of 60, which is halfway between the RX Vega 56 and RX Vega 64, and a stream-processor count of 3,840. The reduced NGCU count could help AMD harvest the TSMC-built 7 nm GPU die better. AMD is attempting to make up the vast 44 percent performance gap between the RX Vega 64 and the GeForce RTX 2080 with a combination of factors.
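The stream-processor count can be checked with a short sketch, assuming the usual GCN layout of 64 stream processors per compute unit. The constant and helper names are ours.

```python
STREAM_PROCESSORS_PER_CU = 64  # assumption: standard GCN compute-unit width


def stream_processors(compute_units: int) -> int:
    """Total stream processors for a given number of enabled compute units."""
    return compute_units * STREAM_PROCESSORS_PER_CU


full_vega_20_cus = 64
radeon_vii_cus = full_vega_20_cus - 4  # AMD disabled 4 NGCUs

radeon_vii_sps = stream_processors(radeon_vii_cus)
rx_vega_56_sps = stream_processors(56)
rx_vega_64_sps = stream_processors(64)

print(radeon_vii_cus, radeon_vii_sps)
print(rx_vega_56_sps, rx_vega_64_sps)
```

With 60 units enabled, the result sits exactly midway between the RX Vega 56 and RX Vega 64 counts.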

First, AMD appears to be maximizing the clock-speed headroom achieved from the switch to 7 nm. The Radeon VII can boost its engine clock all the way up to 1800 MHz, which may not seem significantly higher than the on-paper 1545 MHz boost frequency of the RX Vega 64, but the Radeon VII probably sustains its boost frequencies better. Second, the slide showing the competitive performance of Radeon VII against the RTX 2080 pins its highest performance gains over the NVIDIA rival in the "Vulkan" title "Strange Brigade," which is known to heavily leverage asynchronous-compute. AMD continues to have a technological upper-hand over NVIDIA in this area. AMD mentions "enhanced" asynchronous-compute for the Radeon VII, which means the company may have improved the ACEs (async-compute engines) on the "Vega 20" silicon, specialized hardware that schedules async-compute workloads among the NGCUs. With its given specs, the Radeon VII has a maximum FP32 throughput of 13.8 TFLOP/s.
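The FP32 figure can be reproduced with a minimal sketch, assuming each stream processor retires one fused multiply-add (counted as 2 floating-point operations) per clock at the peak boost frequency. The function name is ours.

```python
FLOPS_PER_SP_PER_CLOCK = 2  # assumption: one fused multiply-add per clock


def peak_fp32_tflops(stream_processors: int, boost_clock_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOP/s at the given boost clock."""
    flops_per_second = stream_processors * FLOPS_PER_SP_PER_CLOCK * boost_clock_mhz * 1e6
    return flops_per_second / 1e12


radeon_vii_tflops = peak_fp32_tflops(3840, 1800)
rx_vega_64_tflops = peak_fp32_tflops(4096, 1545)

print(f"Radeon VII: {radeon_vii_tflops:.1f} TFLOP/s")
print(f"RX Vega 64: {rx_vega_64_tflops:.1f} TFLOP/s")
```

These are on-paper peaks only. Sustained throughput depends on how long each card actually holds its boost clock.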

The third and most obvious area of improvement is memory. The "Vega 20" silicon is lavishly endowed with 16 GB of "high-bandwidth cache" memory, which, thanks to the doubling in bus-width and increased memory clocks, results in 1 TB/s of memory bandwidth. Such high physical bandwidth could, in theory, allow AMD's designers to get rid of memory compression, which would probably free up some of the GPU's number-crunching resources. The memory size also helps. AMD is once again throwing brute bandwidth at any memory-management issues its architecture may have.

The Radeon VII is being extensively marketed as a competitor to GeForce RTX 2080. NVIDIA holds a competitive edge with its hardware being DirectX Raytracing (DXR) ready, and has even integrated specialized components called RT cores into its "Turing" GPUs. The "Vega 20" continues to lack such components; however, AMD CEO Dr. Lisa Su confirmed at her post-keynote press round-table that the company is working on ray-tracing. "I think ray tracing is important technology; it's something that we're working on as well, from both a hardware/software standpoint."

Responding to a specific question by a reporter on whether AMD has ray-tracing technology, Dr. Su said: "I'm not going to get into a tit for tat, that's just not my style. So I'll tell you that. What I will say is ray tracing is an important technology. It's one of the important technologies; there are lots of other important technologies and you will hear more about what we're doing with ray tracing. You know, we certainly have a lot going on, both hardware and software, as we bring up that entire ecosystem."

One way of reading between the lines would be - and this is speculation on our part - that AMD could be working on retrofitting some of its GPUs powerful enough to handle raytracing with DXR support through a future driver update, as well as working on future generations of GPUs with hardware-acceleration for many of the tasks that are required to get hybrid rasterization to work (adding real-time raytraced objects to rasterized 3D scenes). Just as real-time raytracing is technically possible on "Pascal" even if daunting on the hardware, with good enough work directed at getting a ray-tracing model to work on NGCUs leveraging async-compute, some semblance of GPU-accelerated real-time ray-tracing compatible with DXR could probably be achieved. This is not a part of the feature-set of Radeon VII at launch.

The Radeon VII will be available from 7th February, priced at $699, which is on-par with the SEP of the RTX 2080, despite the lack of real-time raytracing (at least at launch). AMD could steer its developer-relations efforts toward making future titles increasingly reliant on asynchronous compute, the "Vulkan" API, and other technologies its hardware is good at.


154 Comments on AMD Radeon VII Detailed Some More: Die-size, Secret-sauce, Ray-tracing, and More

#26

londiste

Camm: Dedicated Hardware acceleration for RT is a smokescreen IMO, the key is if you can cut down your FP or INT instructions as small as possible and run as many as parallel as possible. AMD does have some FP division capability so its possible that some cards can be retrofitted for RT.

You mean essentially RPM or 4*INT8? Vega brought them into consumer space and got some shiny moments in game performance thanks to it. In the other camp Turing followed suit with including RPM and at least on GPU level 4*INT8 was in Pascal if not earlier.

Nkd: Who said async compute was dead. It was primitive shader and dsbr that never worked on vega, not async compute. I think you are confused here. Never ever AMD said async compute was not supported or dead.

Async is alive and kicking. However its impact is fairly small. Following what is in the news post, Strange Brigade actually gains a few % from Async Compute being enabled, at best. It is definitely a good thing to have but not a game changer. It also works fine enough in both camps by now.

#27

INSTG8R

Vanguard Beta Tester

londiste: You mean essentially RPM or 4*INT8? Vega brought them into consumer space and got some shiny moments in game performance thanks to it. In the other camp Turing followed suit with including RPM and at least on GPU level 4*INT8 was in Pascal if not earlier.

Good point. RPM often gets overlooked. I know FC5 is using it and I’m gonna assume AC Odyssey would too.

#28

londiste

Hugh Mungus: Why is Radeon VII only 7.5% faster in hitman 2?

Isn't Hitman 2 fairly CPU-hungry? AMD's game test results are on i7-7700K.

fynxer: I also have a feeling that AMD may be working secretly with Intel on RayTracing tech to set up a unified standard against nvidias RTX.

Bullshit. DXR in DX12 and Vulkan-RT extensions is as standard as it gets. AMD will do their implementation of these if they know what is good for them (and extend on these if necessary).
Unless you are implying AMD would either try shoving RT into DX11 or into something proprietary? :D

#29

Midland Dog

Pixrazor: still 64 ROPs damn...
so just Vega but at 7nm and more hbm2 to boost up the price.
where is Navi??

apparently its 128

#30

Kaotik

londiste: Isn't Hitman 2 fairly CPU-hungry? AMD's game test results are on i7-7700K.

Bullshit. DXR in DX12 and Vulkan-RT extensions is as standard as it gets. AMD will do their implementation of these if they know what is good for them (and extend on these if necessary).
Unless you are implying AMD would either try shoving RT into DX11 or into something proprietary? :D

Last time I checked Vulkan doesn't have official RT-extensions at this time, NVIDIA was at least trying to push their solution [essentially RTX] as standard but at least so far that hasn't happened to my knowledge

Midland Dog: apparently its 128

Unless proven otherwise it should be 64, as the Vega 20 diagrams from Instinct release clearly show 4 Pixel Engines per Shader Engine.
I think the 64/128 confusion comes from the fact that NVIDIA cards have their ROPs tied to memory controllers, so doubling memory controllers should double the ROPs and same logic is applied to Vega on some sites, even though in AMDs case the two aren't tied together

#31

Midland Dog

Kaotik: Last time I checked Vulkan doesn't have official RT-extensions at this time, NVIDIA was at least trying to push their solution [essentially RTX] as standard but at least so far that hasn't happened to my knowledge

Unless proven otherwise it should be 64, as the Vega 20 diagrams from Instinct release clearly show 4 Pixel Engines per Shader Engine.
I think the 64/128 confusion comes from the fact that NVIDIA cards have their ROPs tied to memory controllers, so doubling memory controllers should double the ROPs and same logic is applied to Vega on some sites, even though in AMDs case the two aren't tied together

pretty sure GN said 128

#32

INSTG8R

Vanguard Beta Tester

Kaotik: Last time I checked Vulkan doesn't have official RT-extensions at this time, NVIDIA was at least trying to push their solution [essentially RTX] as standard but at least so far that hasn't happened to my knowledge

Unless proven otherwise it should be 64, as the Vega 20 diagrams from Instinct release clearly show 4 Pixel Engines per Shader Engine.
I think the 64/128 confusion comes from the fact that NVIDIA cards have their ROPs tied to memory controllers, so doubling memory controllers should double the ROPs and same logic is applied to Vega on some sites, even though in AMDs case the two aren't tied together

Actually the ROPS are tied to the memory which is why it has 16GB. I can’t provide a direct source for this quote but it rings true
"Unfortunately, you can't scale down the HBM2 any further and still retain the 128 ROPs, so 16 GB is the smallest capacity AMD can offer, which is why the pricepoint on this is so close relative to the 2080."

#33

Kaotik

Midland Dog: pretty sure GN said 128

I'm aware many have said 128, but none of the press material provided by AMD suggests such and Radeon Instinct block diagrams say 64, so until AMD itself says 128 or we get benchmarks showing ROP capabilities past 64 units, 64 is the more probable option

INSTG8R: Actually the ROPS are tied to the memory which is why it has 16GB. I can’t provide a direct source for this quote but it rings true
"Unfortunately, you can't scale down the HBM2 any further and still retain the 128 ROPs, so 16 GB is the smallest capacity AMD can offer, which is why the pricepoint on this is so close relative to the 2080."

I'm pretty sure they're not in AMDs case, they're just assuming it because they're tied on NVIDIA and most AMD chips use same ROP:Memory Controller -ratio. Fiji for example has 4096-bit HBM memory controller and 64 ROPs, while it should have 128 if the memory controllers and ROPs were tied together. Also, with Vegas HBCC they're even less connected than before, they're actually behind Infinity Fabric -bus now.

For the 16 GB, it's the smallest capacity you can have with 4096-bit HBM2 because no-one makes smaller than 4GB HBM2-stacks.

#34

londiste

Kaotik: Last time I checked Vulkan doesn't have official RT-extensions at this time, NVIDIA was at least trying to push their solution [essentially RTX] as standard but at least so far that hasn't happened to my knowledge

You are right, my bad. There are only NV_raytracing extensions for Vulkan that went out of beta. The official answer was that Vulkan allows doing RT already.
I thought Vulkan was supposed to improve on how (badly) OpenGL dealt with extensions :(

#35

INSTG8R

Vanguard Beta Tester

Kaotik: I'm aware many have said 128, but none of the press material provided by AMD suggests such and Radeon Instinct block diagrams say 64, so until AMD itself says 128 or we get benchmarks showing ROP capabilities past 64 units, 64 is the more probable option

I'm pretty sure they're not in AMDs case, they're just assuming it because they're tied on NVIDIA and most AMD chips use same ROP:Memory Controller -ratio. Fiji for example has 4096-bit HBM memory controller and 64 ROPs, while it should have 128 if the memory controllers and ROPs were tied together. Also, with Vegas HBCC they're even less connected than before, they're actually behind Infinity Fabric -bus now.

For the 16 GB, it's the smallest capacity you can have with 4096-bit HBM2 because no-one makes smaller than 4GB HBM2-stacks.

We really just need to wait for proper product spec sheets at this point because right now both are being tossed around and nobody seems to have a concrete answer.

#36

Rahmat Sofyan

hats off for Dr.Lisa Su ..

still calm and responded with great answers ..

DXR not ready yet 100%, still plenty of time for RTG and AMD to get ready.

my RX 570 and RX 480 still Okay

#37

renz496

fynxer: Good luck with RayTracing in software, if that was viable we would have had that already. If they do it it is just a desperate move not to look obsolete.

Do not expect RayTracing in hardware until end of 2020 and even then they will be years behind nVidia who will, by that time, be in the process of readying their third gen RTX cards for release.

We need Intel to enter the market with RayTracing from the get go in 2020.

I also have a feeling that AMD may be working secretly with Intel on RayTracing tech to set up a unified standard against nvidias RTX.

RTX is just nvidia fancy name for their hardware implementation. just like when they call their tessellation engine as "polymorph engine". the unified standard for ray tracing already exist in DirectX called DXR. RTX is not some exclusive API like mantle where it can only run on certain hardware. since DXR is the standard in DirectX AMD and Intel will have to follow that standard instead of coming out with new standard.

#38

Assimilator

Probably the most useful thing about this card is that, if it is able to perform anywhere near the RTX 2080, it might well induce NVIDIA to drop the latter's price. Considering how expensive Vega 56/64 were, and remain, I'm pretty sure NVIDIA has a lot more wiggle-room in terms of pricing - and they surely would love to shut AMD out from the high-end GPU market completely, because that would guarantee them an effective monopoly on that market segment going forward.

tl;dr NVIDIA might well be willing to drop the price on RTX 2080 to allow them to hike the price on RTX 3000 and all its descendants.

#39

Camm

Assimilator: tl;dr NVIDIA might well be willing to drop the price on RTX 2080 to allow them to hike the price on RTX 3000 and all its descendants.

RTX die sizes are huge, I dont think there is much wiggle room at all. Conversely, Vega VII die is much smaller, but HBM is still expensive, so AMD probably doesn't have much room either.

Interesting times.

#40

Aquinus

Resident Wat-man

Assimilator: Probably the most useful thing about this card is that, if it is able to perform anywhere near the RTX 2080, it might well induce NVIDIA to drop the latter's price. Considering how expensive Vega 56/64 were, and remain, I'm pretty sure NVIDIA has a lot more wiggle-room in terms of pricing - and they surely would love to shut AMD out from the high-end GPU market completely, because that would guarantee them an effective monopoly on that market segment going forward.

tl;dr NVIDIA might well be willing to drop the price on RTX 2080 to allow them to hike the price on RTX 3000 and all its descendants.

That's predicated on the idea that nVidia's RTX offerings aren't having yield issues which I find hard to believe for the 2080 Ti. As for the 2080, I'm not sure, but it's still a pretty good size die (bigger than a Vega 64.) Honestly, I think nVidia's problem is old inventory. Between that and the less than stellar reception of the RTX chips, investors were not amused.

#41

Mysteoa

Assimilator: Probably the most useful thing about this card is that, if it is able to perform anywhere near the RTX 2080, it might well induce NVIDIA to drop the latter's price.

The price for RTX 2080 is around 750€ where I live and Radeon 7 will probably cost more than 700€ initially.

#42

btarunr

Editor & Senior Moderator

Xzibit: Here is a better look

I still only count 64 ROPs in that graphic, since each "RB" (render backend) crunches 4 pixels per clock.

#43

Keullo-e

S.T.A.R.S.

Weird that there's just 3840 shaders o_O

#44

Vya Domus

That die size is pretty small and also not all of it is enabled, for the first time in many years AMD has a card that likely has better margins than Nvidia's equivalent. That's a pretty big deal.

#45

ValenOne

Apocalypsee: I highly doubt it have 128ROPs. If it did have 64 ROPs then old Vega56/64 are memory bandwidth starved. Then again, all AMD recent GPU are bandwidth starved for example RX470 that uses the same memory as RX480 performs very close to it. GCN is reaching its limits, its good for compute but not as a gaming card. They need to put more than 4 Shader Engines, which in return increase geometry units and number of ROPs.

From www.techpowerup.com/gpu-specs/radeon-rx-vega-m-gh.c3056
Recent "RX Vega M GH" has 64 ROPS

#46

Fluffmeister

According to AMD the cost of 7nm is significant, with 16 GB of HBM2 I can't imagine it's cheap for them, but I assume they are at least making some money.

#47

Unregistered

AMD does have pretty much all of the gaming space to consider when developing and implementing new features like RT. Having a half-@$$ noisy hybrid ray tracing implementation with little traction like seen presently wouldn't do them, Sony, or Microsoft any favours, nor impress them IMO.

Hopefully AMD learned this from their Async push, which is still great technology, but the software ecosystem wasn't ready a few years ago.

#48

FordGT90Concept

"I go fast!1!11!1!"

Fluffmeister: According to AMD the cost of 7nm is significant, with 16 GB of HBM2 I can't imagine it's cheap for them, but I assume they are at least making some money.

Which is why Vega 20 isn't bigger than Vega 10. I think Huang's explosion is because he realizes he made a "big" mistake with Turing. AMD is focusing on where the money is at, not winning performance crowns that mean little in the larger context of things. Turing is substantially larger (and more costly to produce) than even Vega 10 is.

On topic, Vega 20 doesn't really impress but it really wasn't intended to impress either. Vega 7nm w/ Fiji memory bandwidth.

#49

Aquinus

Resident Wat-man

FordGT90Concept: Vega 7nm w/ Fiji memory bandwidth.

I don't think that the added bandwidth is going to make a difference on this chip, that was just a side-effect of putting 16GB on it. I've played around with my own Vega 64 a bit to realize that not only does HBM overclock really well, it also makes practically zero difference in terms of performance, even with a 20% overclock on it. Vega was never starved for memory bandwidth and to me, that says this is totally about capacity. If power consumption could be improved, that would go a long way for Vega.

#50

ssdpro

The news of this article is Radeon VII=$699 2080=$699 for on par performance if you ignore ray tracing/tensor. That exact match pricing for on par minus a few features is not AMD's modus operandi. This needs to be $499-549.
