Thursday, January 14th 2021

AMD Talks Zen 4 and RDNA 3, Promises to Offer Extremely Competitive Products

AMD is always in development mode: just as it launches a new product, the company is already gearing up for the next generation of devices. Just a few months ago, back in November, AMD launched its Zen 3 core, and today we get to hear about the next steps the company is taking to stay competitive and grow its product portfolio. From the AnandTech interview with Dr. Lisa Su and The Street interview with Rick Bergman, the EVP of AMD's Computing and Graphics Business Group, we have gathered information about AMD's plans for Zen 4 core development and RDNA 3 performance targets.

Starting with Zen 4, AMD plans to migrate to the AM5 platform, bringing support for the new DDR5 and USB 4.0 standards. The current aim is for Zen 4 to be extremely competitive against rival products and to bring many IPC improvements. Just as Zen 3 used many small advances in cache structures, branch prediction, and pipelines, Zen 4 aims to achieve a similar thing with its debut. The state of the x86 architecture offers little room for improvement in any single area; however, when advances are made in many places they add up quite well, as we saw with the 19% IPC improvement of Zen 3 over the previous-generation Zen 2 core. As the new core will use TSMC's advanced 5 nm process, there is a possibility of even more cores inside the CCX/CCD complexes. We are expecting to see Zen 4 sometime close to the end of 2021.
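To illustrate how many small advances can add up, here is a minimal sketch. The per-area gains are hypothetical placeholders, not AMD's published breakdown, and the sketch assumes the gains are independent and compound multiplicatively, which real microarchitectural changes need not do.

```python
from math import prod

# Hypothetical per-area IPC gains, chosen only for illustration.
# They are NOT AMD's published Zen 3 breakdown.
HYPOTHETICAL_GAINS = {
    "cache structures": 0.04,
    "branch prediction": 0.03,
    "front end": 0.04,
    "execution pipelines": 0.03,
    "load/store": 0.035,
}


def combined_uplift(gains):
    """Compound independent fractional gains into one overall uplift."""
    return prod(1.0 + g for g in gains) - 1.0


uplift = combined_uplift(HYPOTHETICAL_GAINS.values())
print(f"sum of gains:    {sum(HYPOTHETICAL_GAINS.values()):.1%}")
print(f"compounded gain: {uplift:.1%}")
```

Under these assumed numbers, five gains of 3 to 4% each compound to a little under 19%, slightly more than their plain sum.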
When it comes to RDNA 3, the company plans to offer an architecture with high performance-per-watt. Just as AMD improved the performance-per-watt of RDNA 2, it plans to do the same with RDNA 3, putting the efficiency of the architecture first and making it high-performance for any possible task.
Sources: AnandTech, The Street, via WCCFTech

62 Comments on AMD Talks Zen 4 and RDNA 3, Promises to Offer Extremely Competitive Products

#1
HD64G
I cannot "see" any big IPC improvements from now on. 10% max from gen to gen is my prediction. Zen3 made a huge jump. Clocks and efficiency will determine the progress until new materials for transistors are used that will allow big clock jumps (graphite anyone?).

As for RDNA3, multiple dies on a GPU would be the next big thing for GPUs if connected properly to have low latency. Other than that, clocks and efficiency will be the main progress target as usual.

To sum it up, TSMC's manufacturing progress and capacity will be the limiting factor for the PC sector's performance progress.
#2
ixi
Maybe, just maybe, I'll hold out till AM5 comes out... A few successful months without a PC, hehe.

Would be cool if they really release Zen 4 this time.
#3
Mathragh
HD64G
I cannot "see" any big IPC improvements from now on. 10% max from gen to gen is my prediction. Zen3 made a huge jump. Clocks and efficiency will determine the progress until new materials for transistors are used that will allow big clock jumps (graphite anyone?).

As for RDNA3, multiple dies on a GPU would be the next big thing for GPUs if connected properly to have low latency. Other than that, clocks and efficiency will be the main progress target as usual.

To sum it up, TSMC's manufacturing progress and capacity will be the limiting factor for the PC sector's performance progress.
One only has to take a look at the recently released M1 chip by Apple to see that big IPC improvements are apparently still quite doable. The M1 can keep up with or even exceed Intel's and AMD's best at single-threaded performance, while the M1's clock speed is quite a bit lower.
#4
stimpy88
I would expect a 10-15% IPC increase over Zen3+, a new IO die, PCI-Express 5.0, DDR5, a possible core count increase and/or improvements to SMT. I don't expect much in the way of higher peak clockspeeds, but maybe a minor bump to the all cores clockspeed. But this should be an exciting update, and a solid new platform in AM5 for AMD to build on.

On the graphics front, I expect AMD to go all-in on ray tracing and geometry this iteration. I expect them to match, or slightly beat, Nvidia's current RTX gen. Nvidia will retake the RT perf crown with whatever architecture they have out next, but the RT wars will properly start in 2021.
#5
TheLostSwede
Mathragh
One only has to take a look at the recently released M1 chip by Apple to see that big IPC improvements are apparently still quite doable. The M1 can keep up with or even exceed Intel's and AMD's best at single-threaded performance, while the M1's clock speed is quite a bit lower.
Sorry, but that's not IPC. The actual CPU cores aren't that fast; the reason these SoCs keep up is the help of lots and lots of accelerators that speed up tasks where the CPU cores are too slow to keep up. By utilising task-specific co-processors (as do almost all ARM CPUs), it's possible to offer good system performance without having great IPC.
#6
Dredi
TheLostSwede
Sorry, but that's not IPC. The actual CPU cores aren't that fast; the reason these SoCs keep up is the help of lots and lots of accelerators that speed up tasks where the CPU cores are too slow to keep up. By utilising task-specific co-processors (as do almost all ARM CPUs), it's possible to offer good system performance without having great IPC.
You are incorrect in your assumption. The M1 performs as well as intel’s fastest in most single threaded tasks even without the accelerators. Read the anandtech article about it: www.anandtech.com/show/16252/mac-mini-apple-m1-tested
What the accelerators do enable is truly excellent performance per watt in select use cases like watching videos or video calls, or encoding stuff.
#7
Mathragh
TheLostSwede
Sorry, but that's not IPC. The actual CPU cores aren't that fast; the reason these SoCs keep up is the help of lots and lots of accelerators that speed up tasks where the CPU cores are too slow to keep up. By utilising task-specific co-processors (as do almost all ARM CPUs), it's possible to offer good system performance without having great IPC.
Okay, but where would you draw the line then? Isn't something like accelerating AES cryptography workloads, or AVX-something, or that special integer operation making Smart Access Memory worthwhile in the newest Zen3 cores just the same thing?
#8
ncrs
TheLostSwede
Sorry, but that's not IPC. The actual CPU cores aren't that fast; the reason these SoCs keep up is the help of lots and lots of accelerators that speed up tasks where the CPU cores are too slow to keep up. By utilising task-specific co-processors (as do almost all ARM CPUs), it's possible to offer good system performance without having great IPC.
So you're saying Apple has magic technology that makes general purpose code run on fixed-function hardware accelerators? Or did they tune their chip specifically for GeekBench? ;)
#9
Dredi
ncrs
So you're saying Apple has magic technology that makes general purpose code run on fixed-function hardware accelerators? Or did they tune their chip specifically for GeekBench? ;)
Or maybe they built a general purpose accelerator :toast:
#10
HABO
Mathragh
One only has to take a look at the recently released M1 chip by Apple to see that big IPC improvements are apparently still quite doable. The M1 can keep up with or even exceed Intel's and AMD's best at single-threaded performance, while the M1's clock speed is quite a bit lower.
You are wrong, mate... this is a different architecture, and IPC gains are achieved a little differently here. x86 has a big problem with decoders: the best x86 chips have 4 decoders because of their complexity and the variable instruction length. The M1 is good in many ways, but its strongest point is its 8 instruction decoders. They are able to double the decoders because the instruction length is fixed at 4 bytes. It's easy for them to cut the code into 4-byte chunks and feed the decoders with that data. This is not replicable on x86.
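A rough sketch of the difference being described, in Python. The variable-length encoding here is a made-up toy (the first byte of each instruction holds its total length), not real x86, and real front ends use further techniques that this does not model. It only shows why fixed-width boundaries are all known up front, while variable-length boundaries have to be found one after another.

```python
from typing import Callable, List


def split_fixed(code: bytes, width: int = 4) -> List[bytes]:
    """Fixed-width ISA: instruction i starts at offset i * width,
    so every slice is independent and could go to a separate decoder."""
    if len(code) % width != 0:
        raise ValueError("code length is not a multiple of the width")
    return [code[i:i + width] for i in range(0, len(code), width)]


def split_variable(code: bytes,
                   length_of: Callable[[bytes, int], int]) -> List[bytes]:
    """Variable-length ISA: the start of the next instruction is only
    known once the length of the current one has been determined."""
    instructions = []
    pos = 0
    while pos < len(code):
        n = length_of(code, pos)
        if n <= 0 or pos + n > len(code):
            raise ValueError(f"bad instruction length at offset {pos}")
        instructions.append(code[pos:pos + n])
        pos += n  # serial dependency on the previous instruction
    return instructions


def toy_length(code: bytes, pos: int) -> int:
    """Hypothetical toy encoding: first byte = total instruction length."""
    return code[pos]


fixed = split_fixed(bytes(range(8)))
variable = split_variable(bytes([2, 0xAA, 1, 3, 0xBB, 0xCC]), toy_length)
print(fixed)
print(variable)
```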
#11
TheLostSwede
Dredi
You are incorrect in your assumption. The M1 performs as well as intel’s fastest in most single threaded tasks even without the accelerators. Read the anandtech article about it: www.anandtech.com/show/16252/mac-mini-apple-m1-tested
What the accelerators do enable is truly excellent performance per watt in select use cases like watching videos or video calls, or encoding stuff.
It's not an assumption dude, it's a fact. Yes, Apple does well in some of the SPEC benchmarks, but so do other ARM based processors, which you can find in other Anandtech articles, such as this one www.anandtech.com/show/16315/the-ampere-altra-review/5
Also keep in mind that Apple owns the compilers and the OS as well, so they have a different level of system wide control that no other hardware company has today. This allows them to squeeze out extra performance that the competition can not.
There are barely any benchmarks to test the CPU with yet, so let's wait until there are some proper benchmarks out.
I'm not saying Apple made a bad chip, just that you're misunderstanding what IPC means.
Mathragh
Okay, but where would you draw the line then? Isn't something like accelerating AES cryptography workloads, or AVX-something, or that special integer operation making Smart Access Memory worthwhile in the newest Zen3 cores just the same thing?
I'm not saying it's wrong, I'm saying it's not IPC, as in instructions per clock.
Accelerators are fine, I mean, they are there to handle the things that the general purpose CPU cores aren't good at.
It's just that people need to distinguish between overall system performance and IPC, as the two are not directly related these days.
ncrs
So you're saying Apple has magic technology that makes general purpose code run on fixed-function hardware accelerators? Or did they tune their chip specifically for GeekBench? ;)
In this case, I would simply say that all Apple hardware gets preferential GeekBench scores. Also, it's a pretty crap benchmark.
#12
TumbleGeorge
AleksandarK
extremely competitive
I like it when "extremely" comes before "competitive", as long as it is not also used in "at extremely expensive prices for both competitors' products" in the same sentence.
#13
Dredi
TheLostSwede
It's not an assumption dude, it's a fact.
No it’s not. Just paste your proof for your ”fact” here if it is as you say.
TheLostSwede
There are barely any benchmarks to test the CPU with yet, so let's wait until there are some proper benchmarks out.
Cinebench, SPEC, geekbench? Or are you stating that for example SPEC is a bad test for gauging general purpose computational power? Why is that? You can compile and run anything on the new systems and the performance is great for almost anything.
TheLostSwede
I'm not saying Apple made a bad chip, just that you're misunderstanding what IPC means.
No, you just don’t seem to understand what the accelerators inside M1 are capable of, and what part of the performance comes from the (possibly) highest IPC processor core in a consumer product to date.
#14
TheLostSwede
Dredi
No it’s not. Just paste your proof for your ”fact” here if it is as you say.

Cinebench, SPEC, geekbench? Or are you stating that for example SPEC is a bad test for gauging general purpose computational power? Why is that? You can compile and run anything on the new systems and the performance is great for almost anything.

No, you just don’t seem to understand what the accelerators inside M1 are capable of, and what part of the performance comes from the (possibly) highest IPC processor core in a consumer product to date.
Seriously, I don't think you can read.
IPC is NOT what you think it is.
#15
Dredi
TheLostSwede
Seriously, I don't think you can read.
IPC is NOT what you think it is.
IPC is instructions per clock. If the performance of the new M1 in some application or benchmark, say darktable or SPEC, is the same as on the latest Intel Tiger Lake processor, the number of instructions in the executables is similar, and the clock speed of the M1 is lower than that of the Tiger Lake processor, then the IPC is higher on the M1 than on Tiger Lake.

What am I missing here?

edit: for example take these results:

The clock speed of the Ryzen 5950X in this test is somewhere between 4.5 and 5 GHz. The M1 runs at around 3.2 GHz. Assuming that the compiled executables for SPEC contain around the same number of instructions for both architectures, the IPC of the M1 is around 45% higher than that of the Ryzen 5950X for single-threaded use.

And to clarify, the above test makes no use of fixed function accelerators.
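The arithmetic behind that estimate can be sketched as follows. Run time is roughly instructions / (IPC × clock), so the IPC ratio follows from the performance ratio, the clock ratio and the instruction-count ratio. The benchmark scores referred to are not reproduced in the text, so equal single-threaded performance is assumed here purely for illustration; `relative_ipc` and its parameter names are made up for this sketch.

```python
def relative_ipc(perf_ratio, clock_a_ghz, clock_b_ghz, instr_ratio=1.0):
    """Return IPC_A / IPC_B.

    perf_ratio  -- performance of A divided by performance of B
                   (performance = 1 / run time on the same workload)
    instr_ratio -- instructions executed on A divided by those on B;
                   1.0 is the "same amount of instructions" assumption
    """
    # perf = IPC * clock / instructions, solved for the IPC ratio
    return perf_ratio * instr_ratio * clock_b_ghz / clock_a_ghz


M1_CLOCK_GHZ = 3.2
ASSUMED_PERF_RATIO = 1.0  # assumption: equal scores, not a measured value

for ryzen_clock_ghz in (4.5, 5.0):
    ratio = relative_ipc(ASSUMED_PERF_RATIO, M1_CLOCK_GHZ, ryzen_clock_ghz)
    print(f"Ryzen at {ryzen_clock_ghz} GHz: M1 IPC is {ratio - 1:.0%} higher")
```

With equal scores this gives roughly 41% to 56%, which brackets the 45% figure. Changing `instr_ratio` away from 1.0 shows how much the result depends on both binaries executing a similar number of instructions.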
#16
DemonicRyzen666
IPC fluctuates with each instruction set
CPU A can execute X amount of it while running SSE code Y,
but Y on CPU B is returned more quickly.
Both CPU A and B have the same IPC because the instructions per clock are the same; however, the return is not. Efficiencies in designs can return more.

If you took all the SSE/AVX/AES/FMA3 units and changed them so that each ran perfectly on a single core of its own, you'd have an x86 CPU with about 14 cores that's great at single-thread for a given program but junk at multi-thread, with a large increase in silicon area.

Sometimes being good at one thing makes you bad at others.
#17
Mathragh
TheLostSwede
Seriously, I don't think you can read.
IPC is NOT what you think it is.
Generally in the (review) industry, IPC is used roughly to mean "average amount of work done per clock cycle", at least AFAIK. It is in that spirit that I was using the term, and I assumed you were as well. If you want to limit the use of the term "IPC" to just the actual instructions a CPU core can decode and process per clock on average, then that's fine with me :) Not sure which use is best for the discussion at hand, however.

Whichever way you look at it, seeing the M1 run software that's not even compiled for the architecture, and doing it this quickly, shows to me that increasing the processing per clock at least isn't an impossibility. x86 makers might need to further virtualize their decoding hardware/stack to reach that state, however.

PS: Regarding the single-core performance, decoders and instructions, does your view somewhat relate to this story I just found? Exclusive: Why Apple M1 Single "Core" Comparisons Are Fundamentally Flawed (With Benchmarks) (wccftech.com)
#18
Dredi
DemonicRyzen666
IPC fluctuates with each instruction set
Yes! And ARM64 is often said to be a ”lighter” instruction set than x86-64, meaning that the M1 IPC is likely going to be even higher as more instructions are needed for the same algorithms than what one might need on x86-64.
#19
Vayra86
ncrs
So you're saying Apple has magic technology that makes general purpose code run on fixed-function hardware accelerators? Or did they tune their chip specifically for GeekBench? ;)
Yes, or rather, Geekbench suits the new Apple arch surprisingly well.

This is why Geekbench is a POS. Performance didn't suddenly appear out of nowhere, and people don't differentiate between Geekbench VERSIONS even though the app changes all the time.
#20
ncrs
Vayra86
Yes, or rather, Geekbench suits the new Apple arch surprisingly well.

This is why Geekbench is a POS. Performance didn't suddenly appear out of nowhere, and people don't differentiate between Geekbench VERSIONS even though the app changes all the time.
True, but it's not only Geekbench that's showing "abnormal" results for M1. Both Anandtech and Phoronix show very good performance even when running in Rosetta mode.
#21
Turmania
I want the Ryzen 7 5800X successor to hit at least 5 GHz on all 8 cores out of the box, at the same TDP. Basically, we need to pass this number. I know the number does not mean much today, but it is about time!
#22
JB_Gamer
TheLostSwede
It's not an assumption dude, it's a fact. Yes, Apple does well in some of the SPEC benchmarks, but so do other ARM based processors, which you can find in other Anandtech articles, such as this one www.anandtech.com/show/16315/the-ampere-altra-review/5
Also keep in mind that Apple owns the compilers and the OS as well, so they have a different level of system wide control that no other hardware company has today. This allows them to squeeze out extra performance that the competition can not.
There are barely any benchmarks to test the CPU with yet, so let's wait until there are some proper benchmarks out.
I'm not saying Apple made a bad chip, just that you're misunderstanding what IPC means.


I'm not saying it's wrong, I'm saying it's not IPC, as in instructions per clock.
Accelerators are fine, I mean, they are there to handle the things that the general purpose CPU cores aren't good at.
It's just that people need to distinguish between overall system performance and IPC, as the two are not directly related these days.


In this case, I would simply say that all Apple hardware gets preferential GeekBench scores. Also, it's a pretty crap benchmark.
OT: There are so many opinions about benchmarking; if not Geekbench, which benchmark do you consider more/most relevant?
#23
ratirt
JB_Gamer
OT: There are so many opinions about benchmarking; if not Geekbench, which benchmark do you consider more/most relevant?
Opinions are like asses: everyone has their own, but the facts say what the truth is. An opinion doesn't have to rest on a fact, but a fact can't rest on an opinion, only on the truth.
#24
Vya Domus
ncrs
Or did they tune their chip specifically for GeekBench?
Probably there is some of that, however Geekbench itself is without question tuned for their chips.
HABO
x86 has a big problem with decoders
No it doesn't, show me an example where an x86 processor is decode bound.
Dredi
Assuming that the compiled executables for SPEC contain around the same number of instructions for both architectures, the IPC of the M1 is around 45% higher than that of the Ryzen 5950X for single-threaded use.
There isn't a single reason to believe they would generate the same amount of instructions. And that wouldn't even mean anything, the problem is with the optimizations that the compilers themselves apply.
DemonicRyzen666
IPC fluctuates with each instruction set
IPC fluctuates according to architecture, in fact it even fluctuates within the same architecture. A processor never has a constant IPC, that's quite literally impossible.

You can come up with an "average IPC" but that wouldn't mean much either.
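A small sketch of why a single "average IPC" hides a lot. The phase names and counter values below are invented for illustration; on real hardware they would come from performance counters (retired instructions and core cycles).

```python
# Hypothetical (phase, retired instructions, core cycles) samples.
PHASES = [
    ("tight compute loop", 8_000_000, 2_000_000),
    ("pointer chasing", 1_000_000, 4_000_000),
    ("mixed code", 3_000_000, 2_000_000),
]


def phase_ipc(phases):
    """IPC of each phase: instructions retired divided by cycles spent."""
    return {name: instr / cycles for name, instr, cycles in phases}


def overall_ipc(phases):
    """Whole-run IPC: total instructions over total cycles."""
    total_instr = sum(instr for _, instr, _ in phases)
    total_cycles = sum(cycles for _, _, cycles in phases)
    return total_instr / total_cycles


per_phase = phase_ipc(PHASES)
naive_mean = sum(per_phase.values()) / len(per_phase)

for name, ipc in per_phase.items():
    print(f"{name:>20}: IPC {ipc:.2f}")
print(f"overall IPC:            {overall_ipc(PHASES):.2f}")
print(f"mean of per-phase IPCs: {naive_mean:.2f}")
```

In this made-up run the same core ranges from 0.25 to 4.0 IPC depending on the phase, and the whole-run figure differs from the plain mean of the per-phase values, so any quoted average depends heavily on the workload mix.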
#25
Frick
Fishfaced Nincompoop
Mathragh
One only has to take a look at the recently released M1 chip by Apple to see that big IPC improvements are apparently still quite doable. The M1 can keep up with or even exceed Intel's and AMD's best at single-threaded performance, while the M1's clock speed is quite a bit lower.
Apples to oranges, literally.