Monday, October 5th 2015

AMD Zen Features Double the Per-core Number Crunching Machinery to Predecessor

AMD's "Zen" CPU micro-architecture has a design focus on significantly increasing per-core performance, particularly per-core number-crunching performance, according to a 3DCenter.org report. It sees a near doubling of the number of decoder, ALU, and floating-point units per core, compared to its predecessor. In essence, a Zen core is AMD's idea of "what if a Steamroller module of two cores was just one big core, and supported SMT instead."

In the micro-architectures following "Bulldozer," which debuted with the company's first FX-series socket AM3+ processors, and running up to "Excavator," which will debut with the company's "Carrizo" APUs, AMD's approach to CPU cores involved modules, which packed two physical cores, with a combination of dedicated and shared resources between them. It was intended to take Intel's Core 2 idea of combining two cores into an indivisible unit further.
AMD's approach was less than stellar, and was hit by implementation problems: software loaded the cores of a multi-module processor sequentially, which performed worse than loading one core per module first and then loading the additional cores across modules. AMD's workaround tricked software (particularly OS schedulers) into thinking that a "module" was a "core" which had two "threads" (e.g. an eight-core FX-8350 would be seen by software as a 4-core processor with 8 threads).
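
The scheduling problem described above can be sketched in a few lines of Python. This is only an illustrative model, not how any real OS scheduler is written: the 4-module, 8-core layout mirrors the FX-8350 example, while the function names and the numbering scheme (cores 2m and 2m+1 share module m) are assumptions made for the sketch.

```python
# Hypothetical model of a 4-module chip; cores 2m and 2m+1 share module m.
MODULES = 4
CORES_PER_MODULE = 2


def sequential_order(modules=MODULES, per_module=CORES_PER_MODULE):
    """Fill core 0, 1, 2, ... so both cores of a module load before the next module."""
    return list(range(modules * per_module))


def module_first_order(modules=MODULES, per_module=CORES_PER_MODULE):
    """Place one thread on every module first, then use each module's second core."""
    return [m * per_module + slot
            for slot in range(per_module)
            for m in range(modules)]


def modules_sharing(order, n_threads, per_module=CORES_PER_MODULE):
    """Count modules with more than one busy core after placing n_threads threads."""
    busy = {}
    for core in order[:n_threads]:
        module = core // per_module
        busy[module] = busy.get(module, 0) + 1
    return sum(1 for count in busy.values() if count > 1)


print(module_first_order())                      # [0, 2, 4, 6, 1, 3, 5, 7]
print(modules_sharing(sequential_order(), 4))    # 2 modules have both cores contending
print(modules_sharing(module_first_order(), 4))  # 0 modules have both cores contending
```

With four threads, the sequential order leaves two modules idle while the other two have both cores fighting over shared resources; the module-first order avoids any sharing until a fifth thread arrives.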

In AMD's latest approach with "Zen," the company did away with the barriers that separated two cores within a module. It's one big monolithic core, with four decoders (parts which tell the core what to do), four ALUs ("Bulldozer" had two per core), and four 128-bit wide floating-point units, clubbed in two 256-bit FMACs. This approach nearly doubles the per-core number-crunching muscle. AMD implemented an Intel-like SMT technology, which works much like Hyper-Threading. Source: 3DCenter.org

85 Comments on AMD Zen Features Double the Per-core Number Crunching Machinery to Predecessor

#1
NC37
Just hope AMD isn't going to try to charge a premium for it. Course if they'll finally have CPUs that will go toe to toe with Intel then I'm sure they will.
Posted on Reply
#2
bubbly1724
They better deliver this time or they won't have anything left. And the "what if a Steamroller module of two cores was just one big core, and supported SMT instead." sounds like reverse hyperthreading or something, which a lot of people were speculating.
Posted on Reply
#3
geon2k2
If they fail, they fail for good.
Apple with A9 just proved that ARM is indeed a solid competitor for Intel so there will be nobody to support AMD for competition's sake and they can just die in peace.

Considering though that Intel has brought nothing to the table since Sandy Bridge, they might have a chance. (Lower lithography gives better power and very slightly better performance, which will be null by the time Zen comes; CPU graphics is irrelevant for performance machines; and the rest of the performance increase over Sandy is mostly due to higher stock clocks.)
Posted on Reply
#5
hellowalkman
Zen seems to have success written all over it which is good news for everyone .. :)
Posted on Reply
#6
john_
I wonder how a Zen core will compare to a Thuban core. That way we will have a real idea about what performance increase we have from AMD after 5 years. Because Bulldozer was one or more steps backwards.
Posted on Reply
#7
hellowalkman
john_ said:
I wonder how a Zen core will compare to a Thuban core. That way we will have a real idea about what performance increase we have from AMD after 5 years. Because Bulldozer was one or more steps backwards.
Thuban IPC is in between Steamroller and Excavator I believe ..
Posted on Reply
#8
Assimilator
geon2k2 said:
Apple with A9 just proved that ARM is indeed a solid competitor for Intel
In the mobile space. Apple has no intention of competing with Intel on desktop, which is the whole point of AMD.
Posted on Reply
#9
Ebo
#6

1. Not really, the problem with Bulldozer was/is too long a pipeline to run 2 cycles at the same time.

2. They (AMD) didn't have more power than the i5-2500K, especially when that was OC'ed.

3. The industry didn't go the way AMD had chosen to focus on; just accept that Bulldozer actually was/is a fine server CPU for that environment at the time when it came out. It wasn't intended 110% for gaming. The faults the design had from the start were partially solved with the Vishera core, but that's too old now.

4. If the Zen design works, and offers better performance than I get from my system today, it will be changed in a heartbeat.
Posted on Reply
#10
lilhasselhoffer
Thuban was a 45 nm process. While not too bad for its day, AMD is working with the 14 nm process now, correct?

If Zen was just a shrunk down Thuban they'd be working with somewhere between 7 and 9 times as many transistors squashed into the same approximate space (yeah, not exactly accurate, but 90 nm between features and 28 nm is just a ballpark).
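
For reference, the ballpark above can be checked against the textbook square-law for a die shrink. This is a rough sketch under a loud assumption: that node names are literal feature sizes and density scales with the square of the linear shrink, which real processes do not follow exactly. The function name is made up for the example.

```python
def ideal_density_gain(old_nm, new_nm):
    """Idealized area scaling: density grows with the square of the linear shrink."""
    return (old_nm / new_nm) ** 2


# Assumption: node names (45 nm, 14 nm) are treated as literal feature sizes.
gain = ideal_density_gain(45, 14)
print(f"{gain:.1f}x")  # 10.3x
```

The idealized figure comes out a little above the 7 to 9 times quoted in the post, which fits the post's own caveat that the numbers are only a ballpark.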


What I'd compare Zen to is Sandy Bridge. Hear me out, because off hand that is a low bar. What I'd conjecture is needed is good overclocking, a great pricing, DDR4, SATA III, and an ejection of the iGPU theory. Points 1 and 2 are generally where AMD focuses, so we're good there. Points 3 and 4 are what AMD promised with the ejection of the AM3+ socket. The final point is AMD utilizing all of the die space they can to overcome R&D shortcomings. If AMD can release a desktop CPU that genuinely does all of that, I would gladly go to it rather than a similarly priced Intel offering. Everything since SB has been either a compromise in overclocking, a compromise in performance (FIVR, sigh), or a compromise in cost (DDR4 really isn't yet performing well enough to justify the upgrade cost).

Zen could be the first step in AMD getting back to work on good CPUs. It could also be too little too late. Let's wait and see, before passing judgement.


Edit:
I have made a mistake. As per TeNor's correction, the 12 nm process has been changed to a 14 nm process. Much obliged for the correction.
Posted on Reply
#11
micropage7
nice they work for performance per-core
im kinda sick of their many cores and high GHz that still cant challenge Intel processors
just make a mid range processor with better performance per-core and lower power consumption, i guess it would help them in the market much
Posted on Reply
#12
bug
Number crunching? That's a little suspect.
We already know AMD is using one FPU for every two CPU cores. I hope adding a FPU for each core is NOT the best feature Zen has to offer.
Posted on Reply
#13
TeNor
#11

As far as it can be known AMD will release Zen on 14nm (GloFo) or 16nm (TSMC) FinFET technology.

By the way you are right when you say you'd compare Zen to SB. If Zen reaches SB's performance level I would say well done!

Based on my own Cinebench R15 single thread results calculations, SB has app. 45-50% more IPC than Piledriver/Steamroller and ~30% more than K10. (See how bad is the Bulldozer family?) So reaching SB's performance level would be a great leap forward.
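
Taking the percentages above at face value, the implied K10 versus Piledriver gap falls out of a simple ratio. This is a sketch only: the inputs are the poster's own Cinebench R15 estimates, and the constant and function names are invented for the example.

```python
# Inputs are the post's estimates: Sandy Bridge IPC relative to each older design.
SB_OVER_PILEDRIVER = (1.45, 1.50)  # "45-50% more IPC than Piledriver/Steamroller"
SB_OVER_K10 = 1.30                 # "~30% more than K10"


def k10_over_piledriver(sb_over_pd, sb_over_k10=SB_OVER_K10):
    """Divide the two Sandy Bridge ratios to compare K10 with Piledriver directly."""
    return sb_over_pd / sb_over_k10


low, high = (k10_over_piledriver(r) for r in SB_OVER_PILEDRIVER)
print(f"K10 IPC is roughly {low:.2f}x to {high:.2f}x Piledriver")  # 1.12x to 1.15x
```

On those numbers K10 would sit about 12 to 15% ahead of Piledriver per clock, which is the gap the "how bad is the Bulldozer family" remark points at.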

Another question is that it'd be still behind Intel's actual performance level.
Posted on Reply
#14
Chaitanya
I will believe when I see reviews from independent authority.
Posted on Reply
#15
bpgt64
I have all the hope in the world for Zen/AMD, but I will definitely be waiting for a review. However, if Zen gives us a 16-core desktop processor that's within 80% of Haswell single-threaded performance, I'll be switching... Having a 16-core monster sounds awesome. Especially considering how Intel has relegated its 8+ cores to Servers/Xeons for the most part.
Posted on Reply
#16
geon2k2
Assimilator said:
In the mobile space. Apple has no intention of competing with Intel on desktop, which is the whole point of AMD.
Well they could if they want.
They have 2500 geekbench single thread score at 1.8 Ghz and in a very power restricted environment.

http://cdn.arstechnica.net/wp-content/uploads/2015/09/charts.0011.png

An i5 4440 at 3.1 has ~2900 in the same test.

http://browser.primatelabs.com/geekbench3/search?utf8=✓&q=i5+4440

And the FX8350 is around 2400 :)

http://browser.primatelabs.com/geekbench3/search?utf8=✓&q=fx+8350

They are definitely competitive and that is for sure desktop class CPU and if they could push ARM so far, I'm sure others will soon follow and there are big heavy names there: Qualcomm, Samsung, nVidia ...
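
A quick way to read the scores quoted above is to normalize them per GHz. This is a back-of-the-envelope sketch, not a benchmark methodology: it uses the nominal clocks given in the post and ignores turbo. The FX-8350 clock is not stated in the post, so its 4.0 GHz base clock is filled in as an assumption.

```python
# (Geekbench 3 single-thread score, clock in GHz) as quoted in the post.
SCORES = {
    "Apple A9": (2500, 1.8),
    "Core i5-4440": (2900, 3.1),
    "FX-8350": (2400, 4.0),  # assumption: 4.0 GHz base clock, not given in the post
}


def points_per_ghz(score, ghz):
    """Score divided by clock speed, a crude proxy for per-clock throughput."""
    return score / ghz


per_ghz = {name: round(points_per_ghz(s, g)) for name, (s, g) in SCORES.items()}
for name, value in per_ghz.items():
    print(f"{name}: {value} points/GHz")
# Apple A9: 1389 points/GHz
# Core i5-4440: 935 points/GHz
# FX-8350: 600 points/GHz
```

Per-clock numbers like these say nothing about how far each design can actually be clocked, so they support the "competitive" claim only loosely.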
Posted on Reply
#17
mastrdrver
Original source

It might be worth noting that Jim Keller worked with DEC in the late 90s when DEC first developed the idea of SMT.

It's believed that the processor that would have come out after the first one with SMT would have gone from 2 threads per core to 4. Some have suggested that one of the changes that will come to Zen+ (the successor to Zen) will make it so it's 4 threads per core.
Posted on Reply
#18
dj-electric
Did nobody asked:

If Zen is so promising, why did Keller leave after he finished the project?
Posted on Reply
#19
happita
Dj-ElectriC said:
Did nobody asked:

If Zen is so promising, why did Keller leave after he finished the project?
The question has been tackled 100 times. I'll make it 101... it's because he finished his job (contract) and now he has nothing else to do and on top of it AMD can't afford to keep him on for future projects it seems.
Posted on Reply
#20
Random Murderer
The Anti-Midas
Dj-ElectriC said:
Did nobody asked:

If Zen is so promising, why did Keller leave after he finished the project?
Because that's what Keller does; he finishes an architecture and then jumps ship to work on something different. It's not just AMD he's done this to (though this makes the third time he's done it to AMD), he did it to Apple, as well as IBM IIRC.
Posted on Reply
#21
Vinska
Looks like it actually has FOUR TIMES the floating point units.
In Bulldozer and later, in full config, there are four FPUs of 2x128-bit units each. Each one can either act as one 256-bit / 2x128-bit unit for a single core, or get split into a single 128-bit unit per core on workloads where both cores access the shared FPU.
So, by having 4x128bit units per core, in a way, Zen has four times the floating-point units as bulldozer and later.
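
The arithmetic behind "double" versus "four times" can be laid out explicitly. This is a sketch based on the unit counts given in the article and in this post (a shared 2x128-bit FPU per Bulldozer module, four 128-bit units per Zen core); the names are hypothetical and nothing here is a measured result.

```python
BULLDOZER_FPU_BITS_PER_MODULE = 2 * 128  # one shared FPU with two 128-bit units
ZEN_FPU_BITS_PER_CORE = 4 * 128          # four 128-bit units, per the article


def bulldozer_bits_per_core(active_cores_in_module):
    """FP width one core gets: all 256 bits if alone, 128 bits if both cores use the FPU."""
    return BULLDOZER_FPU_BITS_PER_MODULE // active_cores_in_module


def zen_advantage(active_cores_in_module):
    """Ratio of Zen's per-core FP width to what a Bulldozer core gets."""
    return ZEN_FPU_BITS_PER_CORE / bulldozer_bits_per_core(active_cores_in_module)


print(zen_advantage(2))  # 4.0, when both Bulldozer cores contend for the FPU
print(zen_advantage(1))  # 2.0, when one Bulldozer core has the FPU to itself
```

So the headline's "double" matches the case where a Bulldozer core has the shared FPU to itself, and "four times" matches the case where both cores in the module are using it.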
Posted on Reply
#22
AVXX
If the Greenland 16-core comes to pass...

... and can clock at a respectable 3GHz+ without melting

... and is priced comparably to Intel's high end desktop / low end workstation offerings

... and packs 16 SMT cores with four SSE FMACs each

.. then AMD are well and truly back in the game. At least until such time as Cannonlake arrives.

(If Cannonlake on desktop has 6-8 cores with AVX512 FMACs, AMD's victory may be rather short lived...)
Posted on Reply
#23
lilhasselhoffer
TeNor said:
#11

As far as it can be known AMD will release Zen on 14nm (GloFo) or 16nm (TSMC) FinFET technology.

By the way you are right when you say you'd compare Zen to SB. If Zen reaches SB's performance level I would say well done!

Based on my own Cinebench R15 single thread results calculations, SB has app. 45-50% more IPC than Piledriver/Steamroller and ~30% more than K10. (See how bad is the Bulldozer family?) So reaching SB's performance level would be a great leap forward.

Another question is that it'd be still behind Intel's actual performance level.
Much obliged for the correction. Don't know why 12 nm popped into my head, but it was in error.

If Zen performs as well as SB, per core, it'll knock the ball out of the park. IB was a joke, because of that cheap thermal paste. Haswell brought better paste, but FIVR. Skylake looks to be a genuine upgrade, but DDR4 just isn't worth the extra cost.

By the time DDR4 drops in price, and speeds up, we'll see Zen. If it follows other AMD offerings, we'll have a competent PCH, a focus on being unlocked, and a boat load of cores. SB was locked to 4 cores. Even SB-e topped out at 6 cores. SB-e's PCH was terrible (speaking as an owner, it just didn't have enough of anything without expansion cards). SB overclocked very well, but it suffered the Intel lockdown unless you spent the tax on a K processor.

I'm expecting SB level performance, with more cores, running cooler. With that kind of a base, the overclocking will more than make up the ground for IB and Haswell. It still might be behind Skylake, but those extra cores would make all the difference.


Dj-ElectriC said:
Did nobody asked:

If Zen is so promising, why did Keller leave after he finished the project?
Every time.

Do you ask why the pediatrician isn't your doctor for life? Do you ask why the assembly line worker does only one job, and never actually finishes a car? Do you ask why everyone doesn't cross the finish line in a marathon? If the answer was yes to any of these you might need to seek medical help, due to damaged cognitive functions.

Keller left because his part was over, and he's functionally a mercenary. You hire him, set a goal, put money on the table, and negotiate the contract. Keller doesn't get involved in production, marketing, or support. He designs, then leaves. His career speaks to that tendency, and conflating his leaving with some issue is foolish.
Posted on Reply
#24
GorbazTheDragon
Title is misleading... It only doubles floating point, not integer performance.
Posted on Reply
#25
AVXX
Not entirely true, Gorbaz - SSE4.x & AVX2 both support vector integer computation, but the hardware that crunches it still gets referred to as FMACs. It depends on whether or not the integer code in question can be vectorized.
Posted on Reply