Friday, December 8th 2023

Intel "Sierra Forest" Xeon System Surfaces, Fails in Comparison to AMD Bergamo

Intel's upcoming Sierra Forest Xeon server chip has debuted on Geekbench 6, giving a first look at its multi-core performance. Slated for release in the first half of 2024, Sierra Forest is equipped with up to 288 Efficiency cores, positioning it to compete with AMD's Zen 4c Bergamo server CPUs and Arm-based server chips like those from Ampere for the favor of cloud service providers (CSPs). In the Geekbench 6 benchmark, a dual-socket configuration featuring two 144-core Sierra Forest CPUs was tested. The benchmark revealed a notable multi-core score of 7,770, surpassing most dual-socket systems powered by Intel's high-end Xeon Platinum 8480+, which typically score between 6,500 and 7,500. However, Sierra Forest's single-core score of 855 points was considerably lower, not even reaching half that of the 8480+, which manages 1,897 points.

The difference in single-core performance is a matter of design choice, as Sierra Forest uses Crestmont-derived Sierra Glen E-cores, which are more power- and area-efficient than the Golden Cove P-cores in the Sapphire Rapids-based 8480+. This design choice is particularly advantageous for server environments where high core counts are crucial, as CSPs usually partition their instances by the number of CPU cores. However, compared to AMD's Bergamo CPUs, which use Zen 4c cores, Sierra Forest lacks raw computing performance, especially in multi-core. Sierra Forest lacks Hyper-Threading, while Bergamo offers SMT with 256 threads on the 128-core SKU. Set against the Geekbench 6 scores of AMD's Bergamo-based EPYC 9754, the Sierra Forest results look a lot less impressive. Bergamo scored 1,597 points in single-core, almost double that of Sierra Forest, and 16,455 points in the multi-core benchmark, which is more than double. This is a significant advantage of the Zen 4c core, which cuts down on caches instead of being an entirely different core, as is the case with Intel's P- and E-cores. However, these are just preliminary numbers; we must wait for real-world benchmarks to see the actual performance.
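As a quick sanity check on the "almost double" and "more than double" comparisons, the ratios can be recomputed from the scores quoted above. This is only a minimal sketch: the variable and function names are illustrative, and the only inputs are the leaked Geekbench 6 numbers cited in this article, which remain preliminary.

```python
# Geekbench 6 scores exactly as quoted in the article text (preliminary leaks).
SIERRA_FOREST_2S = {"single": 855, "multi": 7770}   # two 144-core CPUs
XEON_8480_PLUS = {"single": 1897}                   # multi-core is quoted only as a range
EPYC_9754 = {"single": 1597, "multi": 16455}        # AMD Bergamo


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator as a plain float."""
    return numerator / denominator


# Sierra Forest single-core relative to the Xeon Platinum 8480+.
sf_vs_8480_single = ratio(SIERRA_FOREST_2S["single"], XEON_8480_PLUS["single"])

# Bergamo relative to the dual-socket Sierra Forest system.
bergamo_vs_sf_single = ratio(EPYC_9754["single"], SIERRA_FOREST_2S["single"])
bergamo_vs_sf_multi = ratio(EPYC_9754["multi"], SIERRA_FOREST_2S["multi"])

print(f"Sierra Forest vs 8480+ (single-core): {sf_vs_8480_single:.2f}x")
print(f"Bergamo vs Sierra Forest (single-core): {bergamo_vs_sf_single:.2f}x")
print(f"Bergamo vs Sierra Forest (multi-core): {bergamo_vs_sf_multi:.2f}x")
```

The ratios only restate the quoted scores; they say nothing about power draw or perf/W, which the leak does not cover.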
Sources: BenchLeaks, Tom's Hardware

76 Comments on Intel "Sierra Forest" Xeon System Surfaces, Fails in Comparison to AMD Bergamo

#1
Tek-Check
This SKU is going to show finally how much worse e cores are.
#2
Haile Selassie
"This is a significant advantage of the Zen4c core, which cuts down on caches instead of being an entirely different core,"
I think that's a typo, isn't it? Should be lower clocks only, not cut down cache.
#3
user556
Well, the same L3 cache is divided between twice the cores, so arguably less cache per core, but yeah, lower design frequency is the big difference. Comes from a smaller clock tree, less gating, fewer staging registers, possibly smaller SRAM cells too. Reduced design frequency allows reduced size, which in turn compounds because wire lengths are also reduced.
#5
bug
Cores designed to be slower are slower than cores designed to be fast... It's like the poster has no understanding of what he's writing about.
This is about server SKUs, individual performance is irrelevant, all that matters is perf/W.
#6
Crackong
"Bergamo scored 1,597 points in single-core, almost double that of Sierra Forest, and 16,455 points in the multi-core benchmarks, which is more than double."
So much for Intel snake oil......
#7
lemonadesoda
But we know that Geekbench 6 is very poor at assessing multicore cpus especially server CPUs and server workloads. Try Geekbench 5 instead. Or a different tool altogether. We’ve had this discussion many times before.
#8
fancucker
"Fails" is such a wreckless and irresponsible statement from a journalist of TPU's calibre. These products are designed for specific applications and Intel's tertiary services and easier integration make them a more compelling option than Bergamo.
#9
user556
Onasi wrote: "Clocks are lower, true, but that’s par for the course for server hardware."
Ah, but the base Zen4 cores are designed for higher clock rates. Whether they are pushed that fast or not doesn't change the design.

Lowering the design frequency just for the Zen4c cores allows the synthesis software to optimise those cores to be smaller, purely because of the design frequency.
#10
tabascosauz
bug wrote: "Cores designed to be slower are slower than cores designed to be fast... It's like the poster has no understanding of what he's writing about.
This is about server SKUs, individual performance is irrelevant, all that matters is perf/W."
I don't disagree, but just guessing here before real data - how sure are you that Intel would pull out a win on perf/W (or even a tie)?

AMD chiplet design weakness is uncore power overhead in relation to core count. IFOP power hasn't really improved monumentally in the server space, but Zen 4c doubling core count for a given #CCD count is a huge point in Bergamo's favour for perf/W.

You could say that E-cores are pushed too hard in Core I, and are best in their efficiency band running Xeon clocks, but the same goes for Bergamo. Server Zen 4c is also close to its happy place.
#11
Onasi
user556 wrote: "Ah, but the base Zen4 cores are designed for higher clock rates. Whether they are pushed that fast or not doesn't change the design.

Lowering the design frequency just for the Zen4c cores allows the synthesis software to optimise those cores to be smaller, purely because of the design frequency."
Are they, though? Optimized for lower clocks, I mean. New Threadripper 7995WX is Zen4C and that set a record for Cinebench on air cooling running 4.8 GHz on all cores. All 96 of them. And out of the box boost clock is also a respectable 5.1. So it doesn’t feel like the frequency is that much lower than regular Zen4.
I am not disagreeing with you, by the way, just saying that “lower” is relative in this case.
#12
user556
Onasi wrote: "Are they, though? Optimized for lower clocks, I mean. New Threadripper 7995WX is Zen4C and that set a record for Cinebench on air cooling running 4.8 GHz on all cores. All 96 of them. And out of the box boost clock is also a respectable 5.1. So it doesn’t feel like the frequency is that much lower than regular Zen4.
I am not disagreeing with you, by the way, just saying that “lower” is relative in this case."
Nooo, none of the Threadrippers are Zen4c. The biggest Zen4c is 128 cores, not 96 cores.
#13
Onasi
user556 wrote: "Nooo, none of the Threadrippers are Zen4c. The biggest Zen4c is 128 cores, not 96 cores."
I was genuinely sure that the biggest WX is a 4c part. Huh, guess not. Not sure why I was so convinced that it was. Mandela effect, I guess.
#14
bug
fancucker wrote: ""Fails" is such a reckless and irresponsible statement from a journalist of TPU's calibre. These products are designed for specific applications and Intel's tertiary services and easier integration make them a more compelling option than Bergamo."
I wonder if we could block editors the same way we can block annoying users :D
tabascosauz wrote: "I don't disagree, but just guessing here before real data - how sure are you that Intel would pull out a win on perf/W (or even a tie)?"
I am not sure, I was just saying that if the editor claims Sierra Forest fails, that is the metric he should have used to prove it.
#15
tabascosauz
Onasi wrote: "I was genuinely sure that the biggest WX is a 4c part. Huh, guess not. Not sure why I was so convinced that it was. Mandela effect, I guess."
N4 Zen 4c V-F curve starts climbing quite early compared to N4 Zen 4. Both of them fabbed on N5 and being their respective CCD variants obviously won't be the exact same, but I'm pretty confident 4.8GHz would be far out of reach for either APU or CCD incarnation of Zen 4c.

Whether that's because CCD 4c runs into heat density issues, is incapable of clocking higher due to physical constraints, or Vcore requirements become prohibitive, who knows.
bug wrote: "I am not sure, I was just saying if the editor claims Sierra Forest fails, that is the metric he should have used to prove it."
Fair
#16
_Flare
Yeah, it fails with regard to its Geekbench scores, which are kinda composite scores with a weighting whose value to the targeted cloud service providers is unknown to me.
The single-core score is no surprise to me.
Regarding the multi-core score, my opinion is that the inter-core capability of those E-cores is known to be kinda bad/slow, and the socket-to-socket penalty for this 144+144 core system is an additional limit vs one big Bergamo CPU.
Again, what the value of those numbers is to the target market, I dunno.

Regarding Zen4 to Zen4c, think of the "c" as standing for compactified: those cores are denser and thus have different electrical properties, like a lower sweet-spot clock speed.
Additionally, one Bergamo CCD houses two CCXs, each with 16MB L3 cache like Zen2 had, but with 8 cores per CCX where Zen2 only had 4 cores per CCX.
Bergamo looks to me like it has lower inter-CCX capability than normal Zen4,
because the IO-die link to a Zen4c CCD (two CCXs) has only the same performance as the link to a Zen4 CCD (one CCX).
#17
user556
Regarding wattage of the new Intel parts. I'd be surprised if they come in lower than AMD's parts, SP6 for the Zen4c parts = 225 W.
Compared to desktop ratings, Intel server parts do a lot better at sticking to the designated rating so using that would be comparing perf/W.

Agreed about Geekbench multicore scores petering out as the core count gets extreme. Although, the compared parts are both extreme so maybe still comparable.
#18
DavidC1
Tek-Check wrote: "This SKU is going to show finally how much worse e cores are."
Intel 3 started manufacturing in H2 of 2023, which we are still in and Sierra Forest is based on that. Judging based on such an early sample will result in wrong conclusions.

I wouldn't be surprised if the top 144 core version closes even the low-thread gap over Bergamo using higher frequencies than Bergamo.
#19
usiname
DavidC1 wrote: "Intel 3 started manufacturing in H2 of 2023, which we are still in and Sierra Forest is based on that. Judging based on such an early sample will result in wrong conclusions.

I wouldn't be surprised if the top 144 core version closes even the low-thread gap over Bergamo using higher frequencies than Bergamo."
You'd have to be very naive to believe that 144 Skylake-class cores without HT can come anywhere close to 128 Zen 4c cores with SMT.
#20
bug
user556 wrote: "Regarding wattage of the new Intel parts. I'd be surprised if they come in lower than AMD's parts, SP6 for the Zen4c parts = 225 W.
Compared to desktop ratings, Intel server parts do a lot better at sticking to the designated rating so using that would be comparing perf/W.

Agreed about Geekbench multicore scores petering out as the core count gets extreme. Although, the compared parts are both extreme so maybe still comparable."
I'm not even sure what goes into the Geekbench score. If it includes games, web browsers or office software, that wouldn't be very relevant for a server chip.
#21
Steevo
The amount of butthurt of Intel fanbois ITT is inspiring.
#22
Squared
Haile Selassie wrote: "I think that's a typo, isn't it? Should be lower clocks only, not cut down cache."
Actually, while Zen 4c in Phoenix 2 has the same L3 cache per core as we would expect from Zen 4, Bergamo does not have as much L3 cache per core as AMD's other server products.

After looking this up, Phoenix (Zen 4) maxes out at 16MB L3 / 8 cores, Phoenix 2 (Zen 4 + 4c) 16MB L3 / 2+4 cores, Genoa (Zen 4) 384MB / 96 cores (32MB / CCX), and Bergamo (Zen 4c) 256MB / 128 cores (16MB / CCX).
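The per-core comparison behind those figures can be worked out directly. The following is a rough sketch that uses only the L3 totals and core counts listed in this comment; the names `CONFIGS` and `l3_per_core` are made up for illustration.

```python
# (product, core type, total L3 in MB, core count) as listed in the comment above.
CONFIGS = [
    ("Phoenix", "Zen 4", 16, 8),
    ("Phoenix 2", "Zen 4 + 4c", 16, 2 + 4),
    ("Genoa", "Zen 4", 384, 96),
    ("Bergamo", "Zen 4c", 256, 128),
]


def l3_per_core(total_l3_mb: float, cores: int) -> float:
    """Return the L3 cache available per core, in MB."""
    return total_l3_mb / cores


# Map each product name to its L3-per-core figure.
per_core = {name: l3_per_core(l3, cores) for name, _, l3, cores in CONFIGS}

for name, mb in per_core.items():
    print(f"{name}: {mb:.2f} MB L3 per core")
```

By these numbers Bergamo ends up with half the L3 per core of Genoa, while matching Phoenix.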
#23
demu
lemonadesoda wrote: "But we know that Geekbench 6 is very poor at assessing multicore cpus especially server CPUs and server workloads. Try Geekbench 5 instead. Or a different tool altogether. We’ve had this discussion many times before."
Yeah,
Old man once said: 'If you can't beat 'em, change the benchmark'.
:D
#24
bug
Steevo wrote: "The amount of butthurt of Intel fanbois ITT is inspiring."
Yes, because asking for some reliable measurements equals butthurt these days :kookoo:
#25
user556
bug wrote: "If it includes games, web browsers or office software, that wouldn't be very relevant for a server chip."
Very little of the above. It lists what it does each step of the way.
GB6:
Running File Compression
Running Navigation
Running HTML5 Browser
Running PDF Renderer
Running Photo Library
Running Clang
Running Text Processing
Running Asset Compression
Running Object Detection
Running Background Blur
Running Horizon Detection
Running Object Remover
Running HDR
Running Photo Filter
Running Ray Tracer
Running Structure from Motion


GB5:
Running AES-XTS
Running Text Compression
Running Image Compression
Running Navigation
Running HTML5
Running SQLite
Running PDF Rendering
Running Text Rendering
Running Clang
Running Camera
Running N-Body Physics
Running Rigid Body Physics
Running Gaussian Blur
Running Face Detection
Running Horizon Detection
Running Image Inpainting
Running HDR
Running Ray Tracing
Running Structure from Motion
Running Speech Recognition
Running Machine Learning