
Intel Demos 6th Gen Xeon Scalable CPUs, Core Counts Leaked

T0@st

News Editor
Staff member
Joined
Mar 7, 2023
Messages
2,077 (4.76/day)
Location
South East, UK
Intel demonstrated its advanced packaging prowess this week, and attendees were able to get an early-ish look at Team Blue's sixth Generation Xeon Scalable "Granite Rapids" processors. This multi-tile datacenter-oriented CPU family is projected to hit the market within the first half of 2024, but reports suggest that key enterprise clients have recently received evaluation samples. Coincidentally, renowned hardware leaker Yuuki_AnS has managed to source more information from industry insiders. This follows their complete blowout of more mainstream Raptor Lake Refresh desktop SKUs.

The leaked slide presents a bunch of evaluation sample "Granite Rapids-SP" XCC and "Sierra Forest" HCC SKUs. Intel has not officially published core counts for these upcoming "Avenue City" platform product lines. According to their official marketing blurb: "Intel Xeon processors with P-cores (Granite Rapids) are optimized to deliver the lowest total cost of ownership (TCO) for high-core performance-sensitive workloads and general-purpose compute workloads. Today, Xeon enables better AI performance than any other CPU, and Granite Rapids will further enhance AI performance. Built-in accelerators give an additional boost to targeted workloads for even greater performance and efficiency."




The more frugal family is described as: "Intel Xeon processors with E-cores (Sierra Forest) are enhanced to deliver density-optimized compute in the most power-efficient manner. Xeon processors with E-cores provide best-in-class power-performance density, offering distinct advantages for cloud-native and hyperscale workloads."

The leaked information suggests that the listed "Granite Rapids-SP" ES1 units max out at 56 cores along with 288 MB of cache, with two chiplets and an eight-channel memory subsystem. It is possible that each tile carries either 28 cores, or 30 cores with two per chiplet disabled for redundancy purposes. Final production processors could up the ante to around 84 to 90 cores. A Tom's Hardware analysis of Yuuki_AnS's slide proposes that: "the compute chiplets are made on Intel 3 (3 nm-class) process technology, whereas HSIO chiplets are fabbed on a 7 nm-class production node, which is a proven technology and is considered to be optimal for modern I/O chiplets in terms of performance and costs."
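
The core-count arithmetic above can be checked with a rough sketch. The inputs (56 enabled cores, two chiplets, two spare cores per chiplet) come from the leak as reported; the idea that the 84 to 90 core figure corresponds to three such chiplets is only an assumption made here for illustration, and the function names are hypothetical.

```python
# Back-of-the-envelope tile arithmetic for the leaked "Granite Rapids-SP" ES1 figures.
# From the post: 56 enabled cores on two compute chiplets, possibly with two
# spare cores per chiplet disabled for redundancy.
# ASSUMPTION (not stated in the leak): larger parts use three identical chiplets.

def cores_per_tile(enabled_total: int, tiles: int, spares_per_tile: int) -> tuple[int, int]:
    """Return (enabled, physical) core counts for a single tile."""
    enabled, remainder = divmod(enabled_total, tiles)
    if remainder:
        raise ValueError("enabled cores do not divide evenly across tiles")
    return enabled, enabled + spares_per_tile


def projected_range(enabled: int, physical: int, tiles: int) -> tuple[int, int]:
    """Return (all spares disabled, all cores enabled) totals for a tile count."""
    return enabled * tiles, physical * tiles


enabled, physical = cores_per_tile(enabled_total=56, tiles=2, spares_per_tile=2)
low, high = projected_range(enabled, physical, tiles=3)  # three tiles is an assumption
print(f"per tile: {enabled} enabled / {physical} physical")
print(f"three-tile range: {low} to {high} cores")
```

Under that three-chiplet assumption the two ends of the range line up with the 28-core and 30-core readings of a single tile.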

 
Joined
Mar 13, 2021
Messages
398 (0.34/day)
Processor AMD 7600x
Motherboard Asrock x670e Steel Legend
Cooling Silver Arrow Extreme IBe Rev B with 2x 120 Gentle Typhoons
Memory 4x16Gb Patriot Viper Non RGB @ 6000 30-36-36-36-40
Video Card(s) XFX 6950XT MERC 319
Storage 2x Crucial P5 Plus 1Tb NVME
Display(s) 3x Dell Ultrasharp U2414h
Case Coolermaster Stacker 832
Power Supply Thermaltake Toughpower PF3 850 watt
Mouse Logitech G502 (OG)
Keyboard Logitech G512
Interesting to see the shift away from SMT due to all the vulnerabilities that have been discovered in recent years.
 

Wye

Joined
Feb 15, 2023
Messages
199 (0.44/day)
It looks like pretty low frequency: min 1.2-1.6, max 2.4-2.7 GHz.
Single threaded flows would have low performance.
 
Joined
Aug 30, 2006
Messages
7,199 (1.11/day)
System Name ICE-QUAD // ICE-CRUNCH
Processor Q6600 // 2x Xeon 5472
Memory 2GB DDR // 8GB FB-DIMM
Video Card(s) HD3850-AGP // FireGL 3400
Display(s) 2 x Samsung 204Ts = 3200x1200
Audio Device(s) Audigy 2
Software Windows Server 2003 R2 as a Workstation now migrated to W10 with regrets.
I am not happy to see TDP as high as 350 W.
 
Joined
Mar 18, 2023
Messages
610 (1.44/day)
System Name Never trust a socket with less than 2000 pins
It looks like pretty low frequency: min 1.2-1.6, max 2.4-2.7 GHz.
Single threaded flows would have low performance.

You are supposed to have applications that these Xeons contain special accelerators for.

(I don't, except for the new zstd compression.)
 

Toothless

Tech, Games, and TPU!
Supporter
Joined
Mar 26, 2014
Messages
9,312 (2.51/day)
Location
Washington, USA
System Name Veral
Processor 5950x
Motherboard MSI MEG x570 Ace
Cooling Corsair H150i RGB Elite
Memory 4x16GB G.Skill TridentZ
Video Card(s) Powercolor 7900XTX Red Devil
Storage Crucial P5 Plus 1TB, Samsung 980 1TB, Teamgroup MP34 4TB
Display(s) Acer Nitro XZ342CK Pbmiiphx + 2x AOC 2425W
Case Fractal Design Meshify Lite 2
Audio Device(s) Blue Yeti + SteelSeries Arctis 5 / Samsung HW-T550
Power Supply Corsair HX850
Mouse Corsair Nightsword
Keyboard Corsair K55
VR HMD HP Reverb G2
Software Windows 11 Professional
Benchmark Scores PEBCAK
It looks like pretty low frequency: min 1.2-1.6, max 2.4-2.7 GHz.
Single threaded flows would have low performance.
Built for multi-core workloads. Cores used>speed
 
Joined
May 3, 2018
Messages
2,351 (1.07/day)
Interesting to see the shift away from SMT due to all the vulnerabilities that have been discovered in recent years.
That's not why they are doing this. They cannot get SMT to work with the new chiplet designs. It won't feature in Arrow Lake either, but they say it may come back later, post Lunar Lake. The new architecture, chiplets, substrate, etc. are already proving enough of a headache for them. They are still claiming Arrow Lake will beat Raptor Lake in multithreaded apps despite losing SMT, but I'm not sure if that's only for the halo 8P+32E variant or all variants like-for-like.
 
Joined
Jun 29, 2018
Messages
467 (0.22/day)
It looks like pretty low frequency: min 1.2-1.6, max 2.4-2.7 GHz.
Single threaded flows would have low performance.
I am pretty certain those specific clocks are due to the "ES1" suggesting an engineering sample.
I would be very surprised to see those clocks in the final product, especially since AMD can do way better:
This was even in a less-than-stellar cooling server where the temps were in the 75C range, yet all 256 threads were loaded and all 128 cores sat at 3.1GHz.
This is a next-next product manufactured on a completely new node, so it's expected that early samples have lower clocks.

Interesting to see the shift away from SMT due to all the vulnerabilities that have been discovered in recent years.
This is only for the E-core based Xeons while P-core ones will have SMT.
That's not why they are doing this. They cannot get SMT to work with the new chiplet designs. It won't feature in Arrow Lake either, but they say it may come back later, post Lunar Lake. The new architecture, chiplets, substrate, etc. are already proving enough of a headache for them. They are still claiming Arrow Lake will beat Raptor Lake in multithreaded apps despite losing SMT, but I'm not sure if that's only for the halo 8P+32E variant or all variants like-for-like.
I don't think it's related to having chiplets since the current Sapphire Rapids Xeons are also chiplet-based and feature SMT. The SMT-less E-core Xeons are targeted towards a specific segment, mostly cloud computing, for which SMT is not a desirable feature. AMD also has the EPYC 9754S with factory-disabled SMT, which I find unusual given that you can already disable SMT in BIOS. Not that it matters much since cloud vendors get specific off-market SKUs anyway.
 
Joined
Jan 3, 2021
Messages
2,751 (2.24/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
I would be very surprised to see those clocks in the final product, especially since AMD can do way better
Not a fair comparison. At least in current products, AMD's Zen 4c core is 2/3 the size of a Zen 4 core while Intel's E core is 1/3 the size of a P core.
 
Joined
Jun 29, 2018
Messages
467 (0.22/day)
Not a fair comparison. At least in current products, AMD's Zen 4c core is 2/3 the size of a Zen 4 core while Intel's E core is 1/3 the size of a P core.
I'm not sure what you're getting at. Zen 4c is not directly comparable to Intel E-cores either since it retains all the features of Zen 4. It's a space-optimized version of the same architecture while E-cores employ a completely different design.
Both Intel 4th gen Xeons and Zen 4 EPYCs have higher clocks than what this ES1 presents.
My point was that this is just an engineering sample so the clocks shouldn't be taken as final. I probably shouldn't have compared it to AMD but to Intel's current gen, however that was the only solid source of raw clocks I remembered at the time. It's not something tested often.
 
Joined
Jan 3, 2021
Messages
2,751 (2.24/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
I'm not sure what you're getting at. Zen 4c is not directly comparable to Intel E-cores either since it retains all the features of Zen 4. It's a space-optimized version of the same architecture while E-cores employ a completely different design.
Both Intel 4th gen Xeons and Zen 4 EPYCs have higher clocks than what this ES1 presents.
My point was that this is just an engineering sample so the clocks shouldn't be taken as final. I probably shouldn't have compared it to AMD but to Intel's current gen, however that was the only solid source of raw clocks I remembered at the time. It's not something tested often.
Both 4c and E are optimised for space, and the result of this optimisation (performance per mm² gained vs large cores) will probably be similar between AMD's and Intel's approach. That was the whole point of my comment.
 
Joined
Jun 29, 2018
Messages
467 (0.22/day)
Both 4c and E are optimised for space, and the result of this optimisation (performance per mm² gained vs large cores) will probably be similar between AMD's and Intel's approach. That was the whole point of my comment.
It's not going to be similar because E-cores are not just smaller P-cores. They do not have the same microarchitecture. E-cores are missing both AVX-512 and AMX when compared to 4th gen Xeon Scalable. If your workload can utilize AMX then even going back to AVX-512 is going to decrease performance dramatically. Going further back to AVX2, which is what E-core Sierra Forest will have, yields an even bigger loss of performance. The cache structure is also significantly different between them. There are more differences in core design as well.
On the other hand, Zen 4c is the same core as Zen 4 with the same capabilities, just with less cache, lower frequency, and a slightly different structure due to having 2 CCXs on the CCD.
Your metric of perf per mm² gained can be easily calculated for Zen 4c, but for E-cores it's significantly harder due to its differences. It might work for workloads not utilizing anything above AVX2, but even then the cache structure complicates MT measurements.
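
For reference, the perf per mm² comparison being argued about can be written down as a small sketch. The area ratios are the ones quoted earlier in the thread (Zen 4c at 2/3 of Zen 4, E-core at 1/3 of a P-core); no performance ratios are given anywhere in the thread, so the code only reports the break-even point, and the function name is made up for illustration. It does not capture the workload dependence (AVX-512, AMX, cache structure) described above.

```python
from fractions import Fraction

def density_gain(perf_ratio, area_ratio):
    """Perf per mm² of the small core relative to the big core.

    perf_ratio: small-core throughput / big-core throughput (workload dependent)
    area_ratio: small-core area / big-core area
    """
    return Fraction(perf_ratio) / Fraction(area_ratio)


# Area ratios as quoted in the thread; treat them as rough figures.
AREA_RATIO = {
    "Zen 4c vs Zen 4": Fraction(2, 3),
    "E-core vs P-core": Fraction(1, 3),
}

for name, area in AREA_RATIO.items():
    # The gain is exactly 1 when the perf ratio equals the area ratio,
    # so the area ratio is also the break-even perf ratio.
    assert density_gain(area, area) == 1
    print(f"{name}: wins on perf/mm² if it keeps more than {area} of big-core throughput")
```

The break-even is easy to state for both designs; the hard part, as noted above, is that the E-core perf ratio swings widely depending on whether the workload uses AVX-512 or AMX.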
 