
Intel Gen11 Architecture and GT2 "Ice Lake" iGPU Detailed

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
46,276 (7.69/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
Intel "Ice Lake" will be the company's first major processor microarchitecture since "Skylake" (2015), and promises CPU IPC improvements. Intel has been reusing both its CPU cores and graphics architecture for four processor generations since "Skylake". Gen9 got a mid-life update to Gen9.5 with "Kaby Lake", adding new display interfaces and faster drivers. "Ice Lake" takes advantage of the new 10 nm silicon fabrication process to pack not just faster CPU cores (with increased IPC), but also the new Gen11 iGPU. Intel published a whitepaper detailing this architecture.

An illustration in the whitepaper points to the GT2 trim of Gen11. GT2 tends to be the most common variant of each Intel graphics architecture. Gen9.5 GT2, for example, is deployed across the board on 8th and 9th generation Core processors (with the exception of the "F" or "KF" SKUs). The illustration confirms that Intel will continue to use its Ring Bus interconnect on the mainstream implementation of "Ice Lake" processors, despite possible increases in CPU core counts. This is slightly surprising, since Intel introduced a Mesh interconnect with its recent HEDT and enterprise processors. Intel has, however, ensured that the iGPU has preferential access to the Ring Bus, with 64 Byte/clock reads and 64 Byte/clock writes, while each CPU core only has 32 Byte/clock reads and 32 Byte/clock writes.
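
To put the per-clock figures in perspective, here is a minimal sketch that converts bytes per clock into GB/s. Only the 64 and 32 Byte/clock numbers come from the article; the ring clock is an assumed placeholder, since no clock speed is given.

```python
# Bytes-per-clock figures quoted in the article (each direction).
IGPU_BYTES_PER_CLOCK = 64
CORE_BYTES_PER_CLOCK = 32

# ASSUMPTION: 3.0 GHz is a made-up ring clock used only for illustration.
ASSUMED_RING_CLOCK_GHZ = 3.0


def ring_bandwidth_gbs(bytes_per_clock: int, ring_clock_ghz: float) -> float:
    """Peak one-direction bandwidth in GB/s (bytes/clock x 10^9 clocks/s)."""
    return bytes_per_clock * ring_clock_ghz


igpu = ring_bandwidth_gbs(IGPU_BYTES_PER_CLOCK, ASSUMED_RING_CLOCK_GHZ)
core = ring_bandwidth_gbs(CORE_BYTES_PER_CLOCK, ASSUMED_RING_CLOCK_GHZ)
print(f"iGPU: {igpu:.0f} GB/s, CPU core: {core:.0f} GB/s, ratio {igpu / core:.0f}x")
```

Whatever the real ring clock turns out to be, the iGPU ring stop keeps a 2:1 advantage over a single CPU core, because both scale with the same clock.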



While a CPU core's ring-stop terminates at its dedicated L2 cache, the iGPU's terminates at a component called "GTI", short for graphics technology interface. The GTI interfaces with two components: Slice Common and an L3 cache that is completely separate from the processor's main L3 cache. The iGPU now has a dedicated 3 MB L3 cache, although the processor's main L3 cache outside the iGPU is still town-square for the entire processor. The iGPU's L3 cache cushions transfers between the GTI and the Subslices. Subslices are the indivisible number-crunching clusters of the GPU, much like streaming multiprocessors on an NVIDIA GPU; this is where the shaders are located. In addition to the subslices we find separate geometry processing hardware and front-ends, including fixed-function hardware to accelerate media, which all feed into the eight subslices. The back-end is handled by "Slice Common", which includes the ROPs, which write to the iGPU's own L3 cache.

Each Subslice begins with an instruction cache and thread dispatch that divides the number-crunching workload between eight execution units or EUs. Gen11 GT2 has 64 EUs, roughly 167% more than the 24 EUs that we saw with Gen9.5 GT2 (for example on the Core i9-9900K). Such a significant increase in EUs will probably double performance, helping Intel make up lost ground against AMD's Ryzen APUs. Each EU packs two ALUs with four execution pipelines each, register files, and a thread control unit. Certain other components are shared between the EUs, such as media samplers. Intel is updating the media engine of its integrated graphics to support hardware acceleration of more video formats, including 10-bpc VP9. The display controller now supports Panel Self Refresh, Display Context Save and Restore, VESA Adaptive-Sync, and USB-C based outputs.
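
The EU arithmetic above can be checked with a short sketch. All three input numbers are taken from the article; the helper name `percent_growth` is just an illustrative choice.

```python
SUBSLICES = 8          # Gen11 GT2 subslices, per the article
EUS_PER_SUBSLICE = 8   # EUs per subslice, per the article
GEN95_GT2_EUS = 24     # Gen9.5 GT2 EU count, per the article


def percent_growth(old: int, new: int) -> float:
    """Growth from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100.0


gen11_eus = SUBSLICES * EUS_PER_SUBSLICE
growth = percent_growth(GEN95_GT2_EUS, gen11_eus)
print(f"Gen11 GT2: {gen11_eus} EUs, {growth:.1f}% more than Gen9.5 GT2")
```

Note that this is a count of execution resources only; it says nothing about clocks or memory bandwidth, which the article does not specify.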

 
Joined
Dec 14, 2013
Messages
2,603 (0.69/day)
Location
Alabama
Processor Ryzen 2700X
Motherboard X470 Tachi Ultimate
Cooling Scythe Big Shuriken 3
Memory C.R.S.
Video Card(s) Radeon VII
Software Win 7
Benchmark Scores Never high enough
I'm waiting to see the performance percentage increase and the pricetag with it.
That's all I can say ATM because I'm not expecting anything much different than what's been before.

If the percentage increase is good that would be great, esp if the pricetag for it doesn't amount to wallet-rape.
 
Joined
Feb 3, 2017
Messages
3,475 (1.33/day)
Processor R5 5600X
Motherboard ASUS ROG STRIX B550-I GAMING
Cooling Alpenföhn Black Ridge
Memory 2*16GB DDR4-2666 VLP @3800
Video Card(s) EVGA Geforce RTX 3080 XC3
Storage 1TB Samsung 970 Pro, 2TB Intel 660p
Display(s) ASUS PG279Q, Eizo EV2736W
Case Dan Cases A4-SFX
Power Supply Corsair SF600
Mouse Corsair Ironclaw Wireless RGB
Keyboard Corsair K60
VR HMD HTC Vive
Joined
Dec 14, 2013
Messages
2,603 (0.69/day)
Location
Alabama
Processor Ryzen 2700X
Motherboard X470 Tachi Ultimate
Cooling Scythe Big Shuriken 3
Memory C.R.S.
Video Card(s) Radeon VII
Software Win 7
Benchmark Scores Never high enough
That would be excellent to see, a real benefit derived from what competition does for the industry (and the end user). Competition from AMD has had a good effect for everyone; it pushes development, and maybe we'll see some of the benefits with this release.

I'm still wondering about what the pricetag would be, hopefully good but that's something we have no real control over except by voting with our wallets come release time.
 
Joined
Feb 3, 2017
Messages
3,475 (1.33/day)
Processor R5 5600X
Motherboard ASUS ROG STRIX B550-I GAMING
Cooling Alpenföhn Black Ridge
Memory 2*16GB DDR4-2666 VLP @3800
Video Card(s) EVGA Geforce RTX 3080 XC3
Storage 1TB Samsung 970 Pro, 2TB Intel 660p
Display(s) ASUS PG279Q, Eizo EV2736W
Case Dan Cases A4-SFX
Power Supply Corsair SF600
Mouse Corsair Ironclaw Wireless RGB
Keyboard Corsair K60
VR HMD HTC Vive
24 EU Gen9.5 (UHD 630 and the ilk) is 192 shaders.
64 EU Gen11 is 512 shaders.
Architectural changes aside, this is over 2.5 times the compute power.
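
For anyone wanting to check the shader math, a quick sketch. It assumes the per-EU layout from the article (two ALUs with four execution pipelines each, so 8 lanes per EU) applies to both generations.

```python
ALUS_PER_EU = 2     # per the article
LANES_PER_ALU = 4   # "four execution pipelines each", per the article


def shader_count(eus: int) -> int:
    """Shader lanes for a given EU count, assuming 8 lanes per EU."""
    return eus * ALUS_PER_EU * LANES_PER_ALU


gen95 = shader_count(24)
gen11 = shader_count(64)
print(f"Gen9.5 GT2: {gen95}, Gen11 GT2: {gen11}, ratio {gen11 / gen95:.2f}x")
```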
 
Joined
Nov 18, 2010
Messages
7,106 (1.46/day)
Location
Rīga, Latvia
System Name HELLSTAR
Processor AMD RYZEN 9 5950X
Motherboard ASUS Strix X570-E
Cooling 2x 360 + 280 rads. 3x Gentle Typhoons, 3x Phanteks T30, 2x TT T140 . EK-Quantum Momentum Monoblock.
Memory 4x8GB G.SKILL Trident Z RGB F4-4133C19D-16GTZR 14-16-12-30-44
Video Card(s) Sapphire Pulse RX 7900XTX + under waterblock.
Storage Optane 900P[W11] + WD BLACK SN850X 4TB + 750 EVO 500GB + 1TB 980PRO[FEDORA]
Display(s) Philips PHL BDM3270 + Acer XV242Y
Case Lian Li O11 Dynamic EVO
Audio Device(s) Sound Blaster ZxR
Power Supply Fractal Design Newton R3 1000W
Mouse Razer Basilisk
Keyboard Razer BlackWidow V3 - Yellow Switch
Software FEDORA 39 / Windows 11 insider
24 EU Gen9.5 (UHD 630 and the ilk) is 192 shaders.
64 EU Gen11 is 512 shaders.
Architectural changes aside, this is over 2.5 times the compute power.

They could lower the clock to tame the heat. So actually the power envelope is the deciding factor. The perf increase thus can be lower than 2x.
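
A rough sketch of that trade-off. It assumes peak throughput scales linearly with EU count times clock, which is a simplification, and it uses no real clock figures because none are known yet.

```python
EU_RATIO = 64 / 24  # Gen11 GT2 EUs over Gen9.5 GT2 EUs


def relative_throughput(clock_ratio: float) -> float:
    """Gen11/Gen9.5 peak throughput for a given Gen11/Gen9.5 clock ratio.

    ASSUMPTION: throughput is proportional to EU count x clock.
    """
    return EU_RATIO * clock_ratio


def break_even_clock_ratio(target_gain: float) -> float:
    """Clock ratio at which the EU increase yields exactly target_gain."""
    return target_gain / EU_RATIO


print(f"Clock ratio needed for 2x: {break_even_clock_ratio(2.0):.2f}")
print(f"Gain at 60% of the old clock: {relative_throughput(0.6):.2f}x")
```

Under this simple model, the gain drops below 2x once Gen11 runs at less than three quarters of the Gen9.5 clock.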
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
26,956 (3.71/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
They could lower the clock to tame the heat. So actually the power envelope is the deciding factor. The perf increase thus can be lower than 2x.
This, and also workloads don't scale with cores linearly, which is part of GCN's problem.

I think ~2x is a reasonable estimate though
 
Joined
Dec 14, 2013
Messages
2,603 (0.69/day)
Location
Alabama
Processor Ryzen 2700X
Motherboard X470 Tachi Ultimate
Cooling Scythe Big Shuriken 3
Memory C.R.S.
Video Card(s) Radeon VII
Software Win 7
Benchmark Scores Never high enough
Have to agree - On paper it well could be 2.5x the computing power but will it actually deliver?
And final specs (Clockspeeds) I'd have to think aren't exactly set in stone just yet, esp if during development they run into problems like before and are forced to tweak things so it works without issues in the end.

Much work to be done yet with it.
 
Joined
Mar 31, 2014
Messages
1,533 (0.42/day)
Location
Grunn
System Name Indis the Fair (cursed edition)
Processor 11900k 5.1/4.9 undervolted.
Motherboard MSI Z590 Unify-X
Cooling Heatkiller VI Pro, VPP755 V.3, XSPC TX360 slim radiator, 3xA12x25, 4x Arctic P14 case fans
Memory G.Skill Ripjaws V 2x16GB 4000 16-19-19 (b-die@3600 14-14-14 1.45v)
Video Card(s) EVGA 2080 Super Hybrid (T30-120 fan)
Storage 970EVO 1TB, 660p 1TB, WD Blue 3D 1TB, Sandisk Ultra 3D 2TB
Display(s) BenQ XL2546K, Dell P2417H
Case FD Define 7
Audio Device(s) DT770 Pro, Topping A50, Focusrite Scarlett 2i2, Røde VXLR+, Modmic 5
Power Supply Seasonic 860w Platinum
Mouse Razer Viper Mini, Odin Infinity mousepad
Keyboard GMMK Fullsize v2 (Boba U4Ts)
Software Win10 x64/Win7 x64/Ubuntu
Unless they are really far behind the curve from the efficiency standpoint, memory bandwidth will probably be more restrictive than anything else. You'd be surprised how many laptops only use a single memory channel.

I wonder if they will revisit eDRAM or maybe have a go at HBM any time soon. Alternatively 3-channel memory could also be an option.
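
To illustrate how much the channel count matters, a minimal sketch of theoretical peak DRAM bandwidth. DDR4-2400 and 64-bit channels are assumptions chosen for the example, not figures from this thread.

```python
def peak_bandwidth_gbs(mega_transfers_per_s: int, channels: int,
                       bus_width_bits: int = 64) -> float:
    """Theoretical peak bandwidth in GB/s: transfers/s x bytes/transfer x channels."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * bytes_per_transfer * channels / 1000.0


# ASSUMPTION: DDR4-2400 is used purely as an example speed grade.
ASSUMED_SPEED_MTS = 2400

for channels in (1, 2, 3):
    bw = peak_bandwidth_gbs(ASSUMED_SPEED_MTS, channels)
    print(f"{channels} channel(s): {bw:.1f} GB/s")
```

An iGPU shares this bandwidth with the CPU cores, so a single-channel laptop leaves it with half the headroom of a dual-channel one before any other bottleneck comes into play.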
 
Joined
Apr 12, 2013
Messages
1,187 (0.30/day)
Processor 11700
Motherboard TUF z590
Memory G.Skill 32gb 3600mhz
Video Card(s) ROG Vega 56
Case Deepcool
Power Supply RM 850
The new AVX-512 instruction set is interesting; it could give a great performance boost to video editing.
 
Joined
Apr 12, 2013
Messages
6,728 (1.68/day)
Have to agree - On paper it well could be 2.5x the computing power but will it actually deliver?
And final specs (Clockspeeds) I'd have to think aren't exactly set in stone just yet, esp if during development they run into problems like before and are forced to tweak things so it works without issues in the end.

Much work to be done yet with it.
Yes but extra compute power doesn't always translate into gaming/graphics as we've seen with AMD. There are other bottlenecks including bandwidth & the underlying uarch, having said that L1 & L2 changes made a huge difference for Nvidia - maybe that'll be enough for Intel to compete?
 
Joined
Feb 3, 2017
Messages
3,475 (1.33/day)
Processor R5 5600X
Motherboard ASUS ROG STRIX B550-I GAMING
Cooling Alpenföhn Black Ridge
Memory 2*16GB DDR4-2666 VLP @3800
Video Card(s) EVGA Geforce RTX 3080 XC3
Storage 1TB Samsung 970 Pro, 2TB Intel 660p
Display(s) ASUS PG279Q, Eizo EV2736W
Case Dan Cases A4-SFX
Power Supply Corsair SF600
Mouse Corsair Ironclaw Wireless RGB
Keyboard Corsair K60
VR HMD HTC Vive
having said that L1 & L2 changes made a huge difference for Nvidia - maybe that'll be enough for Intel to compete?
Only L3 cache is bigger, rest are exactly the same. In fact, there do not appear to be that many changes on the GPU side of things. Largely the same EUs, same caches (except the larger L3). The only major change to GPU itself is the added EUs and twice the ROPs?
 
Joined
Sep 17, 2014
Messages
20,776 (5.97/day)
Location
The Washing Machine
Processor i7 8700k 4.6Ghz @ 1.24V
Motherboard AsRock Fatal1ty K6 Z370
Cooling beQuiet! Dark Rock Pro 3
Memory 16GB Corsair Vengeance LPX 3200/C16
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Samsung 850 EVO 1TB + Samsung 830 256GB + Crucial BX100 250GB + Toshiba 1TB HDD
Display(s) Gigabyte G34QWC (3440x1440)
Case Fractal Design Define R5
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse XTRFY M42
Keyboard Lenovo Thinkpad Trackpoint II
Software W10 x64
Article speaks of IPC improvements, but I have a hard time identifying those for CPU tasks. Sure, the GPU will be more zippy, but we ain't got time for that low-end junk.

Bottom line I think is a minor clock bump for CPU along with substantial IGP improvements. Broadwell v2... Inb4 another 5775C that will be a rare unicorn in the wild. Given Intel's 10nm woes, that seems plausible...
 
Joined
Feb 3, 2017
Messages
3,475 (1.33/day)
Processor R5 5600X
Motherboard ASUS ROG STRIX B550-I GAMING
Cooling Alpenföhn Black Ridge
Memory 2*16GB DDR4-2666 VLP @3800
Video Card(s) EVGA Geforce RTX 3080 XC3
Storage 1TB Samsung 970 Pro, 2TB Intel 660p
Display(s) ASUS PG279Q, Eizo EV2736W
Case Dan Cases A4-SFX
Power Supply Corsair SF600
Mouse Corsair Ironclaw Wireless RGB
Keyboard Corsair K60
VR HMD HTC Vive
Sure, the GPU will be more zippy, but we ain't got time for that low-end junk.
The GPU configuration they present as Gen11 GT2 is roughly equal to the current Vega8. Intel GT2 is in a lot of laptops.
 
Joined
Sep 17, 2014
Messages
20,776 (5.97/day)
Location
The Washing Machine
Processor i7 8700k 4.6Ghz @ 1.24V
Motherboard AsRock Fatal1ty K6 Z370
Cooling beQuiet! Dark Rock Pro 3
Memory 16GB Corsair Vengeance LPX 3200/C16
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Samsung 850 EVO 1TB + Samsung 830 256GB + Crucial BX100 250GB + Toshiba 1TB HDD
Display(s) Gigabyte G34QWC (3440x1440)
Case Fractal Design Define R5
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse XTRFY M42
Keyboard Lenovo Thinkpad Trackpoint II
Software W10 x64
The GPU configuration they present as Gen11 GT2 is roughly equal to the current Vega8. Intel GT2 is in a lot of laptops.

Oh don't get me wrong, sure there's a market, it's just not me, or any enthusiast I reckon. The IGP will never become fast enough to truly rival dedicated cards, those goal posts move every release anyway and there is a fundamental TDP problem with APUs.
 
Joined
Apr 12, 2013
Messages
6,728 (1.68/day)
In fact, there do not appear to be that many changes on the GPU side of things
What about GTI bandwidth (64B writes vs 32B previously) & probably LLC cache as well? The pixels/clock & HiZ Zixels/clock have also doubled.
From ~ https://software.intel.com/sites/de...e-of-Intel-Processor-Graphics-Gen11_R1new.pdf
In Gen11, the Z buffer min/max is back annotated into HiZ buffer reducing future nondeterministic or ambiguous tests. When HiZ buffer does not have visibility data till post shader, the resulting tests are nondeterministic in HiZ resulting in Z to per pixel testing. Back annotation allows updating the HiZ buffer with results from Z buffer as shown in figure 6. HiZ test range is narrowed, resulting in coarse testing instead of pixel level for normal rendering or per sample level when MSAA is enabled. Thus, the overall depth test throughput is increased while the corresponding Z memory BW is simultaneously decreased.
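
As a rough illustration of why back annotation helps, here is a generic hierarchical-Z sketch. It is not Intel's hardware logic: the `HiZTile` class, the function names, the depth values and the assumed LESS depth function (smaller depth is nearer) are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HiZTile:
    """Coarse min/max depth stored for one tile of the Z buffer."""
    z_min: float
    z_max: float


def coarse_test(tile: HiZTile, src_min: float, src_max: float) -> str:
    """Classify a block of incoming depths against a tile (LESS depth test)."""
    if src_max < tile.z_min:
        return "pass"       # nearer than everything stored in the tile
    if src_min >= tile.z_max:
        return "fail"       # at or behind everything stored in the tile
    return "ambiguous"      # must fall back to per-pixel Z testing


def back_annotate(tile: HiZTile, z_buffer_values: List[float]) -> None:
    """Narrow the tile's range using the actual per-pixel Z buffer contents."""
    tile.z_min = min(z_buffer_values)
    tile.z_max = max(z_buffer_values)


# Hypothetical tile whose stored range is still very conservative.
tile = HiZTile(z_min=0.0, z_max=1.0)
before = coarse_test(tile, 0.6, 0.7)

# The Z buffer for this tile turns out to hold only near depths.
back_annotate(tile, [0.20, 0.25, 0.30, 0.22])
after = coarse_test(tile, 0.6, 0.7)
print(before, "->", after)
```

With the narrowed range, the same incoming block is rejected at the coarse level instead of being sent to per-pixel testing, which is the throughput and bandwidth saving the excerpt describes.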

4.4.3 Pixel Dispatch
The Pixel Dispatch block accumulates subspans/pixel information and dispatches threads to the execution units. The pixel dispatcher decides the SIMD width of the thread to be executed, choosing between SIMD8, SIMD16 and SIMD32. Pixel Dispatch chooses this to maximize execution efficiency and utilization of the register file. The block load balances across the shader units and ensures order in which pixels retire from the shader units. In Gen11, pixel dispatch includes the function of “coarse pixel shader” which is described in detail in Section 5.1. When CPS is enabled, the coarse pixels generated are packed which reduces the number of pixel shading invocations. The reference or the mapping of a coarse pixel to pixel is maintained until the pixel shader is executed.

4.4.4 Pixel Backend/Blend
The Pixel Backend (PBE) is the last stage of the rendering pipeline which includes the cache to hold the color values. This pipeline stage also handles the color blend functions across several source and destination surface formats. Lossless color compression is handled here as well.
Gen11 exploits use of lower precision in render target formats to reduce power for blending operations.

4.4.5 Level-3 Data Cache
In Gen11, the L3 data cache capacity has been increased to 3MB. Each application context has flexibility as to how much of the L3 memory structure is allocated in:
• Application L3 data cache
• System buffers for fixed-function pipelines
For example, 3D rendering contexts often allocate more L3 as system buffers to support their fixed-function pipelines. All sampler caches and instruction caches are backed by L3 cache. The interface between each Dataport and the L3 data cache enables both read and write of 64 bytes per cycle. Z, HiZ, Stencil and color buffers may also be backed in L3 specifically when tiling is enabled. In typical 3D/Compute workloads, partial access is common and occurs in batches and makes ineffective use of memory bandwidth. In Gen11, when accessing memory, L3 cache opportunistically combines partial access of a pair of 32B to a single 64B thereby improving efficiency.
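
The pair-combining idea can be sketched in software as follows. This is only a batch model under stated assumptions: real hardware combines "opportunistically" depending on timing, and the function name and request format here are hypothetical.

```python
from typing import Dict, List, Set, Tuple

HALF = 32   # size of a partial access in bytes
LINE = 64   # size of a combined access in bytes


def combine_partial_accesses(addresses: List[int]) -> List[Tuple[int, int]]:
    """Merge 32B accesses that hit both halves of the same 64B line.

    Returns a list of (address, size) memory requests. Repeated accesses
    to the same 32B half collapse into one request in this model.
    """
    halves_by_line: Dict[int, Set[int]] = {}
    for addr in addresses:
        if addr % HALF != 0:
            raise ValueError(f"address {addr} is not 32B aligned")
        base = addr - (addr % LINE)
        halves_by_line.setdefault(base, set()).add(addr - base)

    requests: List[Tuple[int, int]] = []
    for base in sorted(halves_by_line):
        halves = halves_by_line[base]
        if halves == {0, HALF}:
            requests.append((base, LINE))          # both halves present: one 64B access
        else:
            for offset in sorted(halves):
                requests.append((base + offset, HALF))  # lone half stays 32B
    return requests


requests = combine_partial_accesses([0, 32, 128, 224])
print(requests)
```

In this example four partial accesses become three memory requests, because only addresses 0 and 32 share a 64-byte line.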

4.5 MEMORY

4.5.1 Memory Efficiency Improvements
Intel® processor graphics architecture continuously invests in technologies which improve graphic memory efficiency besides improving raw unified memory bandwidth.
Gen9 architecture introduced lossless compression of both render targets and dynamic textures. Games tend to have a lot of render to texture cases where the intermediate rendered buffer is used as a texture in subsequent drawcalls within a frame. As games target higher quality visuals, the bandwidth used by dynamic textures as well as higher resolution becomes increasingly important. Lossless compression aims to mitigate this by taking advantage of the fact that adjacent pixel blocks within a render target vary slowly or are similar which exposes opportunity for compression. Compression yields write bandwidth savings when the data is evicted from L3 cache to memory as well as for read bandwidth savings in case of dynamic textures or alpha blending of surfaces. These improvements result in additional power savings.
Gen11 enables two new optimizations to lossless color compression:
• Support for sRGB surface formats for dynamic textures. Use of gamma corrected color space is important especially as the usage of high dynamic range is increasing.
• The compression algorithm exploits the property that a group of pixels can have the same color when shaded using coarse pixel shading as discussed in section 5.1.
Additionally, memory efficiency is further improved by tile based rendering technology (PTBR) discussed in section 5.2. Fundamentally, it makes the render target and depth buffer stay on chip memory during the render pass while overdraws are collapsed. There are opportunities to discard temporary surfaces by not writing back to memory. PTBR additionally improves sampler access locality and makes on chip cache hierarchy more efficient.

4.5.2 Unified Memory Architecture
Intel® processor graphics architecture has long pioneered sharing DRAM physical memory with the CPU. This unified memory architecture offers a number of system design, power efficiency, and programmability advantages over PCI Express-hosted discrete memory systems.
The obvious advantage is that shared physical memory enables zero copy buffer transfers between CPUs and Gen11 compute architecture. By zero copy, we mean that no buffer copy is necessary since the physical memory is shared. Moreover, the architecture further augments the performance of such memory sharing with a shared LLC cache. The net effect of this architecture benefits performance, conserves memory footprint, and indirectly conserves system power not spent needlessly copying data. Shared physical memory and zero copy buffer transfers are programmable through the buffer allocation mechanisms in APIs such as Vulkan™*, OpenCL2™* and DirectX12™*.
Gen11 supports LPDDR4 memory technology capable of delivering much higher bandwidth than previous generations. The entire memory sub-system is optimized for low latency and high bandwidth. Gen11 memory sub-system features several optimizations including fabric routing policies, and enhanced memory controller scheduling algorithms which increases overall memory bandwidth efficiency. The memory sub-system also includes QOS features that help balance bandwidth demands from multiple high-bandwidth agents.
 
Joined
Feb 3, 2017
Messages
3,475 (1.33/day)
Processor R5 5600X
Motherboard ASUS ROG STRIX B550-I GAMING
Cooling Alpenföhn Black Ridge
Memory 2*16GB DDR4-2666 VLP @3800
Video Card(s) EVGA Geforce RTX 3080 XC3
Storage 1TB Samsung 970 Pro, 2TB Intel 660p
Display(s) ASUS PG279Q, Eizo EV2736W
Case Dan Cases A4-SFX
Power Supply Corsair SF600
Mouse Corsair Ironclaw Wireless RGB
Keyboard Corsair K60
VR HMD HTC Vive
As I mentioned besides EU count and twice the ROPs. Twice the ROPs sounds interesting though considering 2.5 times the shaders. Intel seems to be playing a little with resource balance.
Really not sure about the GTI bandwidth, how important is write for a GPU?
 
Joined
Apr 12, 2013
Messages
6,728 (1.68/day)
In terms of the basic uarch there doesn't seem to be any major change, but with memory there is - compression, UMA, bigger L3 & possibly LLC as well. I wouldn't be surprised if the IGP performance increased 2x across the board given there are no TDP constraints.
 
Joined
Feb 3, 2017
Messages
3,475 (1.33/day)
Processor R5 5600X
Motherboard ASUS ROG STRIX B550-I GAMING
Cooling Alpenföhn Black Ridge
Memory 2*16GB DDR4-2666 VLP @3800
Video Card(s) EVGA Geforce RTX 3080 XC3
Storage 1TB Samsung 970 Pro, 2TB Intel 660p
Display(s) ASUS PG279Q, Eizo EV2736W
Case Dan Cases A4-SFX
Power Supply Corsair SF600
Mouse Corsair Ironclaw Wireless RGB
Keyboard Corsair K60
VR HMD HTC Vive
Twice the ROPs and improved cache-memory system is clearly to back up the increased amount of execution units.
2x in terms of efficiency/IPC or 2x from Gen9.5 GT2 to Gen11 GT2? I do not believe it will do twice the efficiency. On the other hand, Gen11 GT2 having twice the performance of Gen9.5 GT2 would be slightly disappointing.
 
Joined
Apr 12, 2013
Messages
6,728 (1.68/day)
I didn't say 2x the efficiency, that's hard to measure anyway given it's an IGP. However if they throw 2x (or more) in terms of resources, then coupled with the memory changes the actual performance may well be 2x or thereabouts. It'll still lag AMD & Nvidia, the latter by a huge margin, but could be an indication of how Intel designs their future APUs or even dGPU wrt the established duopoly.
 
Joined
Jun 10, 2014
Messages
2,889 (0.81/day)
Processor AMD Ryzen 9 5900X ||| Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR ||| Asus P9X79 WS
Cooling Noctua NH-U14S ||| Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz ||| Corsair 8 x 8 GB 1333 MHz
Video Card(s) MSI GTX 1060 3GB ||| MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB ||| Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" ||| Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
Article speaks of IPC improvements, but I have a hard time identifying those for CPU tasks. Sure, the GPU will be more zippy, but we ain't got time for that low-end junk.
I think, like most people here, I don't care at all about the integrated graphics; all of us will be running dedicated graphics anyway.

On the other hand, Ice Lake/Sunny Cove is very interesting. It will be the first architectural improvement in 4 years, and Intel have promised improvements for both "single thread" and "ISA", whatever that implies.

We don't have any solid information on the performance characteristics of Sunny Cove yet, and while I'm not expecting huge improvements, I'm pretty sure it will be a distinct improvement in IPC. We don't know any of the specs of the front-end of the CPU yet, but we do know Sunny Cove features significant changes in L1/L2 cache configurations and bandwidth. On the execution side it features over double integer mul/div performance (but no changes to ALUs), along with large improvements in load/store bandwidth and memory address calculations. Sunny Cove is clearly engineered for higher throughput, but what that means in terms of IPC gains is hard to tell, especially since we don't know the details of the important front-end which feeds this "beast".
 
Joined
Jan 8, 2017
Messages
8,860 (3.36/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
The real question is, how will this end up in their chips because this new design will take up a lot more space. This is screaming for 10nm.
 