
Intel "Ice Lake" GPU Docs Reveal Unganged Memory Mode

btarunr

Editor & Senior Moderator
While reading through Intel's Gen11 GT2 whitepaper, which describes the company's upcoming integrated graphics architecture, we may have found a groundbreaking piece of information concerning the memory architecture of computers running 10 nm "Ice Lake" processors. The whitepaper mentions that the chip features a 4x 32-bit LPDDR4/DDR4 interface, as opposed to the 2x 64-bit LPDDR4/DDR4 interface of current-generation chips such as "Coffee Lake." This is strong evidence that Intel's new architecture will have unganged dual-channel memory controllers (2x 64-bit), as opposed to the monolithic 128-bit IMC found on current-generation chips.

An unganged dual-channel memory interface consists of two independent memory controllers, each handling a 64-bit wide memory channel. This approach lets the processor execute two memory operations in tandem, provided the accesses target different channels. On top of that, it becomes possible to read and write at the same time, something that can't be done in 128-bit ganged mode. From a processor's perspective, DRAM is very slow, and most of the access time (i.e., latency) is spent opening the memory row and setting up the read/write operation; the actual data transfer is comparatively quick.
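To make the mapping concrete, here is a minimal C sketch of how unganged mode could distribute consecutive cache lines across the two controllers. The 64-byte interleave granularity is our assumption for illustration; the whitepaper does not specify the policy.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustration only: assume unganged mode interleaves 64-byte cache
 * lines across two independent 64-bit controllers; even-numbered
 * lines map to channel 0, odd-numbered lines to channel 1. */
#define CACHE_LINE 64u

static unsigned channel_of(uint64_t paddr)
{
    return (unsigned)((paddr / CACHE_LINE) & 1);
}

int main(void)
{
    uint64_t a = 0x1000;          /* one cache line */
    uint64_t b = a + CACHE_LINE;  /* the next cache line */
    printf("a -> channel %u, b -> channel %u\n", channel_of(a), channel_of(b));
    /* a and b land on different channels, so unganged mode can service
     * both accesses in parallel; in ganged (128-bit) mode every access
     * occupies the entire interface. */
    return 0;
}
```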



With two independent memory controllers, these latencies can be mitigated in several ways. While single-threaded workloads, or workloads that operate on a relatively small problem set, benefit more from ganged mode, unganged mode can shine when multiple (or multi-threaded) applications work with large amounts of memory, which increases the likelihood that two independent channels get accessed simultaneously. Unganged-aware software, such as OS-level memory management, could help make the most of unganged mode by spreading processes evenly throughout physical memory, so that independent memory accesses can be executed as often as possible; a sketch of this idea follows.
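As a toy sketch of that idea, assume a hypothetical platform that assigns channels at page granularity (real controllers usually interleave far more finely); an unganged-aware allocator could then steer two processes onto different channels:

```c
#include <stdint.h>
#include <stdio.h>

/* Toy model: a hypothetical platform that assigns channels per 4 KiB
 * page (even pages -> channel 0, odd pages -> channel 1). A channel-
 * aware allocator hands each process pages from "its" channel so the
 * two processes' memory traffic rarely collides. */
#define PAGE_SIZE 4096u

static unsigned page_channel(uint64_t paddr)
{
    return (unsigned)((paddr / PAGE_SIZE) & 1);
}

/* Return the next free page that lives on the requested channel
 * (a trivial bump allocator stands in for the real free list). */
static uint64_t alloc_on_channel(unsigned ch, uint64_t *cursor)
{
    while (page_channel(*cursor) != ch)
        *cursor += PAGE_SIZE;
    uint64_t page = *cursor;
    *cursor += PAGE_SIZE;
    return page;
}

int main(void)
{
    uint64_t cursor = 0;
    uint64_t a = alloc_on_channel(0, &cursor);  /* process A */
    uint64_t b = alloc_on_channel(1, &cursor);  /* process B */
    printf("A's page at %#llx (channel %u)\n",
           (unsigned long long)a, page_channel(a));
    printf("B's page at %#llx (channel %u)\n",
           (unsigned long long)b, page_channel(b));
    return 0;
}
```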

For integrated graphics, though, unganged mode is a real killer application. The iGPU reserves a chunk of system memory for geometry, textures, and the framebuffer. This memory range is typically placed at the end of the physical memory space, whereas the Windows OS and applications are usually located near the start of physical memory. This effectively gives the GPU its own dedicated memory controller, which also reduces memory latency: one controller can hold the iGPU's memory pages open almost all the time, while the second controller takes care of the OS and application memory requests.
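A rough sketch of that argument, with invented numbers: if channel assignment were range-based rather than interleaved (our assumption, purely for illustration), a top-of-memory carve-out would be serviced by one controller alone:

```c
#include <stdint.h>
#include <stdio.h>

/* Invented numbers, range-based mapping assumed purely for
 * illustration: lower half of RAM -> controller 0, upper half ->
 * controller 1. A top-of-memory iGPU carve-out then lands entirely
 * on controller 1, leaving controller 0 free for the OS. */
#define TOTAL_RAM  (8ull << 30)    /* 8 GiB of system memory */
#define IGPU_CARVE (512ull << 20)  /* top 512 MiB reserved for iGPU */

static unsigned controller_of(uint64_t paddr)
{
    return paddr < TOTAL_RAM / 2 ? 0 : 1;
}

int main(void)
{
    uint64_t igpu_base = TOTAL_RAM - IGPU_CARVE;
    printf("iGPU region start -> controller %u\n", controller_of(igpu_base));
    printf("iGPU region end   -> controller %u\n", controller_of(TOTAL_RAM - 1));
    printf("OS low memory     -> controller %u\n", controller_of(0x100000));
    return 0;
}
```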

AMD has supported unganged dual-channel memory interfaces for over a decade now. The company's first Phenom processors introduced unganged memory, along with a BIOS option, called ganged mode, to force the CPU to interleave all data across both channels. Both the consensus in the tech community over the past ten years and the evolution of the modern processor toward greater parallelism favor unganged mode. With CPU core counts heading north of 8 for mainstream-desktop processors, and integrated GPUs becoming the norm, it was natural for Intel to add support for an unganged memory interface.

Image Courtesy: ilsistemista.net

 
So AMD's approach was better, and now Intel is doing the same..

Seems they're trying to improve performance this way too; if I read between the lines... they're aware of having performance issues vs. AMD... "Houston, we have a problem"
 
Intel got complacent. Now they are paying for it. MASSIVELY.
 
If the IGP is really worth a damn, they've found the same issue AMD faces: how to feed the highly efficient parallel shader cores fast enough to keep them working, while not starving your CPU cores.
 
It may explain how Vega is doing so well in the APUs.
 
Come on, we all know Intel is straight up copy/pasting technology to quickly get into the higher end of GPUs. This cannot be a surprise. Great minds think alike; or look in each other's garden.
 
They might even try trickier stuff in the future. I don't trust this Raja dude and the Keller whats-his-face. They look like they might copy other people's designs, like Vega or Ryzen or something. Don't trust those two, they look snakey.
 
I would hope they would copy someone else's GPU.
 
They're really not though:

[attached image]


AMD for reference:

[attached image]

Considering AMD has had ganged and unganged modes, along with ECC, definitely on AM3 and I believe even since AM2, yes, Intel has been very complacent.
 
Cache memory solves that problem: level one at the CPU core, level two at the cluster level...
Only cache misses have to be read from or written to memory.
I don't see this as being a huge performance factor.
 
Caches are unfortunately not very useful for GPU architectures; they need a lot of instructions/data delivered all at once, as opposed to a few instructions/data delivered very quickly, as is the case with a CPU (that's a very primitive description, but it's good enough).

They need a lot of bandwidth, which is rather scarce on the current DDR4 platform; AMD faces the same problem.
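For a sense of scale, a quick back-of-the-envelope peak-bandwidth calculation (the memory speed is a typical example, not a figure from this thread):

```c
#include <stdio.h>

/* Back-of-the-envelope peak bandwidth. DDR4 moves 8 bytes per channel
 * per transfer; DDR4-3200 is chosen as a typical example. */
int main(void)
{
    double transfers_per_sec = 3200e6;  /* DDR4-3200 */
    double bytes_per_transfer = 8.0;    /* one 64-bit channel */
    int channels = 2;
    double gb_per_sec = transfers_per_sec * bytes_per_transfer * channels / 1e9;
    printf("Dual-channel DDR4-3200 peak: %.1f GB/s\n", gb_per_sec);  /* 51.2 */
    return 0;
}
```

Dual-channel DDR4-3200 tops out around 51 GB/s, while even mid-range discrete cards push several times that, which is why iGPUs on DDR4 end up bandwidth starved.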
 
Reverse engineering is common, yet when the Chinese do it people crack a sad and spit the dummy over lost jobs and revenues.
 
Reverse engineering is common, yet when the Chinese do it people crack a sad and spit the dummy over lost jobs and revenues.
Going to assume you are joking. One is innovation, the other is espionage.
Anyone can steal someone's entire IP and manufacture the design.
To figure out how it works, iterate on the design, and compete: that is innovation.

Yeah, bandwidth is the limiter: the 2200G can keep pace with the 2400G when clocked the same, despite the 2400G's ~40% advantage in stream processors (704 vs. 512). Definitely memory starved.
 