Wednesday, March 27th 2024

Intel Lunar Lake Chiplet Arrangement Sees Fewer Tiles—Compute and SoC

Intel Core Ultra "Lunar Lake-MX" will be the company's bulwark against Apple's M-series Pro and Max chips, designed to power the next crop of performance ultraportables. The MX codename extension denotes MoP (memory-on-package), in which stacked LPDDR5X memory chips share the package's fiberglass substrate with the chip. This conserves PCB footprint, and gives Intel tighter control over the memory speed, timings, and power-management features suited to its microarchitecture. It is essentially what Apple does with the M-series SoCs powering its MacBooks and iPad Pros. Igor's Lab scored the motherlode on the way Intel has restructured the various components across its chiplets, and on the various I/O wired to the package.

When compared to "Meteor Lake," the "Lunar Lake" microarchitecture sees a small amount of "re-aggregation" of the various logic-heavy components of the processor. On "Meteor Lake," the CPU cores and the iGPU sat on separate tiles—Compute tile and Graphics tile, respectively, with a large SoC tile sitting between them, and a smaller I/O tile that serves as an extension of the SoC tile. All four tiles sat on top of a Foveros base tile, which is essentially an interposer—a silicon die that facilitates high-density microscopic wiring between the various tiles that are placed on top of it. With "Lunar Lake," there are only two tiles—the Compute tile, and the SoC tile.
The Compute tile contains the CPU cores, and is built on the TSMC N3B foundry node. This is a 3 nm EUV node, said to be more advanced than the Intel 4 node the company used for the Compute tile of "Meteor Lake." The processor pictured above features a 4P+4E CPU core configuration: four "Lion Cove" P-cores and four "Skymont" E-cores. Considering that Intel used 2P+8E CPU core configurations in the 7 W and 15 W U-segments for "Alder Lake," "Lunar Lake-MX" is a step up in CPU performance, even if the CPU core count itself is down by two. It's worth noting here that the "Lion Cove" P-cores do not feature HTT (Hyper-Threading), and so the logical processor count is 8.
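The thread count follows directly from the core mix, since E-cores never support Hyper-Threading and these P-cores have it disabled. A minimal sketch of that arithmetic (the function name is ours, purely illustrative):

```python
# Logical processor count for a hybrid CPU: a core contributes 2 threads
# if it supports SMT/Hyper-Threading, otherwise 1. Intel's E-cores never
# have HT, and the "Lion Cove" P-cores here ship without it.
def logical_processors(p_cores: int, e_cores: int, p_has_ht: bool) -> int:
    return p_cores * (2 if p_has_ht else 1) + e_cores

# "Lunar Lake-MX": 4 "Lion Cove" P-cores (no HTT) + 4 "Skymont" E-cores
print(logical_processors(4, 4, p_has_ht=False))  # -> 8

# For contrast, a hypothetical 4P+4E part with HT enabled would expose 12
print(logical_processors(4, 4, p_has_ht=True))   # -> 12
```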

At this point, it is not clear where the iGPU resides. On "Meteor Lake," the iGPU had its own tile, the Graphics tile; the processor's memory controllers were located on the SoC tile along with the NPU, and both the iGPU and the CPU cores relied on the SoC tile for memory access. An iGPU is as much a memory latency-sensitive device as an NPU, and so one theory holds that the iGPU of "Lunar Lake" is located on the SoC tile along with the NPU, for the least possible latency to the memory controllers; the other, less likely theory is that it's located on the 3 nm Compute tile.

With Igor's confirmation that the Compute tile is built on TSMC N3B, and Intel Foundry slides mentioning the debut of the Intel 18A node with "Lunar Lake," it follows that the SoC tile is the one built on Intel 18A. Logically, Intel 18A should be more advanced than TSMC N3B, since it incorporates nanosheet (RibbonFET) transistors, EUV lithography, and higher transistor densities. It hence makes more sense for Intel to keep the iGPU and NPU on the SoC tile, along with the memory controllers.

As for the iGPU itself, Igor's Lab confirms Intel's use of its next-generation Xe2-LPG "Battlemage" graphics architecture, which promises a generational leap in performance over Xe-LPG "Alchemist." The iGPU is expected to support the full DirectX 12 Ultimate feature-set, as well as offer XeSS. The iGPU model on "Lunar Lake-MX" comes with 64 EUs (execution units) across 8 Xe2 cores, although other variants of "Lunar Lake" could come with larger iGPUs.
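The two iGPU figures are internally consistent: Xe2 groups its execution units into Xe cores, so the reported EU count divides evenly across the core count. A quick sanity check:

```python
# iGPU figures reported for "Lunar Lake-MX": 64 EUs across 8 Xe2 cores.
total_eus = 64
xe2_cores = 8

# The counts should divide evenly if EUs are grouped per Xe2 core.
assert total_eus % xe2_cores == 0
eus_per_core = total_eus // xe2_cores
print(f"{eus_per_core} EUs per Xe2 core")  # -> 8 EUs per Xe2 core
```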

Lastly, we move on to the bountiful I/O of "Lunar Lake-MX," starting with its LPDDR5X memory interface for on-package LPDDR5X-8533 memory. The package puts out both PCI-Express Gen 5 and Gen 4 lanes. The Gen 5 lanes are ideally meant for the chip's PEG interface (a discrete GPU) and a Gen 5 NVMe SSD, while the Gen 4 lanes are meant to be general purpose: for secondary NVMe SSDs, or other onboard discrete devices. The processor has an integrated Wi-Fi 7 + Bluetooth 6 WLAN controller, and MACs for wired GbE controllers.
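To put the on-package memory in perspective, peak bandwidth is simply transfer rate times bus width. A rough sketch, assuming the 128-bit bus width typical of client x86 chips (the article does not confirm the bus width for "Lunar Lake-MX"):

```python
# Peak theoretical bandwidth = transfer rate x bus width in bytes.
# The 128-bit bus width is an assumption (typical for client x86),
# not a figure confirmed in the report.
transfers_per_sec = 8533e6   # LPDDR5X-8533: 8533 MT/s
bus_width_bits = 128         # assumed

bandwidth_gbs = transfers_per_sec * (bus_width_bits / 8) / 1e9
print(f"~{bandwidth_gbs:.1f} GB/s peak")  # -> ~136.5 GB/s peak
```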

The main high-bandwidth connectivity from "Lunar Lake-MX" comes from the chip's integrated Thunderbolt 4 and USB4 interfaces, which can be configured by OEMs for power delivery. The chip also puts out regular USB 3.2 and USB 2.0 connectivity. Display connectivity includes DisplayPort 2.1, HDMI 2.1, and eDP 1.5. DisplayPort 2.1 can be multiplexed with the USB-C ports.
Source: Igor's Lab

9 Comments on Intel Lunar Lake Chiplet Arrangement Sees Fewer Tiles—Compute and SoC

#1
KellyNyanbinary
Looking at the size of the three tiles, the iGPU is probably in the largest tile - the SoC tile.
Posted on Reply
#2
Wirko
Why are the LPDDR stacks packaged in plastic, not bare like HBM?
Posted on Reply
#3
LabRat 891
KellyNyanbinary: Looking at the size of the three tiles, the iGPU is probably in the largest tile - the SoC tile.
Considering msft is adopting the name "NPU" for AI/ML hardware acceleration, I see a shift happening.
(GP)GPU functionality is and will continue to become more important (and take up more die space)
Wirko: Why are the LPDDR stacks packaged in plastic, not bare like HBM?
They're off-the-shelf LPDDR5X. Makes sense to me. No need for a hyper-dense interposer, no need for special-fab'd HBM.
Posted on Reply
#4
Wirko
LabRat 891: Considering msft is adopting the name "NPU" for AI/ML hardware acceleration, I see a shift happening.
(GP)GPU functionality is and will continue to become more important (and more die space)

They're off-the-shelf LPDDR5X. Makes sense to me. No need for a hyper-dense interposer, no need for special-fab'd HBM.
Logistics is simpler with standard parts, sure. Also, as this is "low power" memory, it might not need the best possible cooling. However, I don't see a dense interposer as a requirement - if the processor with ~1700 contact points can connect to the substrate, so could an LPDDR stack with a couple hundred contacts. It's not comparable to HBM, which needs 1024 contacts for the data bus alone.
Posted on Reply
#5
sLowEnd
I'd like to see GN do a cooler mounting pressure test on this
Posted on Reply
#6
Minus Infinity
How is a 17 W U-class 4+4 core APU a bulwark against the Pro and Max M3 chips? That literally makes no sense at all, as they are in no way, shape, or form targeting that segment of the market. At best it would be a bulwark against the regular M3 in a MacBook Air.
Posted on Reply
#7
Scrizz
Wirko: Logistics is simpler with standard parts, sure. Also, as this is "low power" memory, it might not need the best possible cooling. However, I don't see a dense interposer as a requirement - if the processor with ~1700 contact points can connect to the substrate, so could an LPDDR stack with a couple hundred contacts. It's not comparable to HBM, which needs 1024 contacts for the data bus alone.
LP DDR is way cheaper than HBM
Posted on Reply
#8
persondb
btarunr: At this point, it is not clear where the iGPU resides. On "Meteor Lake," the iGPU had its own tile called the Graphics tile, the processor's memory controllers were located on the SoC tile along with the NPU; and both the iGPU and CPU cores would rely on the SoC tile for memory access. An iGPU is as much a memory latency-sensitive device as an NPU, and so, one theory holds that the iGPU of "Lunar Lake" is located in the SoC tile along with the NPU, for the least possible latency to the memory controllers; the other less likely theory is that it's located on the 3 nm Compute tile.
CPUs are latency-constrained, not GPUs and NPUs actually. Those depend more on bandwidth, and they can hide latency better with their parallelism.

The bad thing about the memory-on-package is that, honestly, Intel isn't using it to offer wider memory buses, which I believe is the biggest limitation of modern x86: being forever stuck at 128 bits on the consumer side. There are also more opportunities for power saving when the memory is on the package.

Apple has done wider memory buses, but AMD and Intel refuse to follow them.

There is a drawback of not being upgradeable at all, though LPDDR was already not upgradeable; but even the mobile OEMs will be stuck with whatever choices Intel ends up offering, not being able to do mobile devices with different memory layouts.
Posted on Reply
#9
Wirko
Scrizz: LPDDR is way cheaper than HBM
Sure, I didn't mention HBM as a possibility. I only mentioned hypothetical bare LPDDR stacks, not packaged in black plastic/resin.

And even those must be expensive because they require TSV stacking. You see the same in VERY expensive server DIMMs (128 GB+), where high density couldn't be achieved otherwise. On the other hand, NAND dies in SSDs are connected by the plain old cheap wire-bonding process, which apparently is still OK for their 2400 MT/s DDR data bus.
Posted on Reply