
Intel "Nova Lake‑AX" APU Enhances iGPU Performance with Plenty of Xe3 Cores

AleksandarK
News Editor, Staff member
Intel is reportedly preparing "Nova Lake-AX," a high-end laptop SoC that combines a massive 52-core CPU complex with an expanded Xe3 graphics tile. While the standard Nova Lake‑S is set to arrive in 2026, followed by the H and HX mobile variants, the AX model will debut later as the flagship SKU. Built on Intel's second‑generation Foveros technology, Nova Lake‑AX stacks two compute tiles, each with eight "Coyote Cove" P‑cores and 16 "Arctic Wolf" E‑cores, plus a separate low‑power island with four LPE cores. A cache-boosted passive tile could add over 100 MB of Last Level Cache (bLLC), which feeds both the CPU cores and the Celestial Xe3 iGPU, potentially scaling up to 20-24 Xe3 cores. Intel already uses bLLC in its Clearwater Forest server processors, where a passive interposer integrates cache beneath the active tiles, so its inclusion here could deliver a significant performance uplift.

Unlike the regular Nova Lake‑S, H, and HX variants, which are expected to feature half as many Xe3 cores or fewer, the AX model promises a truly gaming‑worthy APU experience with a massive iGPU. With a combined TDP approaching 150 W in mobile workstations, active cooling will be mandatory in all form factors. Intel clearly aims to challenge AMD's long‑standing lead in the APU space. Historically, Intel CPUs have shipped with modest iGPUs, while AMD's APUs have offered solid integrated performance capable of driving many games at 1080p and even 1440p. Nova Lake‑AX pairs substantial CPU horsepower with a massive Xe3 graphics engine, marking Intel's most aggressive entry yet for enthusiast and gaming laptops. As Nova Lake news ramps up, we expect more details, such as clock speeds, exact Xe3 core count, and pricing, to emerge in the coming months.




 
The math is not mathing here.
Up to 52-core is the Nova Lake-HX/S with 2 compute tiles of each type plus 4 LPE ones - 8*2 + 16*2 + 4 = 52.
Nova Lake-AX is 1 compute tile of each type - 8 + 16 + 4 = 28.
Edit: looks like I read the sources incorrectly, my bad.

The 100MB bLLC might be the real headliner here.
 
The math is not mathing here.
Up to 52-core is the Nova Lake-HX/S with 2 compute tiles of each type plus 4 LPE ones - 8*2 + 16*2 + 4 = 52.
Nova Lake-AX is 1 compute tile of each type - 8 + 16 + 4 = 28.

The 100MB bLLC might be the real headliner here.
Yeah two of each. 2×(8+16)+4LPE on a separate island.
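The tile math being argued over can be checked in a few lines (figures are the rumored ones from the article, not official specs):

```python
# Rumored Nova Lake-AX core counts per the article:
# two compute tiles plus a separate low-power island.
P_CORES_PER_TILE = 8    # "Coyote Cove" P-cores
E_CORES_PER_TILE = 16   # "Arctic Wolf" E-cores
COMPUTE_TILES = 2
LPE_CORES = 4           # low-power island, not on the compute tiles

total = COMPUTE_TILES * (P_CORES_PER_TILE + E_CORES_PER_TILE) + LPE_CORES
print(total)  # 52
```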
 
@londiste

If I'm not mistaken the four LPE cores are not on the compute tile. Therefore the math adds up.
 
Isn't really fast dedicated/unified memory needed to make the most of these APUs?
That large slab of cache can go a long way in accelerating the iGPU; think Infinity Cache.
 
Isn't really fast dedicated/unified memory needed to make the most of these APUs?
Intel is currently the only CPU maker to support CUDIMMs and next-gen form factors like CAMM2.
 
That large slab of cache can go a long way in accelerating the iGPU; think Infinity Cache.
That depends on how it's connected.
For example, in the chiplet versions of Zen 4/5, the iGPU has no access to the L3 caches located on the CCDs, since the iGPU sits on the IO die; it has its own small L2 cache, no L3 (Infinity Cache), and goes back over Infinity Fabric for RAM access.
In Intel Arrow Lake the iGPU doesn't have access to the CPU's L3 cache either, since the L3 is on the compute tile; the iGPU sits on its own tile with an exclusive 4 MB L2 cache and goes back to the SoC tile, which hosts the RAM controllers.
As far as I know, even monolithic AMD APUs do not share L3 between the iGPU and CPU.
Cache coherency is a very complex problem, so I'm curious whether Intel is going to share this new big cache with the iGPU, but I remain sceptical. Maybe dynamic partitioning of this huge cache between iGPU and CPU depending on the workload?
 
That depends on how it's connected.
For example, in the chiplet versions of Zen 4/5, the iGPU has no access to the L3 caches located on the CCDs, since the iGPU sits on the IO die; it has its own small L2 cache, no L3 (Infinity Cache), and goes back over Infinity Fabric for RAM access.
In Intel Arrow Lake the iGPU doesn't have access to the CPU's L3 cache either, since the L3 is on the compute tile; the iGPU sits on its own tile with an exclusive 4 MB L2 cache and goes back to the SoC tile, which hosts the RAM controllers.
As far as I know, even monolithic AMD APUs do not share L3 between the iGPU and CPU.
Cache coherency is a very complex problem, so I'm curious whether Intel is going to share this new big cache with the iGPU, but I remain sceptical. Maybe dynamic partitioning of this huge cache between iGPU and CPU depending on the workload?
I understand your concern. But going by the article:
A separate cache-boosted passive tile could add over 100 MB of Last Level Cache (bLLC), which feeds both the CPU cores and the Celestial Xe3 iGPU
maybe Intel can pull a rabbit out of the hat?
 
If this is still dual-channel DDR5 (8000 MT/s), it's going to be bandwidth limited. Strix Halo uses double the lanes to feed its iGPU, which is more of a difference than cache alone seems able to make up for. Maybe there's some high-bandwidth scenario here too?
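Rough peak-bandwidth math behind this point (nominal figures only; real throughput varies with timings and efficiency):

```python
def peak_bandwidth_gbps(bus_width_bits, transfers_per_sec):
    """Peak DRAM bandwidth in GB/s: (bus width in bytes) x transfer rate."""
    return bus_width_bits / 8 * transfers_per_sec / 1e9

# Dual-channel DDR5-8000: 2 x 64-bit = 128-bit bus in total.
print(peak_bandwidth_gbps(128, 8000e6))  # 128.0 GB/s
# A Strix Halo-style 256-bit bus at the same transfer rate: double the lanes.
print(peak_bandwidth_gbps(256, 8000e6))  # 256.0 GB/s
```

Double the lanes at the same transfer rate is a flat 2x in peak bandwidth, which a cache can only partially hide for bandwidth-bound iGPU workloads.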
 
If this is still dual-channel DDR5 (8000 MT/s), it's going to be bandwidth limited. Strix Halo uses double the lanes to feed its iGPU, which is more of a difference than cache alone seems able to make up for. Maybe there's some high-bandwidth scenario here too?

I have been wondering about this too. Intel and AMD can only sit on dual channel for so long if core/thread counts keep increasing rapidly, unless RAM speeds ramp just as quickly.
 
Isn't really fast dedicated/unified memory needed to make the most of these APUs?
The RAM would need to be soldered to the motherboard; CUDIMM or CAMM2 in dual channel is still a bandwidth limitation for the iGPU, which is why Strix Halo, for example, has unified memory.
 
maybe Intel can pull a rabbit out of the hat?
They did try twice before, with the eDRAM-as-L4 designs on Broadwell and Skylake. The Ponte Vecchio compute accelerator also had a giant cache in the base tile, but it turned out to under-perform, especially in terms of latency; so much so that its successor was canceled.
AMD's 3D V-Cache design is terrific (barely increased latency for a significant size increase); let's hope Intel can match it in the client segment for the sake of competition. My concern is that Intel's design will be complex to manufacture, and possibly more expensive as a result.
 
These sound super cool and I'm all on board for seeing where they go with this. But every time I read something about "look at our crazy flagship SKU idea" and it's a year or more out... I gotta really wait and see. I think this would hit the performance/portability middle ground a lot of people are looking for, and now that Intel's graphics drivers are pretty well matured, I could see these doing really well.
I'd like to see these in a desktop form factor, just a cracked-out desktop APU that could be shoved into USFF systems (which will be 50% heatsink by volume).
 
What apps are actually using 50+ cores other than rendering/compiling? What is the point? Most of the time 1 to 8 cores will be fairly well loaded, and the rest will be wasted silicon. Is there some epeen award for laptops with the most wasted silicon I'm not aware of?
 
What apps are actually using 50+ cores other than rendering/compiling? What is the point? Most of the time 1 to 8 cores will be fairly well loaded, and the rest will be wasted silicon. Is there some epeen award for laptops with the most wasted silicon I'm not aware of?
The point is probably the best efficiency across various workloads, ranging from idle to single-threaded and then multi-threaded (moderate and heavy) work.

Having just one type of core means (very) good performance but likely poor efficiency, something like killing a fly with a sledgehammer.

But the operating system needs to be optimized to handle this hybrid arch properly, otherwise it will be a semi-flop.
 
which is why Strix Halo, for example, has unified memory.
All APUs/CPUs with iGPUs have "unified memory".
What sets Strix Halo apart from other x86 consumer CPUs is its 256-bit memory bus instead of a 128-bit one.

What apps are actually using 50+ cores other than rendering/compiling? What is the point? Most of the time 1 to 8 cores will be fairly well loaded, and the rest will be wasted silicon. Is there some epeen award for laptops with the most wasted silicon I'm not aware of?
I guess don't buy such a high-end laptop if you're not doing any of the tasks it could be useful for?
 
All APUs/CPUs with iGPUs have "unified memory".
What sets Strix Halo apart from other x86 consumer CPUs is its 256-bit memory bus instead of a 128-bit one.
If anything, Lunar Lake has memory packaged alongside the SoC. Strix Halo doesn't do the same, afaik.
 
If anything, Lunar Lake has memory packaged alongside the SoC. Strix Halo doesn't do the same, afaik.
Where you place the memory is not really relevant to whether it's "unified", fwiw. It does help with power consumption, and it may help with reaching higher clocks due to signal integrity, but it's not a direct way to get more performance.
In the end, even an LNL or an Apple Mx chip just ends up running a higher frequency on the JEDEC standard, nothing out of the ordinary.
 
Where you place the memory is not really relevant to whether it's "unified", fwiw. It does help with power consumption, and it may help with reaching higher clocks due to signal integrity, but it's not a direct way to get more performance.
In the end, even an LNL or an Apple Mx chip just ends up running a higher frequency on the JEDEC standard, nothing out of the ordinary.
Please, I wasn't disagreeing with you or anything, just pointing out something that could cause terminology confusion, especially where marketing gets involved.
 
Please, I wasn't disagreeing with you or anything, just pointing out something that could cause terminology confusion, especially where marketing gets involved.
No worries, sorry if it came across as some sort of disagreement; I was just adding extra info to your point.

Just to clarify what we both seem to be talking about, soldered/on-package != "unified".
And having it soldered on the PCB vs on-package has no direct performance implications.
 
The point is probably the best efficiency in various workloads. That is ranging from idle, to single-threaded and then multi-threaded (moderate and heavy) workloads.

Having just one type of core means (very) good performance but likely poor efficiency, something like killing a fly with a sledgehammer.

But the operating system needs to be optimized in order to handle properly this hybrid arch, otherwise it will be a semi-flop.
The operating system should be Linux with QEMU/KVM, utilizing the host CPU, SR-IOV, PCIe passthrough, virtio, and CPU affinity.
The guest operating systems are then multiple Windows/Linux guests, HEVC/AV1 streaming, etc.
Not one single OS.

That's how you use all those cores.
 
A Last Level Cache similar to AMD's Infinity Cache doesn't work like the usual L1/L2/L3 caches. It's cache attached to the memory controller, and memory operations are cached at that level. This means a CPU core or a GPU core will first try to find the data in its own caches and, if it's not there, will issue a normal memory operation to access the data; the memory controller then performs a lookup to see whether the data is in its cache.

This simplifies the cache topology, but it probably has higher latency than a usual large L3 cache like on Ryzen X3D. It will still probably be way better than nothing.
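That lookup order can be sketched as a toy model (class names and structure are purely illustrative, not Intel's or AMD's actual design):

```python
class Core:
    """A CPU/GPU core with its own private cache hierarchy (L1/L2...)."""
    def __init__(self, private_cache):
        self.private_cache = private_cache  # dict: address -> data

    def load(self, addr, memory_controller):
        # 1. The core first looks in its own caches.
        if addr in self.private_cache:
            return self.private_cache[addr]
        # 2. On a miss it issues a normal memory request; the memory-side
        #    cache check happens transparently at the controller.
        return memory_controller.read(addr)


class MemorySideLLC:
    """Memory-side cache attached to the memory controller (bLLC-style)."""
    def __init__(self, dram):
        self.cache = {}    # address -> data
        self.dram = dram

    def read(self, addr):
        # 3. The controller checks its LLC before going out to DRAM,
        #    filling the cache on a miss.
        if addr not in self.cache:
            self.cache[addr] = self.dram[addr]
        return self.cache[addr]


dram = {0x100: "payload"}
mc = MemorySideLLC(dram)
core = Core(private_cache={})
print(core.load(0x100, mc))  # "payload": private-cache miss, LLC fill, DRAM hit
```

Because the LLC sits behind the normal memory path, the cores need no knowledge of it, which is what keeps the coherency story simple compared with a shared L3.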
 
A Last Level Cache similar to AMD's Infinity Cache doesn't work like the usual L1/L2/L3 caches. It's cache attached to the memory controller, and memory operations are cached at that level. This means a CPU core or a GPU core will first try to find the data in its own caches and, if it's not there, will issue a normal memory operation to access the data; the memory controller then performs a lookup to see whether the data is in its cache.

This simplifies the cache topology, but it probably has higher latency than a usual large L3 cache like on Ryzen X3D. It will still probably be way better than nothing.
Look up Broadwell. It had a blazing-fast 4th-level cache sitting in front of the memory controller.

 