Thursday, August 19th 2021

Intel Xe HPG Graphics Architecture and Arc "Alchemist" GPU Detailed

It's happening: Intel is taking a very pointy stab at the AAA gaming graphics market, taking the fight to NVIDIA GeForce and AMD Radeon. The Arc "Alchemist" discrete GPU implements the Xe HPG (high performance gaming) graphics architecture, and offers full DirectX 12 Ultimate compatibility. It also offers contemporary features gamers want, such as XeSS, an AI-supersampling feature rivaling DLSS and FSR. There's a lot more to the Xe HPG architecture than a simple scale-up from the Xe LP-based iGPUs found in today's "Tiger Lake" processors.

Just like Compute Units on AMD GPUs, and Streaming Multiprocessors on NVIDIA, Intel designed a scalable hierarchical compute hardware structure for Xe HPG. It begins with the Xe-core, an indivisible compute building block that contains 16 each of 256-bit vector engines and 1024-bit matrix engines, combined with basic load/store hardware and an L1 cache. The vector engine here is interchangeable with the execution unit (EU), so the Xe-core contains 16 EUs. The Render Slice is a collection of four Xe-cores, four Raytracing Units, and other common fixed-function hardware that includes the geometry pipeline, rasterization pipeline, samplers, and pixel-backends. The Raytracing Units contain fixed-function hardware for bounding-box intersection, ray traversal, and triangle intersection.
Moving a level up from the Render Slice, we see a Global Dispatch processor, and the GPU's memory fabric, which begins with an L2 cache. This is where Intel can scale up its GPUs. The 6 nm "Alchemist" silicon features eight Render Slices sharing the memory subsystem and Global Dispatch. Intel can carve out variants by toggling entire Render Slices, or perhaps even individual Xe-cores. With 16 EUs per Xe-core, 4 Xe-cores per Render Slice, and 8 Render Slices, we arrive at 512 execution units; at eight FP32 lanes per 256-bit vector engine, that works out to 4,096 programmable shaders.
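The unit counts above follow directly from the hierarchy. Here is a minimal sketch of that arithmetic, assuming each 256-bit vector engine handles eight 32-bit (FP32) lanes, which is what turns 512 EUs into 4,096 shaders. The function name and the four-slice variant are hypothetical, used only to illustrate how disabling Render Slices would scale the totals; Intel has not announced such a configuration.

```python
# Hierarchy figures as stated in the article.
VECTOR_ENGINES_PER_XE_CORE = 16  # vector engines, i.e. execution units (EUs)
XE_CORES_PER_RENDER_SLICE = 4
RT_UNITS_PER_RENDER_SLICE = 4
RENDER_SLICES_FULL_CHIP = 8

# Assumption: one 256-bit vector engine = 256 / 32 = 8 FP32 lanes ("shaders").
VECTOR_WIDTH_BITS = 256
FP32_BITS = 32


def xe_hpg_totals(render_slices=RENDER_SLICES_FULL_CHIP):
    """Return unit totals for a chip with the given number of Render Slices."""
    xe_cores = render_slices * XE_CORES_PER_RENDER_SLICE
    execution_units = xe_cores * VECTOR_ENGINES_PER_XE_CORE
    lanes_per_eu = VECTOR_WIDTH_BITS // FP32_BITS
    return {
        "xe_cores": xe_cores,
        "execution_units": execution_units,
        "shaders": execution_units * lanes_per_eu,
        "rt_units": render_slices * RT_UNITS_PER_RENDER_SLICE,
    }


if __name__ == "__main__":
    # 8 slices is the full chip from the article; 4 is a hypothetical cut-down part.
    for slices in (8, 4):
        print(slices, "Render Slices:", xe_hpg_totals(slices))
```

For the full eight-slice chip this gives 32 Xe-cores, 512 EUs, 4,096 shaders, and 32 Raytracing Units.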
Given that Xe HPG is being designed for the TSMC N6 (6 nm) silicon fabrication node, Intel claims a 50% performance/Watt gain over Xe LP solutions built on Intel's own 10 nm SuperFin nodes, such as the DG1 Iris Xe MAX. As a performance discrete GPU, "Alchemist" enjoys a much larger power budget, and hence operates at much higher frequencies for the available hardware.
Although not mentioned in the Intel presentation, it's been extensively reported that "Alchemist" (or DG2) features a 256-bit wide GDDR6 memory interface. The company is yet to determine memory size, but given the memory speeds available in the market (14 Gbps, 16 Gbps, and 18 Gbps), the memory bandwidth can end up anywhere between 448 GB/s and 576 GB/s.
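The bandwidth range quoted above is the bus width multiplied by the per-pin data rate, divided by eight to convert bits to bytes. Here is a short sketch of that calculation; the function name is illustrative, and the 256-bit bus is the reported figure rather than a confirmed one.

```python
def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s.

    bus_width_bits: number of data pins on the memory interface.
    data_rate_gbps: per-pin data rate in gigabits per second.
    """
    return bus_width_bits * data_rate_gbps / 8  # 8 bits per byte


if __name__ == "__main__":
    BUS_WIDTH_BITS = 256  # reported for "Alchemist", not confirmed by Intel
    for rate in (14, 16, 18):
        bandwidth = peak_bandwidth_gb_s(BUS_WIDTH_BITS, rate)
        print(f"{rate} Gbps -> {bandwidth:.0f} GB/s")
```

The three GDDR6 speeds give 448, 512, and 576 GB/s respectively.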

Armed with as many as 512x 1024-bit Matrix cores backed by Xe Matrix extensions, "Alchemist" is expected to be an AI processing powerhouse, with Intel leveraging them both for the XeSS performance enhancement feature and for other real-time rendering applications, such as de-noising for the raytracing pipeline.
Intel Arc "Alchemist" is expected to see a market release in Q1 2022. The company has a roadmap with at least three successors: the Xe2 "Battlemage," Xe3 "Celestial," and XeNext "Druid." With no time-scale mentioned in the slide, we don't know if Intel plans to release one architecture every year.

14 Comments on Intel Xe HPG Graphics Architecture and Arc "Alchemist" GPU Detailed

#1
MentalAcetylide
I must be missing something. What all rendering engines would be able to take advantage of these cards? I know Iray is already off the table since that is specifically for NVidia branded cards.
#2
trsttte
Off topic, but can you guys maybe do a summary of what's relevant from the recent Intel presentations? So many articles and so many parallel discussions, it's a bit hard to keep up.
#3
Arkz
MentalAcetylide said: "I must be missing something. What all rendering engines would be able to take advantage of these cards? I know Iray is already off the table since that is specifically for NVidia branded cards."
Anything that can use DirectX/OpenGL/Vulkan, I would assume.
#4
hardcore_gamer
Looking at the raw specs, an 8 slice design can theoretically perform rasterization close to an RTX 3070 / Radeon 6800. Hopefully, the drivers will be good enough.
#5
Tom Yum
Quote from the article: "Given that Xe HPG is being designed for the TSMC N6 (6 nm) silicon fabrication node, Intel claims a 50% performance/Watt gain over Xe LP solutions built on Intel's own 10 nm SuperFin nodes, such as the DG1 Iris Xe MAX."
But I thought Intel 10nm was the same as TSMC 7nm, and TSMC certainly isn't claiming the 6nm optical shrink gains anything like 50% performance/watt uplift. Have Intel and various fanboys been lying all this time?
#6
R-T-B
Tom Yum said: "But I thought Intel 10nm was the same as TSMC 7nm, and TSMC certainly isn't claiming the 6nm optical shrink gains anything like 50% performance/watt uplift. Have Intel and various fanboys been lying all this time?"
Yes and no.

You are discovering, unsurprisingly, that node names are incredibly misleading.

It is universal, sadly.
#7
Pixrazor
Shading Units: 4096
TMUs: 256
ROPs: 128
Compute Units / SM Count / Xe-cores: 32
RT Cores: 32
L1 Cache: ? KB
L2 Cache: ? MB
Memory Size: ? GB
Memory Type: GDDR6/X ?
Memory Bus: 256 bit ?
Bandwidth: 448 ~ 576 GB/s
- Spec-wise, it's slightly less than an RX 6800 XT, at 4096 vs 4608 shaders (or 256 vs 288 TMUs).
- It will all depend on the GPU clocks TSMC N6 will achieve. Intel is claiming 1.5x more than the Xe LP discrete, but we don't know if that's 1.5x of 1.1 GHz or of 1.5 GHz...
- As for raytracing, it only has 32 RT cores, significantly fewer than the 82 on Ampere and 80 on RDNA 2, but we don't know the performance of its RT cores yet.
#8
ZoneDymo
R-T-B said: "Yes and no. You are discovering, unsurprisingly, that node names are incredibly misleading. It is universal, sadly."
In this case it's not so much about node names as it is about claims made. As they said, it was Intel themselves who claimed the transistor count of their 10nm is on par with TSMC 7nm, hence the whole renaming scheme of "Intel 7".
But now it's claimed to have a 50% uplift going to TSMC 6nm, which... well, TSMC never claimed would be the improvement from such a relatively minor shrink.
#9
R-T-B
Performance per watt has a lot to do with core design too. But it's all a bunch of marketing regardless until the chip is out.
#10
Splinterdog
Will Intel allow board partners such as Asus, Sapphire etc to use their GPU chips? (Aka AIBs, I believe)
#11
Upgrayedd
Wonder if they'll do anything like AMD did with their old APUs and 7770/7750 gpus.
#12
AusWolf
Such a clean architecture by the looks of it. Everything is some power of 2. The OCD in me likes it. Whether it will be any good in real life, we'll see.
#13
Vayra86
hardcore_gamer said: "Looking at the raw specs, an 8 slice design can theoretically perform rasterization close to an RTX 3070 / Radeon 6800. Hopefully, the drivers will be good enough."
To be fair, that would be a fine performance level for Intel's debut. Enough for everyone, not hitting the top just yet.

But the real questions are:
- price
- form factors
- noise/heat/TDP
- feature set
#14
pavle
I hope their "HiZ" unit is good at hidden surface removal at or near NVIDIA level with their Gigapixel derived tech; if so the chip has lots of potential (it won't be drawing much of what won't ever be seen).
Copyright © 2004-2021 www.techpowerup.com. All rights reserved.
All trademarks used are properties of their respective owners.