
AMD Ryzen Embedded V3000 SoCs Based on 6nm Node, Zen 3 Microarchitecture

btarunr

Editor & Senior Moderator
AMD's next-generation Ryzen Embedded V3000 system-on-chips aren't simply "Cezanne" dies sitting on BGA packages, but are based on brand-new silicon, according to Patrick Schur, a reliable leaker. The die will be built on the more advanced 6 nm silicon fabrication node, while still implementing the current "Zen 3" microarchitecture. Several things set it apart from the current-generation APU silicon, making it more relevant to the applications the Ryzen Embedded processor family was originally built for.

Built in the FP7r2 BGA package, the V3000 silicon features an 8-core/16-thread CPU based on the "Zen 3" microarchitecture. There is also an integrated GPU based on the RDNA2 graphics architecture with up to 12 CUs, a dual-channel DDR5 memory interface, a 20-lane PCI-Express 4.0 root complex with up to 8 lanes put out for PEG, two USB4 ports, and two 10 GbE PHYs. AMD could design at least three SKUs based on this silicon, spanning TDP bands of 15-30 W and 35-54 W.



 
So Zen 3, RDNA 2, PCIe 4.0, USB 4, and dual channel DDR5? That's kinda like the perfect version of the current Ryzen mobile chips.
 
Can we get this on laptops too please? lol
 
With 2 DIMMs in dual-channel mode (2 × 2 × 32-bit subchannels) running at 6400 MT/s, that is 102 GB/s of bandwidth. An 8-core Zen 3 CPU already performs well with half that bandwidth, so that leaves more than enough for the GPU. The GPU won't be bothered by any extra latency the memory adds, as GPUs are already built to hide it.

That would be more bandwidth than a GeForce GT 1030. This should be good enough for light 1080p gaming at medium detail, if not more, since RDNA2 is more modern than Pascal. There are no rumours of Infinity Cache, but RDNA2 could still have more L1 and L2 depending on the final layout.
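The arithmetic behind the 102 GB/s figure quoted above can be sketched as follows. This is a back-of-the-envelope calculation, assuming DDR5's two 32-bit subchannels per DIMM (a 128-bit bus across two DIMMs) and, for the comparison, the commonly cited 64-bit / 6008 MT/s configuration of the GDDR5 GT 1030:

```python
def ddr_bandwidth_gbs(mt_per_s: int, bus_width_bits: int) -> float:
    """Peak theoretical memory bandwidth in GB/s (decimal gigabytes).

    Each transfer moves bus_width_bits / 8 bytes, mt_per_s million
    times per second.
    """
    return mt_per_s * 1e6 * (bus_width_bits / 8) / 1e9

# Dual-channel DDR5-6400: 2 DIMMs x 2 x 32-bit subchannels = 128-bit bus
ddr5 = ddr_bandwidth_gbs(6400, 128)
print(f"DDR5-6400 dual channel: {ddr5:.1f} GB/s")   # 102.4 GB/s

# GeForce GT 1030 (GDDR5 model): 64-bit bus at 6008 MT/s
gt1030 = ddr_bandwidth_gbs(6008, 64)
print(f"GT 1030 GDDR5:          {gt1030:.1f} GB/s")  # ~48.1 GB/s
```

As the numbers show, the dual-channel DDR5-6400 setup would have roughly double the bandwidth of the GDDR5 GT 1030.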


Can't wait to see the final benchmarks. The desktop 5600G and 5700G already deliver impressive performance (for integrated graphics) with Vega and DDR4. This could be a game changer.
 
It's still hardly possible to buy a V2000 board.
 
Hm, that is definitely interesting. Hope most of this (except for the 10G PHYs I guess) carries over to next-gen mobile and desktop APUs. Would make for some very, very compelling products.
 
Can we get this on laptops too please? lol
That will be Rembrandt in laptops, I think.
With 2 DIMMs in dual-channel mode (2 × 2 × 32-bit subchannels) running at 6400 MT/s, that is 102 GB/s of bandwidth. An 8-core Zen 3 CPU already performs well with half that bandwidth, so that leaves more than enough for the GPU. The GPU won't be bothered by any extra latency the memory adds, as GPUs are already built to hide it.
Would still be lower latency than GDDR, which can be a small benefit, though bandwidth is more important.
That would be more bandwidth than a GeForce GT 1030. This should be good enough for light 1080p gaming at medium detail, if not more, since RDNA2 is more modern than Pascal. There are no rumours of Infinity Cache, but RDNA2 could still have more L1 and L2 depending on the final layout.
RDNA1 added a new level of cache: an L1 that's shared within a Shader Array (10 CUs in the 5700 XT). This is kept the same in RDNA2. The old L1 in Vega/GCN is now just L0.

For Rembrandt it appears they are doing 2 Shader Arrays of 6 CUs each (3 DCUs) and 2 MB of L2, up from the 1 MB in the Vega iGPUs. So a decent increase in the caches.

Nothing too big of a change though.

It would be really impressive if they could do what Apple does and share a cache between the CPU and iGPU (Apple calls it the SLC, or system-level cache, though other blocks like the neural engine also use it), even more so if they combined it with their vertical cache. It would be completely revolutionary in the mobile and integrated-GPU space.
But that's just a pipe dream.
 
It would be really impressive if they could do what Apple does and share a cache between the CPU and iGPU (Apple calls it the SLC, or system-level cache, though other blocks like the neural engine also use it), even more so if they combined it with their vertical cache. It would be completely revolutionary in the mobile and integrated-GPU space.
But that's just a pipe dream.
That's what a mobile Infinity Cache would likely be, no? A large dedicated cache doesn't make much sense for a mobile iGPU considering die space costs, but if the CPU (and other blocks) could also share it? That would be easier to defend. There's a big question of size though, as even a 16MB cache would take up significant die area.

I don't think we're likely to see stacked cache in mobile yet though. That would necessitate a huge amount of "structural silicon" (i.e. spacers around the cache die) considering the size of APUs compared to a CCX, and would no doubt have some detrimental effects for already thermally constrained mobile designs.
 
That's what a mobile Infinity Cache would likely be, no?
Maybe, or it could just be a smaller L3 exclusive to the iGPU. Navi 24 is rumoured to have 16 MB of Infinity Cache, so even a small 8 MB for an iGPU would already be pretty helpful.

A large dedicated cache doesn't make much sense for a mobile iGPU considering die space costs, but if the CPU (and other blocks) could also share it? That would be easier to defend. There's a big question of size though, as even a 16 MB cache would take up significant die area.
It all depends on the node, the type of SRAM chosen, and AMD's priorities. If you look at Apple (again), you will see that they use a ton of cache: a 16 MB SLC (system-level cache, basically a system-wide L3), 16 MB of L2 (12 for the performance and 4 for the efficiency cores), and even enormous L1 caches (320 KB for performance and 192 KB for efficiency cores).

So I think AMD could easily add something like 8 MB as Infinity Cache. Yes, Apple is on 5 nm, but SRAM doesn't scale the way logic does; besides, AMD is going to 6 nm anyway (and has a bigger die than Cezanne despite the increased density).
I don't think we're likely to see stacked cache in mobile yet though. That would necessitate a huge amount of "structural silicon" (i.e. spacers around the cache die) considering the size of APUs compared to a CCX, and would no doubt have some detrimental effects for already thermally constrained mobile designs.
Probably not a problem, since they grind those dies down so that they all end up at the same height; plus, the structural silicon might even help spread more of the heat.

In addition, it could actually reduce power consumption, since moving data around consumes far more energy than processing it. If you remember Broadwell, it was pretty good on energy consumption and its cache could even turn off under some circumstances.

This would really help with mobile SKUs. But honestly, I doubt AMD will do it in the next 2-3 years.
 