AMD Ryzen 9 9950X3D and 9900X3D to Feature 3D V-cache on Both CCD Chiplets

RogueSix · Sep 27, 2024

Daven said:
I guess you’ve never heard of thread scheduling and how hit or miss it is. This solves that problem. Even better if clock speeds can also be higher.

It is quite the contrary to what you say. Scheduling will now be even more important to the point it becomes SUPER-DUPER-MEGA-EXTRA-important with cache on both CCDs. For this dual cache setup to work correctly, games/apps (via the scheduler) always need to request the cached data from the "correct" cache on the "correct" CCD or else you will suffer latencies from hell if/when data needs to be fetched from the cache across the CCDs because e.g. Core 3 requests data that was previously stored to the cache by Core 14 on the other CCD. Can't have a scenario like that. Ever.

So, both the scheduler and the CPU always need to "know" exactly "who" (which core) cached something (what) and where it was cached to avoid the dreaded inter-CCD and inter-cache latencies. This is definitely going to be a challenge and very complex on the level of correct scheduling and correct CCD assignment etc.

AMD does not exactly have the best track record when it comes to these scheduling and core assignment shenanigans so I would be quite surprised if they get this to work flawlessly out of the gate.
Personally, I have avoided multi CCD CPUs like the plague due to the Xbox GameBar and 'GameMode On' requirements (I have a PC and not a console, you muppets). It will be interesting to see if the GameBar requirement will be dropped now(?) since core parking will no longer be required.

We'll have to wait and see how well this is gonna work in practice. I would expect some growing pains, to say the least...

JWNoctis · Sep 27, 2024

What's the reliability of this info? Is this a last-minute change? Somehow I'm imagining their testing and validation teams working three shifts. :twitch:

None of it does anything for cross-CCD scheduling problems - Ninja'd @RogueSix

wNotyarD · Sep 27, 2024

persondb said:
Did they get around the clock speed regressions from 3D cache? That was essentially why they haven't done dual 3D cache.

A lot of applications suffered more from the clock speed regression than the benefits of more cache.

I think AMD did indeed correct it, if they're able to promise overclocking on the new X3D SKUs.

AusWolf · Sep 27, 2024

RogueSix said:
It is quite the contrary to what you say. Scheduling will now be even more important to the point it becomes SUPER-DUPER-MEGA-EXTRA-important with cache on both CCDs. For this dual cache setup to work correctly, games/apps always need to request the cached data from the "correct" cache on the "correct" CCD or else you will suffer latencies from hell if/when data needs to be fetched from the cache across the CCDs because e.g. Core 3 requests data that was previously stored to the cache by Core 14 on the other CCD. Can't have a scenario like that. Ever.

So, both the scheduler and the CPU always need to "know" exactly "who" (which core) cached something (what) and where it was cached to avoid the dreaded inter-CCD and inter-cache latencies. This is definitely going to be a challenge and very complex on the level of correct scheduling and correct CCD assignment etc.

AMD does not exactly have the best track record when it comes to these scheduling and core assignment shenanigans so I would be quite surprised if they get this to work flawlessly out of the gate.
Personally, I have avoided multi CCD CPUs like the plague due to the Xbox GameBar and 'GameMode On' requirements (I have a PC and not a console, you muppets). It will be interesting to see if the GameBar requirement will be dropped now(?) since core parking will no longer be required.

We'll have to wait and see how well this is gonna work in practice. I would expect some growing pains, to say the least...

Why is it a challenge, though? Shouldn't it be as simple as assigning CCD 1 to any foreground program that requires 8 cores + 96 MB (edited) cache or less, while background tasks get CCD 2, and anything that needs more than the above is spread out across the two CCDs?
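To make that proposed rule concrete, here is a minimal Python sketch of it. Everything in it is hypothetical: the `Task` fields, the `assign_ccds` function, and the 8-cores-per-CCD constant are illustration only, and `working_set_mb` is assumed to be known up front, which a real OS scheduler generally does not have.

```python
from dataclasses import dataclass

CORES_PER_CCD = 8    # assumption: 8 cores per CCD, as on a 16-core part
L3_PER_CCD_MB = 96   # figure taken from the post above


@dataclass
class Task:
    name: str
    threads: int          # how many cores the program wants
    working_set_mb: int   # assumed known; real schedulers have to guess
    foreground: bool


def assign_ccds(task: Task) -> tuple:
    """Return the CCD numbers (1-based, as in the post) this toy policy picks."""
    fits_one_ccd = (
        task.threads <= CORES_PER_CCD
        and task.working_set_mb <= L3_PER_CCD_MB
    )
    if not fits_one_ccd:
        # Too big for one CCD: spread across both.
        return (1, 2)
    # Small enough: foreground gets CCD 1, background gets CCD 2.
    return (1,) if task.foreground else (2,)


if __name__ == "__main__":
    for t in (
        Task("game", 8, 64, True),
        Task("backup", 2, 10, False),
        Task("render", 16, 32, True),
    ):
        print(t.name, "->", assign_ccds(t))
```

The rule itself is only a few lines. The hard part in practice is everything the sketch takes as given: knowing which program is "foreground", how many threads it really needs, and how much cache it benefits from.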

usiname · Sep 27, 2024

RogueSix said:
It is quite the contrary to what you say. Scheduling will now be even more important to the point it becomes SUPER-DUPER-MEGA-EXTRA-important with cache on both CCDs. For this dual cache setup to work correctly, games/apps always need to request the cached data from the "correct" cache on the "correct" CCD or else you will suffer latencies from hell if/when data needs to be fetched from the cache across the CCDs because e.g. Core 3 requests data that was previously stored to the cache by Core 14 on the other CCD. Can't have a scenario like that. Ever.

So, both the scheduler and the CPU always need to "know" exactly "who" (which core) cached something (what) and where it was cached to avoid the dreaded inter-CCD and inter-cache latencies. This is definitely going to be a challenge and very complex on the level of correct scheduling and correct CCD assignment etc.

AMD does not exactly have the best track record when it comes to these scheduling and core assignment shenanigans so I would be quite surprised if they get this to work flawlessly out of the gate.
Personally, I have avoided multi CCD CPUs like the plague due to the Xbox GameBar and 'GameMode On' requirements (I have a PC and not a console, you muppets). It will be interesting to see if the GameBar requirement will be dropped now(?) since core parking will no longer be required.

We'll have to wait and see how well this is gonna work in practice. I would expect some growing pains, to say the least...

If the cores can get data from the wrong L3 cache, shouldn't this problem exist now? The L3 is spread across two CCDs even now.

Makaveli · Sep 27, 2024

So I guess they worked around the issue of Dual CCD traffic killing gains like it did for the 5900X prototype that had v-cache on both.

So the question that remains is whether clock speed is affected, or did they solve that problem too?

RogueSix · Sep 27, 2024

usiname said:
If the cores can get data from the wrong L3 cache, shouldn't this problem exist now? The L3 is spread across two CCDs even now.

No. Because currently the solution is outright core parking. One CCD (the one w/o the 3D cache) gets put to sleep when you are gaming on a multi CCD X3D CPU.

oxrufiioxo · Sep 27, 2024

phanbuey said:
I actually think with the inter-CCD latency at ~70ns this isn't far from the truth.

I'd still prefer this setup over the 7950X3D, though. You can always tie games to the stronger CCD either way, and my guess is that even when cores do jump CCDs it won't be a large hit. Oddly, when I tie games to all 16 cores I see higher average framerates than when I tie them to just the non-cache CCD in games that don't behave, so this should still be a bit better than that.

I'd still like both options available, just because I want to see the actual difference, not in games that already do well on the current 7950X3D but in games that don't behave without user intervention.

So I hope AMD releases both options, with the single CCD option being slightly cheaper. For academic reasons, of course, lol.
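For anyone wanting to try the "tie the game to one CCD" approach by hand, here is a rough Python sketch for Linux. The helper names `ccd_cpu_set` and `pin_to_ccd` are made up for illustration. The CPU numbering is an assumption as well (physical cores 0-15 first, SMT siblings at +16), so check the real layout with `lscpu -e` before relying on it. `os.sched_setaffinity` is a real standard-library call, but it is only available on some platforms, Linux among them.

```python
import os


def ccd_cpu_set(ccd, cores_per_ccd=8, total_cores=16, smt=True):
    """Logical CPU ids for one CCD (0-based), under the assumed numbering."""
    first = ccd * cores_per_ccd
    cpus = set(range(first, first + cores_per_ccd))
    if smt:
        # Assumption: the SMT sibling of core N is logical CPU N + total_cores.
        cpus |= {c + total_cores for c in cpus}
    return cpus


def pin_to_ccd(pid, ccd):
    """Restrict a process to one CCD. A pid of 0 means the calling process."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("os.sched_setaffinity is not available here")
    os.sched_setaffinity(pid, ccd_cpu_set(ccd))


if __name__ == "__main__":
    print(sorted(ccd_cpu_set(0)))
    print(sorted(ccd_cpu_set(1, smt=False)))
```

On Windows the same idea is usually done through Task Manager's affinity setting or a tool like Process Lasso rather than this call.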

AnotherReader · Sep 27, 2024

Makaveli said:
So I guess they worked around the issue of Dual CCD traffic killing gains like it did for the 5900X prototype that had v-cache on both.

So the question that remains is whether clock speed is affected, or did they solve that problem too?

I'm not sure if inter die requests were ever a factor. If they were, then EPYC X would suffer much more than a hypothetical 5950X3D with 192 MB of L3 cache.

SIGSEGV · Sep 27, 2024

Daven said:
My work is assigning me CAD and rendering work. So I guess I’m upgrading my 7700x to a 9950x3d on my gaming rig. Thread assignment won’t be an issue if both CCDs have 3D cache.

Adobe Dimension and SolidWorks need some fast PC specs. I have a 7900 XT, so the GPU spec is already there.

Indeed, 3D V-Cache was initially intended for server/workstation environments (on both CCDs).
I don't know if I will sell my 9950X.
Let's wait for it in January 2025 :laugh:

Darmok N Jalad · Sep 27, 2024

Old enough to remember when this amount of L3 was a LOT of system RAM to have--more than the PC I had in college about 25 years ago. These CPUs could theoretically run NT4 and all associated programs without any system RAM, provided running RAMless were even possible. Just think of the FPS and load times I'd get in Quake II.

phanbuey · Sep 27, 2024

Darmok N Jalad said:
Old enough to remember when this amount of L3 was a LOT of system RAM to have--more than the PC I had in college about 25 years ago. These CPUs could theoretically run NT4 and all associated programs without any system RAM, provided running RAMless were even possible. Just think of the FPS and load times I'd get in Quake II.

500hz true gaming.

Darmok N Jalad · Sep 27, 2024

phanbuey said:
500hz true gaming.

How'd you know? It was a great day when I went from a P166MMX to the K6-2 500!

AusWolf · Sep 27, 2024

AnotherReader said:
I'm not sure if inter die requests were ever a factor. If they were, then EPYC X would suffer much more than a hypothetical 5950X3D with 192 MB of L3 cache.

I always said that if inter-CCD communication is an issue, then you don't need more than 8-cores. And if you do, then the benefits of having more cores outweigh the detriment of inter-CCD latency anyway.

A Computer Guy · Sep 27, 2024

Yeah, but there might still be a glaring disappointment: are we still getting 1 good CCD and 1 meh CCD? I suppose core uniformity is a big plus now, as is the lower TDP required by the X3D cache.

Octavean · Sep 27, 2024

Since I went with the Ryzen 3950X and later the 7950X (when released) because I wanted/needed more cores, I was willing to forgo the benefits of X3D. I should point out that I didn't want to deal with core parking. The 9950X3D seems like a no-compromise part and will likely be priced aggressively. I suspect that the clock speeds will be slightly reduced, and the stacking usually means slightly less heat tolerance, but this is minor. It still probably won't be worth it for me to upgrade to this gen, but AMD is setting a precedent that I can get behind. Next gen I'll likely wait for the 16-core, 32-thread X3D offering, the successor to the 9950X3D.

TumbleGeorge · Sep 27, 2024

More cache more misses. Not only more hits.

A Computer Guy · Sep 27, 2024

TumbleGeorge said:
More cache more misses. Not only more hits.

In either case 100% hits are still being taken to the wallet or the mattress depending on where you are from. Prepare to fork out the cash for Dual X3D is my prediction.

ir_cow · Sep 27, 2024

I know what I'm buying next.

rv8000 · Sep 27, 2024

It’s probably just me, but I see this as a loss if X3D parts are frequency limited again. It would’ve been more beneficial if they had implemented some sort of hardware scheduler as opposed to just dropping in another 3D cache chiplet.

dgianstefani · Sep 27, 2024

rv8000 said:
It’s probably just me, but I see this as a loss if X3D parts are frequency limited again. It would’ve been more beneficial if they had implemented some sort of hardware scheduler as opposed to just dropping in another 3D cache chiplet.

Zen 6.

Carillon · Sep 27, 2024

Consoles have similar latency issues with their double CCX design, I'm sure that ways to mitigate this latency penalty already exist there.
If we are lucky, we could see them come to PC in 5 to 10 years.

ThomasK · Sep 27, 2024

usiname said:
Finally, now the people will see that the 3D cache on both dies is useless and will stop crying for this

There are plenty of benefits to be had from the increased L3 cache on all CCDs on the consumer side, let alone in the data center, where it's already common.

Cloudflare switches to EPYC 9684X Genoa-X CPUs with 3D V-Cache — 145% faster than previous-gen Milan servers | Tom's Hardware (tomshardware.com)

AusWolf · Sep 27, 2024

rv8000 said:
It’s probably just me, but I see this as a loss if X3D parts are frequency limited again. It would’ve been more beneficial if they had implemented some sort of hardware scheduler as opposed to just dropping in another 3D cache chiplet.

It's the usual "X3D if you game, normal if you don't" narrative again. Personally, I don't mind. Those higher clocks don't give you that much more performance anyway - only more power consumed and heat.

Hodor · Sep 27, 2024

Darmok N Jalad said:
Old enough to remember when this amount of L3 was a LOT of system RAM to have--more than the PC I had in college about 25 years ago. These CPUs could theoretically run NT4 and all associated programs without any system RAM, provided running RAMless were even possible. Just think of the FPS and load times I'd get in Quake II.

My 7950x3d gets 940fps on crusher.dm2
