
AMD Could Solve Memory Bottlenecks of its MCM CPUs by Disintegrating the Northbridge

Wouldn't it be much better to just make the memory controller modular? Just thinking out loud.

I'm just saying this because I'm not sure if more than one memory controller is beneficial at all when you have a multi-CPU setup...

I know... it's a bit out of the box, but yeah.
 
Wouldn't it be much better to just make the memory controller modular? Just thinking out loud.

I'm just saying this because I'm not sure if more than one memory controller is beneficial at all when you have a multi-CPU setup...
What do you mean by "modular"?
For reference, check out the last paragraph here for an overview of the current implementation: https://en.wikichip.org/wiki/amd/infinity_fabric
 
What do you mean by "modular"?
For reference, check out the last paragraph here for an overview of the current implementation: https://en.wikichip.org/wiki/amd/infinity_fabric

Ah... so memory controllers stack their performance/bandwidth.
Well... I thought it might be a better idea to just combine the memory controllers into one big die. One you can upgrade, the same way you can with CPUs.
 
Ah... so memory controllers stack their performance/bandwidth.
Well... I thought it might be a better idea to just combine the memory controllers into one big die. One you can upgrade, the same way you can with CPUs.
Once you run the traces out to the board / another socket, the latency goes through the roof.
 
Ah... so memory controllers stack their performance/bandwidth.
Well... I thought it might be a better idea to just combine the memory controllers into one big die. One you can upgrade, the same way you can with CPUs.
Well, you're on to something. That's how things worked before Athlon64 and Core: the memory controller was in the so-called northbridge, a standalone chip sitting on the motherboard. While obviously a more flexible design, it turns out it doesn't cut it anymore in modern systems.

Btw, welcome to TPU ;)
 
Once you run the traces out to the board / another socket, the latency goes through the roof.

Well, you're on to something. That's how things worked before Athlon64 and Core: the memory controller was in the so-called northbridge, a standalone chip sitting on the motherboard. While obviously a more flexible design, it turns out it doesn't cut it anymore in modern systems.

Btw, welcome to TPU ;)



Yeah, I know... but wouldn't it be much easier to reserve PCIe lanes this way?
I'm not saying that this is the solution; it's just that by thinking out of the box, one might find new ways to improve the product.

And thanks, bug
 
For everyone here saying this is a bad solution or that it will create more problems, here is some educational material for you: do you think the people working on these designs are ignorant or something? Obviously this new design will resolve many problems!
 
Imho, this type of connectivity between CCXs is only meant for the next EPYC and Threadripper. And for this type of usage it is excellent and ingenious indeed. For desktop Ryzens, my opinion is that they will just improve the already existing connectivity. It is more than enough. And with an 8C/16T CCX, most Ryzens will have just one CCX, which means no added latency from the IF.
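If anyone wants to check how the CCXs are laid out on their own chip, here's a minimal sketch (Linux only, and it assumes the usual sysfs cache layout where index3 is the L3): cores that report the same shared_cpu_list share an L3, and on Zen that means they sit in the same CCX.

/* Minimal sketch (Linux only): list which logical CPUs share an L3 slice.
 * Assumes the common sysfs layout where cache/index3 is the L3. On Zen,
 * cores that share an L3 belong to the same CCX.
 * Compile: gcc -O2 ccx_map.c */
#include <stdio.h>
#include <string.h>

int main(void) {
    char path[128], list[256];
    for (int cpu = 0; cpu < 256; cpu++) {
        snprintf(path, sizeof path,
                 "/sys/devices/system/cpu/cpu%d/cache/index3/shared_cpu_list",
                 cpu);
        FILE *f = fopen(path, "r");
        if (!f)
            break;                          /* no such CPU: we're done */
        if (fgets(list, sizeof list, f)) {
            list[strcspn(list, "\n")] = '\0';
            printf("cpu%-3d shares L3 with: %s\n", cpu, list);
        }
        fclose(f);
    }
    return 0;
}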
 
Imho, this type of connectivity between CCXs is only meant for the next EPYC and Threadripper. And for this type of usage it is excellent and ingenious indeed. For desktop Ryzens, my opinion is that they will just improve the already existing connectivity. It is more than enough. And with an 8C/16T CCX, most Ryzens will have just one CCX, which means no added latency from the IF.
Ideally, AMD will want a design that scales across product lines. Otherwise they have to keep redesigning the CCX. But there's no telling which solution they'll choose.
 
Ideally, AMD will want a design that scales across product lines. Otherwise they have to keep redesigning the CCX. But there's no telling which solution they'll choose.
Since EPYC and TR are already made separately from desktop Ryzen CPUs and they are making money from that, it is very viable to continue doing that, especially when they will raise the game by adding many more cores and decreasing latency for the market segments where they are needed most.
 
Since EPYC and TR are already made separately from desktop Ryzen CPUs and they are making money from that, it is very viable to continue doing that, especially when they will raise the game by adding many more cores and decreasing latency for the market segments where they are needed most.
What do you mean "separately"? Aren't they all just the same CCXs in different layouts?
 
Hmmmmm, maybe this gives some MERIT to the HardOCP forum post which details that Zen 2 has some "newish" IF implementation...
 
Hmmmmm, maybe this gives some MERIT to the HardOCP forum post which details that Zen 2 has some "newish" IF implementation...
Believe it or not, IF is the Achilles' heel for Zen. It was bound to be reworked in future incarnations.
 
Fabric solutions always create more problems than they solve once things get this complex; the ring bus approach may be simpler and offer more throughput and lower latency if they can get it wide or fast enough.

AMD brought most of this on themselves: cache and memory latency has never truly been solved across Zen, Bulldozer, and other designs, and "add more cores" has always been the answer. They need to build a memory controller for an 8-core part that can be scaled up to these insane core and thread counts, where a little added latency in a server workload can be masked by software that schedules threads with the penalties in mind.
 
But how does this affect minimum latency? Right now, with the current approach, there is a somewhat wide delta between min and max latency depending on which core is communicating with what. When an app is running locally on a CCX, the latency is excellent; when both CCXs are needed, the latency increases slightly; and lastly, when one workload needs to connect to other chips on the module, latency maxes out. This central northbridge might lower that max latency and make the gap between min and max much smaller; however, from a high level, one can expect min latency to take a big hit and increase drastically.
I don't think core-to-core communication between threads is the problem, but rather memory and cache accesses. The impact is greater than just taking the extra jump through the other CCX, it also "borrows" memory bandwidth from that CCX, which can lead to additional bottlenecks.

Most applications are very sensitive to memory latency, so redesigning this approach in future Zen iterations seems like a very good idea. Keeping cache and memory controllers as efficient and low latency as possible is one of the keys to increasing IPC.
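If you want to see that latency delta for yourself, here's a rough cache-line ping-pong sketch (Linux + pthreads; the two core numbers are placeholders, so pick a same-CCX pair and then a cross-CCX pair from your own topology and compare the round-trip times, which should show the extra IF hop directly):

/* Rough sketch of a core-to-core "ping-pong" latency test (Linux, x86).
 * Pins one thread to each of two cores and bounces a cache line between
 * them. The core numbers are placeholders; choose them from your own
 * topology. Compile: gcc -O2 -pthread pingpong.c */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

#define ITERS 1000000
static atomic_int flag;                     /* the bounced cache line */

static void pin(int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof set, &set);
}

static void *pong(void *arg) {
    pin(*(int *)arg);
    for (int i = 0; i < ITERS; i++) {
        while (atomic_load(&flag) != 1) ;   /* wait for ping */
        atomic_store(&flag, 0);             /* send pong */
    }
    return NULL;
}

int main(void) {
    int cpu_a = 0, cpu_b = 4;               /* placeholders: try same vs. other CCX */
    pthread_t t;
    struct timespec t0, t1;

    pthread_create(&t, NULL, pong, &cpu_b);
    pin(cpu_a);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITERS; i++) {
        atomic_store(&flag, 1);             /* send ping */
        while (atomic_load(&flag) != 0) ;   /* wait for pong */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    pthread_join(t, NULL);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    printf("round trip: %.1f ns\n", ns / ITERS);
    return 0;
}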
 
What do you mean "separately"? Aren't they all just the same CCXs in different layouts?
By separately, I mean they may have different packaging and layout. And that's exactly the difference between the supposed new layout of the upcoming EPYC and a normal Ryzen, if the latter keeps its current layout.
 
By separately, I mean they may have different packaging and layout. And that's exactly the difference between the supposed new layout of the upcoming EPYC and a normal Ryzen, if the latter keeps its current layout.
But you were suggesting different IF implementations between Ryzen and Epyc. That would not mean simply a different layout, but also different CCXs. Which, as I said, would add to the costs. Unless I misunderstood something.
 
The cache needs to be low latency, therefore it has to be on the same die.



It's going to be less actually, on average.



If the communication between the cores is hampered as you say, how would that affect the single-thread performance? It's the exact opposite of what you are describing; leaving only the cores and cache on each die would allow for higher clocks and therefore higher single-thread performance and higher performance in general.

There are two different situations. First, inter-core communication between cores on different dies will require a third die in between. Second, single-threaded performance would be lower because the memory controller won't be on-die; that is why AMD implemented the new Dynamic Local Mode.
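From what I understand, Dynamic Local Mode just migrates the busiest threads onto the die that has the local memory. As a rough hand-rolled sketch of the same idea (not AMD's implementation), this is what it looks like with libnuma on Linux, with node 0 standing in for the memory-attached die:

/* Minimal sketch of keeping a thread and its memory on the same die,
 * roughly what Dynamic Local Mode automates on Threadripper. Assumes
 * Linux with libnuma installed. Compile: gcc -O2 local.c -lnuma */
#include <numa.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "no NUMA support on this system\n");
        return 1;
    }
    int node = 0;                        /* placeholder: the die with local DRAM */
    numa_run_on_node(node);              /* pin this thread to that node's cores */
    size_t len = 64 << 20;               /* 64 MiB */
    char *buf = numa_alloc_onnode(len, node);  /* back it with node-local pages */
    if (!buf)
        return 1;
    memset(buf, 0, len);                 /* touch the pages so they're committed */
    printf("thread and 64 MiB buffer both on node %d\n", node);
    numa_free(buf, len);
    return 0;
}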
 
Separated I/O for Zen 2 has been in the leaks for at least a month already...
Even in May there were already rumours of a similar idea being passed around at Intel; however, as we all know, Intel is very far behind on the whole MCM architecture, and as such it will be at least a year before any of their offerings are even doing the rounds being sampled ahead of their retail release.

This is the way forward for the high-end CPU market, and anyone who says it isn't is just impossibly deluded...

I hope to see AMD continue their competitive streak in the high end; they have set high targets, but I am pretty sure they will be achieved. I also hope they put a bit more time into refining the 8-core and lower chips to be more competitive on the gaming side.
 
Second, single-threaded performance would be lower because the memory controller won't be on-die

Just a blanket statement; nobody has a clue whether that's going to have any impact whatsoever. Chances are it won't, if the leaks are true.
 
So you've passed judgement on a (publicly) undisclosed design based on some of your assumptions and TR2; what about waiting for evidence or results?

I agree, but that applies both ways.
 