
12-channel DDR5 Memory Support Confirmed for Zen 4 EPYC CPUs by AMD

Joined
May 2, 2017
Messages
7,762 (3.05/day)
Location
Back in Norway
System Name Hotbox
Processor AMD Ryzen 7 5800X, 110/95/110, PBO +150Mhz, CO -7,-7,-20(x6),
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling LOBO + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14 @3800c15
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UC@2250MHz max @~200W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G603
Keyboard Keychron K3/Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
I think the CPU/controller-side implementations for separating DIMMs into smaller independent 32-bit channels are still to come. For the moment, my hunch is that a simplified pairing option is being used to manage DDR5 DIMMs as effectively single 64-bit-wide channels, making them function more like DDR4 DIMMs. Hence the unchanged channel config in the CPU specs. One step at a time, sort of thing.
Not at all. As @trsttte said above, 32-bit channels have been in use on PCs for several years already. Also, the DDR5 standard is based around each DIMM having two 32-bit channels. You literally cannot make a DDR5-compliant and compatible controller without making it handle individual 32-bit channels. As there will always be two of these per DIMM, and some DDR5 controllers will also be DDR4-compatible, there is little doubt that these controllers will be similar to combined DDR4/LPDDR4X controllers: one block capable of controlling either a single 64-bit (DDR4) channel, or two separate 32-bit (LPDDR4X or DDR5, depending on the chip in question) channels. This does not make them merged or anything like that - it means it's a dual-mode controller that can combine its two 32-bit interfaces into a single 64-bit interface when needed. DDR5 does not support or make use of such a combined interface, but bases its specific modes of operation and performance characteristics (such as lower real-world latency than the on-paper numbers suggest) on these channels being separate.
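
If it helps, the dual-mode idea can be sketched as a toy model - one block that exposes either a single 64-bit channel or two independent 32-bit ones. The class and mode names here are made up for illustration; this is not any vendor's actual design:

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class DualModeIMC:
    """Toy model of one dual-mode memory-controller block (illustrative only)."""
    mode: Literal["ddr4", "ddr5"]

    def channels(self) -> list:
        # DDR4 mode: one 64-bit channel; DDR5 mode: two independent 32-bit channels
        return [64] if self.mode == "ddr4" else [32, 32]

    def total_width(self) -> int:
        return sum(self.channels())

# Either mode drives the same 64 bits of total bus width per controller block
assert DualModeIMC("ddr4").total_width() == DualModeIMC("ddr5").total_width() == 64
```

The point of the sketch: the total width per block never changes, only whether it is addressed as one interface or two.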

Open task manager in any LPDDR4X system, you'll find it saying "dual channel" despite having 4 32-bit channels. Why? Clarity of communication. Nothing else.

You really, really need to listen to what I've been saying all along: what is communicated to people, and what is technical reality, are not necessarily identical. There is no direct causal or indexical relation between words and what they signify - especially when we're stacking layers of abstraction like we are here. Often, technically inaccurate communication is better because it more effectively communicates the core characteristics of what is communicated. This is what we are seeing now. And that is what @trsttte said above: for anyone but us here, who actually know the bus width of RAM in PCs (which really takes some doing), this is a distinction without a difference. Whether it's 2 64-bit channels or 4 32-bit channels is immaterial - aggregate bandwidth is the most important value for comparison, both across and within generations of tech. And, as "channels" has been how this has been communicated since the dawn of DDR RAM and multi-channel systems, the only sensible approach to ensuring understandable communication is to abandon technical accuracy in service of communicative pragmatism. Thus, four separate 32-bit channels are then "dual channel", because they are equivalent to previous dual-channel setups. Calling them quad channels would lead people to expect twice as much as before; calling them 32-bit channels would cause people to ask "how many bits were my channels before this?".
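
To make the aggregate-bandwidth point concrete, here's a quick back-of-the-envelope sketch (peak theoretical numbers only; the data rate is held equal on both sides purely to isolate the channel layout):

```python
def peak_bandwidth_gbs(channels: int, width_bits: int, mts: int) -> float:
    """Peak theoretical bandwidth in GB/s: channels x bytes per transfer x MT/s."""
    return channels * (width_bits / 8) * mts / 1000

# DDR4 "dual channel": 2 x 64-bit
ddr4 = peak_bandwidth_gbs(2, 64, 3200)
# DDR5 "dual channel" label: actually 4 x 32-bit
ddr5 = peak_bandwidth_gbs(4, 32, 3200)

print(ddr4, ddr5)  # prints "51.2 51.2" - same aggregate bandwidth either way
```

Same total bus width at the same data rate gives the same peak bandwidth, which is exactly why the "dual channel" label still works for comparison purposes.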

Remember: even CPU spec sheets are not written to appeal to those seeking in-depth technical knowledge. They are simplifications made in order to communicate the capabilities and requirements of a component at a relatively high level of detail, but are by no means exhaustive.
Well, it either is or isn't quad.
And this shows exactly that you haven't been listening at all. "Dual channel" DDR5 is quad channel. It will always be. But it won't be called that, because nobody knows the channels are half as wide, making calling it quad channel wildly misleading. It is, therefore, in all communications "dual (equivalent to previous generations) channel".
 
Joined
Feb 11, 2020
Messages
195 (0.13/day)
It's no big deal to pair channels to make them act as effective singles. When that's done, it is then correct to define the result as a single channel irrespective of width, i.e. a pair of 64-bit channels could be paired to make a single 128-bit channel, and four 32-bit channels can be managed as a single 128-bit channel too.

Whether any controller does or not I wouldn't know. That was my opening question right back at the start.
 
It's no big deal to pair channels to make them act as effective singles. When that's done, it is then correct to define the result as a single channel irrespective of width, i.e. a pair of 64-bit channels could be paired to make a single 128-bit channel, and four 32-bit channels can be managed as a single 128-bit channel too.

Whether any controller does or not I wouldn't know. That was my opening question right back at the start.
And what I have been saying all along is that this doesn't matter. What would "pairing" even mean, on a technical level? Given that these are dual-mode DDR4 (1x64-bit) / DDR5 (2x32-bit) controllers, the controller has one interface to the rest of the CPU (whether that's a ring bus or a mesh). Given that one controller controls two channels, in that sense they are intrinsically paired. That doesn't mean the channels aren't independent - if they weren't, that would make them incompatible with DDR5. IMO, as long as the channels operate somewhat independently, they must be described as independent on a technical level - but how you describe them in non-technical communication is determined by other factors. And in DDR5, they are independent - that's the whole point. But for communicating platform capabilities, treating two 32-bit channels as equivalent to the de facto standard of a single 64-bit channel (especially given that DDR5 puts two on each DIMM) and thus calling this "one channel" makes perfect sense, makes for an easy mid-level understanding of system capabilities, and avoids far more confusion than it creates. It is likely entirely possible for Intel/AMD/whoever to make independent 32-bit DDR5 controllers, but this doesn't make much sense when they will never be implementing single 32-bit channels, as DIMMs always have two - and even if they did, two of those would still amount to one "channel" in most communications, as the important aspect is comparability to previous standards, not technical accuracy.
 
Joined
Aug 4, 2020
Messages
1,570 (1.16/day)
Location
::1
Let's do what GPUs have been doing for decades - call them 128bit channels (previously dual), shan't we 3Head
 
Let's do what GPUs have been doing for decades - call them 128bit channels (previously dual), shan't we 3Head
Or just ditch the "channel" wording entirely, and talk about memory interface width.
 
yeah you're right, it's called a 128-bit bus or something in GPUland yeah? guess i was just meming away too fast, didn't think that part through :D
 
Joined
Apr 8, 2010
Messages
992 (0.19/day)
Processor Intel Core i5 8400
Motherboard Gigabyte Z370N-Wifi
Cooling Silverstone AR05
Memory Micron Crucial 16GB DDR4-2400
Video Card(s) Gigabyte GTX1080 G1 Gaming 8G
Storage Micron Crucial MX300 275GB
Display(s) Dell U2415
Case Silverstone RVZ02B
Power Supply Silverstone SSR-SX550
Keyboard Ducky One Red Switch
Software Windows 10 Pro 1909
All industry speak about "channels" in DDR5 (outside of technical documentation) has been in the form of "analogous to DDR4 channels" i.e. one channel per DIMM/2 DIMMs, despite the DIMMs technically being dual channel themselves. Look at any Z690 spec sheet - they all say "dual channel" despite this being 2x2x32 rather than 2x64. It doesn't make much sense to speak of single 32-bit channels as channels once implemented given that they can never be separated out individually, but will always be paired. And, of course, there is no way they are reducing the number of memory channels on next-gen EPYC - servers love their bandwidth.
wh....what now I'm confused lol
 
Joined
Oct 27, 2009
Messages
1,133 (0.21/day)
Location
Republic of Texas
System Name [H]arbringer
Processor 4x 61XX ES @3.5Ghz (48cores)
Motherboard SM GL
Cooling 3x xspc rx360, rx240, 4x DT G34 snipers, D5 pump.
Memory 16x gskill DDR3 1600 cas6 2gb
Video Card(s) blah bigadv folder no gfx needed
Storage 32GB Sammy SSD
Display(s) headless
Case Xigmatek Elysium (whats left of it)
Audio Device(s) yawn
Power Supply Antec 1200w HCP
Software Ubuntu 10.10
Benchmark Scores http://valid.canardpc.com/show_oc.php?id=1780855 http://www.hwbot.org/submission/2158678 http://ww
wh....what now I'm confused lol
DDR5 split the traditional 64-bit channel into twin 32-bit channels, partly to allow for dual ECC per DIMM (one per sub-channel). Each DIMM has two half-channels. Yeah, it's confusing.

As to why we don't just call it a 128-bit bus for dual-channel 64-bit or quad 32-bit... that would be because "128-bit bus" would require both channels to be filled...
Unless you want mobo boxes saying "up to 128-bit bus"...
 
The distinction is that each channel has independent, parallel, concurrent addressing. When paired, that independence is gone - they act as a single channel.

One channel can be writing while the other reads, for example. Or reading two separate areas for two tasks. A four-channel controller can then be transferring to/from four areas in parallel, concurrently. So it's a function of the controller too, not just the DIMMs and mobo.
 
The distinction is that each channel has independent, parallel, concurrent addressing. When paired, that independence is gone - they act as a single channel.

One channel can be writing while the other reads, for example. Or reading two separate areas for two tasks. A four-channel controller can then be transferring to/from four areas in parallel, concurrently. So it's a function of the controller too, not just the DIMMs and mobo.
DDR5 will always do this. Not allowing it would literally break the DDR5 specification. In this way, the channels are always separate. That is what I've been saying all along. The (sub-/half-)channels are separate, but outside of technical documentation, clarity of communication far outweighs technical accuracy in importance, which leads to them being spoken of as if they were equivalent to previous channels (as "number of channels => peak possible bandwidth" is the ruling logic, and adding "channels used to be 64-bit, now they are 32-bit, but there are more of them" to that is nothing but confusing).
DDR5 split the traditional 64-bit channel into twin 32-bit channels, partly to allow for dual ECC per DIMM (one per sub-channel). Each DIMM has two half-channels. Yeah, it's confusing.
It is. But the thing is, it's only confusing for those of us who know/care about RAM channel bus widths and know these numbers to begin with. For anyone else, keeping "channel" to mean "64-bit-ish data bus, regardless of actual number of channels" means nobody gets confused and you can do roughly like-for-like comparisons without in-depth knowledge.
As to why we don't just call it a 128-bit bus for dual-channel 64-bit or quad 32-bit... that would be because "128-bit bus" would require both channels to be filled...
Unless you want mobo boxes saying "up to 128-bit bus"...
That's an interesting point - GPUs always have all channels populated, after all. Still, I don't see how 128-bit bus would be more misleading than "dual channel" - if there's just a single DIMM there (or two but installed wrong) then it's effectively single channel and 64-bit - neither is more right or wrong.

wh....what now I'm confused lol
It is pretty confusing. Tl;dr:
DDR4 and previously: One DIMM has one 64-bit channel, platforms have X (always an integer) channels and nX DIMM slots.
DDR5: One DIMM has two 32-bit channels, platforms have Y (always a multiple of 2) channels and nY/2 DIMM slots (i.e. you can never have a single-channel DDR5 platform, at least using DIMMs, as they always have two channels).

As always, DIMM slots and channel count are mediated by the number of DIMMs per channel, which is typically limited to two on consumer platforms but can be (a lot) more in servers. So, for example, a 4-DIMM DDR4 (or earlier) board could have 1, 2, 3 or 4 channels (most likely 2, at 2DPC), while a four-slot DDR5 board can have 2, 4, 6 or 8 channels (most likely 4) - but they're half as wide.

Because of this, and because hardly anyone knows how wide a RAM channel is to begin with (and having to explain this constantly will just add to the confusion), DDR5 "channels" are talked of as if it still had a single 64-bit channel like previous standards, despite this being technically wrong. It's just that much easier to communicate.

Thus, a "12-channel" DDR5 server platform, even as reported in a rumor for the technically inclined, is likely to actually have 24 32-bit channels, or 12 "2x32-bit" channel pairs. Which means that if you're used to current 8-channel DDR4 platforms, this is a 50% increase (before accounting for clock speed increases).
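
As a quick sanity check of that 50% figure (simple arithmetic on the numbers above, with clock-speed gains deliberately ignored):

```python
# Total data-bus width per socket, in bits
ddr4_epyc = 8 * 64    # current 8-channel DDR4: 512 bits
ddr5_epyc = 24 * 32   # "12-channel" DDR5, i.e. 24 x 32-bit channels: 768 bits

increase = ddr5_epyc / ddr4_epyc - 1
print(f"{increase:.0%}")  # prints "50%" - half again as much bus width
```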
 
Joined
Oct 12, 2005
Messages
682 (0.10/day)
If I recall, the Nvidia chipset for the Athlon XP could actually work with either a 64-bit-wide or a 128-bit-wide channel.

The main reason is that back then you had single-core/single-thread CPUs that wouldn't benefit as much from having another channel to read and write. The single core would just queue or merge the commands, mostly doing one thing at a time with no problem.

Today, CPUs on AM5 have up to 32 threads that need to use 2 channels for all their reads and writes. With DDR5, they now have 4 to do that with, while keeping the same bus width. The thread/channel ratio changes from 16:1 to 8:1. This helps with latency under high memory load (and not with benchmarks like AIDA64, which are mostly synthetic and simply read various amounts of data serially).

I am not sure this will greatly benefit desktops, but it will certainly help servers with highly concurrent workloads - at least for now. We all know it took years before games started to be multithreaded, but these days most games are. It's quite possible that future desktop workloads will require all those channels.

Valantar is right: if you have to write a 64b line to memory, it will always go through a single 32-bit channel and will never be split across 2 channels, whether it's in the same DIMM's realm/channel or another one. Splitting it would first of all screw up the addressing, but it would also defeat the main purpose of splitting these channels in the first place (having more parallel ways of doing reads/writes to reduce latency in highly threaded workloads).
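
A detail worth adding here: DDR5 doubled the burst length from BL8 to BL16, so a single 32-bit channel still delivers a full 64-byte cache line in one burst - part of why a line never needs to straddle two channels. A small sketch (the helper function is illustrative, not real controller code):

```python
CACHE_LINE_BYTES = 64

def burst_bytes(width_bits: int, burst_length: int) -> int:
    """Bytes delivered by one full burst on one channel."""
    return (width_bits // 8) * burst_length

ddr4 = burst_bytes(64, 8)    # DDR4: 64-bit channel, BL8  -> 64 bytes
ddr5 = burst_bytes(32, 16)   # DDR5: 32-bit channel, BL16 -> 64 bytes

assert ddr4 == ddr5 == CACHE_LINE_BYTES  # one burst = one cache line either way
```

Half the width, twice the beats: the per-burst payload is unchanged.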

As we can see, CPUs are getting larger and larger caches, and we might be closer than we think to a desktop-class CPU with 1+ GB of L3 cache. The cache really helps with read operations, but it does not solve the problem that all data eventually needs to be written to memory, and that needs bandwidth. The data must go somewhere when it gets evicted from the cache. Having more channels to spread those writes across, plus higher bandwidth, helps. They also added a bunch of other features, like same-bank refresh, to ensure the memory is available to write or read as much as possible.

People freak out a bit about the high latency of DDR5, but the truth is that in 2-3 CPU generations it won't really matter, since those CPUs will have large caches anyway. Most of the data they run will fit in cache.

The people who designed the DDR5 standard were very smart and could already see what the future would look like. They knew the standard would need to last for half a decade at minimum.
 
As we can see, CPUs are getting larger and larger caches, and we might be closer than we think to a desktop-class CPU with 1+ GB of L3 cache. The cache really helps with read operations, but it does not solve the problem that all data eventually needs to be written to memory, and that needs bandwidth. The data must go somewhere when it gets evicted from the cache. Having more channels to spread those writes across, plus higher bandwidth, helps. They also added a bunch of other features, like same-bank refresh, to ensure the memory is available to write or read as much as possible.

People freak out a bit about the high latency of DDR5, but the truth is that in 2-3 CPU generations it won't really matter, since those CPUs will have large caches anyway. Most of the data they run will fit in cache.

The people who designed the DDR5 standard were very smart and could already see what the future would look like. They knew the standard would need to last for half a decade at minimum.
There have also been benchmarks (IIRC both der8auer and LTT did (very different) deep-dives on this) that demonstrate at least with some degree of clarity that DDR5 "overperforms" in terms of latency compared to a like-for-like spec sheet comparison with DDR4 - which is likely down to the splitting of channels, as you say. So while everyone was expecting a latency regression from DDR5, we've actually started out on rough parity when comparing low-to-midrange JEDEC DDR5 to even "high end" JEDEC DDR4-3200.
 
There have also been benchmarks (IIRC both der8auer and LTT did (very different) deep-dives on this) that demonstrate at least with some degree of clarity that DDR5 "overperforms" in terms of latency compared to a like-for-like spec sheet comparison with DDR4 - which is likely down to the splitting of channels, as you say. So while everyone was expecting a latency regression from DDR5, we've actually started out on rough parity when comparing low-to-midrange JEDEC DDR5 to even "high end" JEDEC DDR4-3200.
More channels might help in heavily threaded tests, or tests designed with that in mind. But don't underestimate the newer way DDR5 refreshes memory.

With DDR4, when a refresh occurs, a whole bank group is inaccessible, and those refreshes take quite some time. With DDR5, instead of refreshing all banks, the controller has the option to refresh only one bank in each bank group. A bank needs to be idle (no reads or writes) for the duration of the refresh, which can last 280+ ns on a 16 GB stick and needs to run every 4 µs. A same-bank refresh can be run every 2 µs and is much shorter (130 ns on a 16 GB stick).

And most importantly, while one bank is being refreshed, the 3 other banks in each bank group can still be active and perform read and/or write operations. In all-bank mode, the DIMM is paused until the refresh is done.
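
Taking the figures above at face value (illustrative arithmetic only - real tRFC/tREFI values vary by DRAM density and vendor), the share of bank-time lost to refresh works out roughly like this:

```python
# Refresh timings quoted in the post above, in nanoseconds
tRFC_all, tREFI_all = 280, 4000   # all-bank refresh, run every ~4 us
tRFC_sb, tREFI_sb = 130, 2000     # same-bank refresh, run every ~2 us

# All-bank mode: every bank is blocked for tRFC out of every tREFI interval
all_bank_loss = tRFC_all / tREFI_all          # ~7% of all bank-time

# Same-bank mode: refresh runs twice as often, but each window is shorter
# and only 1 of the 4 banks in each group is blocked at a time
same_bank_loss = (tRFC_sb / tREFI_sb) / 4     # ~1.6% of all bank-time

print(f"{all_bank_loss:.1%} vs {same_bank_loss:.1%}")
```

Even with these rough numbers, same-bank refresh cuts the refresh penalty to a fraction of the all-bank case.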
 
It's not specific to a particular DIMM type. Any memory, including caches, can be configured into multiple channels. It's how the controller is built that determines what a channel is.

So, when Intel says two total, Intel is likely speaking correctly, because that's how that particular controller functions.

As for not always wiring up every channel (the historical Athlon XP example above): that's also true today. All multichannel controllers can have unconnected channels.

PS: On that caches thing, it's highly likely there are caching schemes that use channels in place of multi-porting. Multi-ported SRAM is very bulky (right down at the cell level), so it's only used for small L1 caches. Although, in the case of Apple's M1 cores having 128 kB for each data cache (combined with the extreme number of execution units), they may well have opted for a multi-channel solution there instead.
 