
DDR5 CUDIMM Explained & Benched

Is JEDEC 6400 with flat 52s really an apples-to-apples comparison? Probably sky-high tRFC too
 
Is JEDEC 6400 with flat 52s really an apples-to-apples comparison? Probably sky-high tRFC too

I mean, it works for the purpose of the article, I guess, which is showcasing the benefits and ease of use of a CUDIMM kit. UDIMMs at their prime run 8000 to 8400 under ideal conditions (a good CPU memory controller, 1 DPC motherboards); most enthusiast kits run in the mid-7000s. That's a smaller gap vs. a 9600 CUDIMM, which is probably a good thing, since this is currently restricted to the Core Ultra Series 2 processors. Raptor Lake never received an update to support it (including the likes of the Z790 Apex), and AMD platforms currently cannot run it either.
 
Good to see solid gaming improvements with the CUDIMMs. The original TPU review of Arrow Lake had it using only DDR5-6000 C36, which seems to have crippled its gaming performance.

I'm currently running a 265K with an 8200C40 standard kit (overclocked from 6400C32). After changing some timings, I got latency down to the low-to-mid 70 ns range.

Standard DIMMs still seem pretty good if you can overclock, but I might end up getting a CUDIMM kit in the future if I want to reach higher speeds.
 
Is JEDEC 6400 with flat 52s really an apples-to-apples comparison? Probably sky-high tRFC too
From a quick Google, a 6400 Corsair kit has 32-40-40-84 XMP timings, which should give better latency and better overall results. Comparing JEDEC timings to XMP-enhanced timings is definitely not apples to apples; most people running 6400 DIMMs would be using XMP/EXPO timings.
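To put rough numbers on that (the kits below are illustrative, not the exact ones tested): first-word CAS latency in nanoseconds is CL × 2000 / (MT/s), so flat JEDEC 52s at 6400 are far looser than typical XMP timings, while a faster CUDIMM at a higher CL can land at the same absolute latency:

```python
# First-word CAS latency in nanoseconds: one DDR clock lasts 2000 / (MT/s) ns,
# because data transfers on both edges (MT/s = 2 x clock MHz).
def cas_ns(cl: int, mts: int) -> float:
    return cl * 2000 / mts

# Illustrative kits, not the exact ones from the article:
for label, cl, mts in [
    ("JEDEC 6400 CL52",  52, 6400),   # -> 16.25 ns
    ("XMP 6400 CL32",    32, 6400),   # -> 10.00 ns
    ("CUDIMM 9600 CL48", 48, 9600),   # -> 10.00 ns
]:
    print(f"{label}: {cas_ns(cl, mts):.2f} ns")
```

Same absolute CAS latency at 9600 CL48 as at 6400 CL32, with 50% more bandwidth on tap.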
 
Is JEDEC 6400 with flat 52s really an apples-to-apples comparison? Probably sky-high tRFC too
I wouldn't put a lot of emphasis on the benchmarks in this article. They merely show that bandwidth and performance increase over the highest frequency Intel officially supports for the Ultra 200 series (JEDEC 6400).

The main purpose of the article is to explain a bit about how a CKD works and the benefits it brings. It is truly amazing that I can take a sub-$200 B860 chipset motherboard and run 8400 MT/s with ease. Before CUDIMMs, this required an ITX motherboard or the much more expensive 2-DIMM-slot Intel Z790 motherboards like the EVGA Dark, ASUS Apex, or Gigabyte Tachyon.

What about gaming benchmarks at settings anyone with a 4090 will actually use?
What do you mean by this?
 
Would be good to retest the Core CPUs with CUDIMMs plus updated firmware/drivers.
If it supports CUDIMMs, then it should be tested with CUDIMMs.
 
I wouldn't put a lot of emphasis on the benchmarks in this article. They merely show that bandwidth and performance increase over the highest frequency Intel officially supports for the Ultra 200 series (6400).
But that is what most people will be looking at: the benchmark numbers. And whilst 6400 is the maximum officially supported speed for the Ultra 200 series, using XMP timings would definitely have given a more real-world analysis of the bandwidth/performance increase. As I mentioned previously, no one is going to be running 6400 DIMMs with basic JEDEC timings. Are there any standard 6400 kits out there without XMP profiles, using JEDEC timings only?
 
Is JEDEC 6400 with flat 52s really an apples-to-apples comparison? Probably sky-high tRFC too
I was going to comment on this. This was hardly an apples-to-apples comparison; very sloppy:

1) Try CKD mode with the same subtimings as JEDEC mode.
2) CKD mode was running at an extremely high 1.45 V, yet JEDEC mode was at 1.1 V. Why not run CKD mode at the same 1.1 V VDDR?
3) No attempt was made to see what clock speed the DIMM reaches with the same subtimings and VDDR in JEDEC (bypass) mode.
4) No power consumption tests were done either.

More than a benchmark, this was a marketing piece. Not impressed at all; for TPU, this feels more like an LTT paid shill piece.
 
Great article and thanks for posting it.

[screenshot: AIDA64 memory benchmark results]


That 100.20 ns latency on the 6400 kit is brutal, though no memory subtimings were tweaked for this at all; it could probably be brought much lower.
 
What about gaming benchmarks at settings anyone with a 4090 will actually use?
Are you talking about how it's all at 1080p, and a 4090 should be run at 1440p minimum but more likely 4K? I believe it's been shown that CPU and RAM are less impactful at 4K, and the larger differences can be seen at 1080p, hence what this review is doing. This is due to CPU and RAM bottlenecking. Isn't it more useful to see the increase in performance from one of the two existing bottlenecks?
 
Unfortunately there isn't really much of a market (if any exists at all) for 9600 memory, given that AMD doesn't scale above roughly 6400 and Intel's completely dead in the water...
(and X3D caring basically fuck-all about memory speed/timings)

oh well.
 
But that is what most people will be looking at: the benchmark numbers. And whilst 6400 is the maximum officially supported speed for the Ultra 200 series, using XMP timings would definitely have given a more real-world analysis of the bandwidth/performance increase. As I mentioned previously, no one is going to be running 6400 DIMMs with basic JEDEC timings. Are there any standard 6400 kits out there without XMP profiles, using JEDEC timings only?
It's certainly what I looked at. This was also what I kind of expected the jump from DDR4 to DDR5 would be like on its initial release.
 
Unfortunately there isn't really much of a market (if any exists at all) for 9600 memory, given that AMD doesn't scale above roughly 6400 and Intel's completely dead in the water...
(and X3D caring basically fuck-all about memory speed/timings)
Yeah, so far this is a cool technology in wait of a platform that is actually worth a damn. I suppose there’s the hope that whatever Intel has next is actually competitive. As it stands now, the results, however impressive, are completely irrelevant seeing as how most users would be better off with Zen 5 for whatever they need. Gaming? Yup. Productivity? Same.
 
I think people here are confusing what this article is about and what they want it to be.

It is NOT an Intel Ultra 200 Memory Scaling Article.

It IS an article about CUDIMM Technology and the potential benefits it brings to Intel Ultra 200 series currently and future platforms that will support it.
 
I was going to comment on this. This was hardly an apples-to-apples comparison; very sloppy:

1) Try CKD mode with the same subtimings as JEDEC mode.
2) CKD mode was running at an extremely high 1.45 V, yet JEDEC mode was at 1.1 V. Why not run CKD mode at the same 1.1 V VDDR?
3) No attempt was made to see what clock speed the DIMM reaches with the same subtimings and VDDR in JEDEC (bypass) mode.
4) No power consumption tests were done either.

More than a benchmark, this was a marketing piece. Not impressed at all; for TPU, this feels more like an LTT paid shill piece.

It's a technology showcase article, not a technical article. It's aimed at letting people know this technology exists and how it can benefit them; it is of no use to enthusiasts or anyone with a technical level of understanding. But I'll just pitch in that JEDEC implies everything bone stock, and that does mean crap timings, low volts, etc. Also, 1.45 V isn't "extremely high" for DDR5; it's pretty much the standard voltage range for any performance-segment kit. The CUDIMM obviously cannot hit 9600 MT/s at 1.1 V; there is no golden sample ever manufactured that will. Such 9600 MT/s kits, in the hands of Z890 Apex owners, are the kits that will be taken into the 10000 realm in an actually usable scenario for the first time ;)

Yeah, so far this is a cool technology in wait of a platform that is actually worth a damn. I suppose there’s the hope that whatever Intel has next is actually competitive. As it stands now, the results, however impressive, are completely irrelevant seeing as how most users would be better off with Zen 5 for whatever they need. Gaming? Yup. Productivity? Same.

Yeah, it is kind of sad that Raptor Lake is still pretty much the best Intel currently has to offer. Wouldn't place much stock in Panther Lake either... we'll really have to wait for Nova and the next socket for a shot at something impressive, but with Royal Core canned, I have a hard time guessing the future of Intel in any way...
 
Unfortunately there isn't really much of a market (if any exists at all) for 9600 memory, given that AMD doesn't scale above roughly 6400 and Intel's completely dead in the water...
(and X3D caring basically fuck-all about memory speed/timings)

oh well.

I'm testing the TeamGroup 8800 MT/s CUDIMM kit now in bypass mode on an AM5 X670E Gene. I have two stable baseline profiles; I will tighten timings and lower/adjust voltages over the next few days.

The CUDIMM kits work very well on AM5 in bypass mode.

[photo: CUDIMM kit installed in the test system]


[screenshot: 6600 CL30 baseline, first TM5 run]


[screenshot: 8400 baseline TM5 run]
 
Is there any way to check what standard JEDEC timings are programmed into the SPD?
 
It's great that it enables faster memory on lower-end motherboards. However, does a kit with such terrible latencies even make sense for comparison? Nobody uses memory this bad... it only misleads the public.
The DDR5-6000 kit used in CPU reviews would do the trick. :p
 
I appreciate the topic, but:

Q: Is there a Latency penalty from the Client Clock Driver (CKD)?

A:
Technically, yes, but only on a nanosecond scale. Since the signal is redirected before reaching the DRAM ICs, there is inevitably an increase in latency. However, this increase is so minimal that it typically falls within the margin of error in benchmarking.

The data lines go into a buffer logic IC which has a clock signal. I assume it is clocked by a CLK signal, so data moves from the input pins to the output pins on clock edges.

Therefore there is always a penalty, 100% of the time.

I doubt this has changed in the past 25 years. I'm only drawing on my knowledge of microcontrollers and logic ICs, like the 7400 series and similar chips that are still discussed today.

The correct way to say it should be: the penalty is additional clock cycles.

Do we write CL 32-36-36-36 on the DRAM, or do we now write picoseconds-picoseconds-picoseconds-picoseconds when we talk about latency?

No, we don't; we write those 32-36-36-36 numbers, which are counts of clock cycles (data moving on rising and falling edges).

Therefore it's misleading to claim the penalty is on the nanosecond scale when we compare it with the numbers used these days.

CL30 vs. CL32 at 6000 MT/s DDR5 on AM5 is also a difference in the picosecond range. Fact.
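For what it's worth, the arithmetic behind that last claim checks out: at 6000 MT/s one clock cycle lasts 2000/6000 ≈ 0.333 ns, so the two-cycle gap between CL30 and CL32 works out to roughly 667 ps:

```python
# Difference between CL30 and CL32 at DDR5-6000, expressed in picoseconds.
mts = 6000
cycle_ns = 2000 / mts                    # one clock cycle: ~0.333 ns
delta_ps = (32 - 30) * cycle_ns * 1000   # two cycles, converted ns -> ps
print(f"CL32 - CL30 at {mts} MT/s = {delta_ps:.0f} ps")  # ~667 ps
```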

--

There were endless pages discussing how that 32 is converted to picoseconds, at least for the AM4 processors. I will not go into details.

--

CUDIMMs have a flaw in their design: additional clock cycles.

--

If I were paid money by ADATA, I would have written it somehow like:

"CUDIMMs can currently reach higher speeds, with a slightly higher latency penalty, and therefore have better performance in some workloads."

As no one paid me money for advertisements, there is no such suggestion.


--

The aspect of higher DIMM voltage is not addressed. I increased to 1.30 V at 6000 MT/s; the stock voltage is 1.25 V for 5600 MT/s Hynix A-die here.

Comparing 1.10 V to 1.45 V is surely a fair comparison in regard to energy consumption.

--

This gives me the shivers.

I think my previous AM4 system and my current AM5 system are both somewhere in the 60 ns range, not around 100 ns or 82 ns. Assuming it's the last number of the first row of the AIDA benchmark.

[screenshot: 285K AIDA64 latency]
 
This gives me the shivers.

I think my previous AM4 system and my current AM5 system are both somewhere in the 60 ns range, not around 100 ns or 82 ns. Assuming it's the last number of the first row of the AIDA benchmark.

[screenshot: 285K AIDA64 latency]
With my DDR5-6000 CL30 kit I see this with EXPO:

[screenshot: AIDA64 latency with EXPO]


Then with some subtiming tweaks, I'm currently at this:

[screenshot: AIDA64 latency after subtiming tweaks]


So there is room to improve those scores.
 
Is there any way to check what standard JEDEC timings are programmed into the SPD?
I can share this when I get home later tonight. It doesn't matter if it's this UDIMM or a CUDIMM; the JEDEC timings for each frequency tier will remain the same.
The data lines go into a buffer logic IC which has a clock signal. I assume it is clocked by a CLK signal, so data moves from the input pins to the output pins on clock edges.

Therefore there is always a penalty, 100% of the time.

The correct way to say it should be: the penalty is additional clock cycles.

Do we write CL 32-36-36-36 on the DRAM, or do we now write picoseconds-picoseconds-picoseconds-picoseconds when we talk about latency?
I get what you're saying, but that is an overly complicated way of explaining DRAM timings. I was directly referring to the CKD adding latency, not all the DRAM timings. If I take a UDIMM 8000 kit and a CUDIMM 8000 kit and run them on the same platform, they show the same latency (in nanoseconds) in the AIDA64 test. One doesn't have the CKD chip at all, which is the best comparison you can make.

CUDIMM CKD - Disabled
CUDIMM CKD - Enabled
UDIMM

Same timings = same latency. This would suggest that having the CKD included does not increase latency enough to factor into performance penalties.
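A back-of-the-envelope check of why a CKD-sized delay vanishes in AIDA64: even assuming a hypothetical two-cycle retiming penalty (an assumption for illustration, not a measured spec), it is well under 1% of a typical ~80 ns end-to-end loaded latency:

```python
# Hypothetical CKD retiming penalty vs. total memory latency at DDR5-8000.
mts = 8000
penalty_cycles = 2                        # assumed for illustration only
penalty_ns = penalty_cycles * 2000 / mts  # 0.5 ns
total_ns = 80.0                           # ballpark AIDA64 loaded latency
share = penalty_ns / total_ns             # fraction of total latency
print(f"penalty = {penalty_ns:.2f} ns, {share:.2%} of {total_ns:.0f} ns")
```

Half a nanosecond against an 80 ns round trip is comfortably within run-to-run noise, which matches the "same timings = same latency" observation above.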

If I were paid money by ADATA, I would have written it somehow like:

"CUDIMMs can currently reach higher speeds, with a slightly higher latency penalty, and therefore have better performance in some workloads."

As no one paid me money for advertisements, there is no such suggestion.
I always like to provide examples when giving feedback; otherwise you are just complaining to complain. That doesn't help me make something better, or help anyone else understand what is wrong and how it could be improved :)
 