
DDR5 Thermal Testing & Analysis

So if comparison is the name of the game, I'd like to see aftermarket heat spreaders tested to find the best / most evenly tempered one.

Same goes for aftermarket RAM fans and RAM water-cooling solutions, with a sprinkle of simple/cheap DIY solutions if you don't mind (zip ties and VHB tape FTW).

That would set the stage for testing the 10 best-selling sticks three ways to see how much they are being held back by their OEM cooling solutions (hotspots, anyone?).

For that you'd have to come up with an even more robust, universal OC testing strategy / procedure, which includes figuring out how to take meaningful IR pictures.

I'd advise cooperating with somebody, since it wouldn't hurt, and it would certainly make the series even more interesting and the procedure, as well as the results, better grounded.


Thanks for this interesting article!
 
Very cool to see testing like this! I've been dabbling a little with my own DDR5 kit, but resetting the CMOS is a bit of a chore with my X670E Steel Legend (the nature of the beast when going best bang for the buck on an X670E), so I keep Buildzoid's AMD 7000 series Hynix A-die (and M-die) "every CPU / kit can do this" settings around. It helped quite a bit, as it's a 96GB 6400 C32 G.Skill kit, and it's great to see how temperatures can have an effect should I decide to push further.
 
I wonder if Igor considered the reflection factor for optical thermal measurements.

I trust thermocouples more than some thermal camera. Proven stuff. As long as the measurement electronics are decent, thermocouples should be very accurate.

-- Very interesting article
 
I wonder if Igor considered the reflection factor for optical thermal measurements.
That was my thinking too, but my IR FLIR doesn't work anymore. I found out that once the battery dies in those phone versions, it can't be replaced or even powered through the phone.

So without checking myself, I left it out of the article.
 
That was my thinking too, but my IR FLIR doesn't work anymore. I found out that once the battery dies in those phone versions, it can't be replaced or even powered through the phone.

So without checking myself, I left it out of the article.
Not FLIR, but modern IR thermometers have a reflectivity adjustment. Project Farm on YT tested a bunch and it didn't make a difference worth a damn.
 
BartX Heatshields, BX2 block running chilled water.

(Attached image: DDR5 Chilled.png)
 
Anything over 1.4 V adds a lot of stress; 1.35 V seems to be the sweet spot.
 
ir_cow said:
Your feedback and suggestions will help guide the next steps in this ongoing investigation of DRAM temperature and the variables that go into optimizing system performance.
This is a great first revisit of an often neglected topic. While I'd +1 for thermal imaging, the important core coverage of temperature distribution across the DIMM and heatspreader, clock, voltage, and app effects is all here.

The main future suggestion I'd make is that an open bench, custom loop test with a single 1R DIMM departs from common use. More application-oriented data, more useful to most readers, seems likely to come from testing in a typical ATX case (Lancool 207, 216, II, North, Torrent Compact, something like that) with denser DIMM configs such as 2x48 and 4x48 GB. Other variables I regularly see come up for DDR thermals are:

1. CPU cooler: AIO with fanless block, AIO with a block fan, setback dual tower (Fuma 3, Royal Knight 120) or single tower with DIMMs in front of the fan, and dual tower with DIMMs under the front fan.
2. Crossflow: top intake fan cooling (if it's not a top exhaust AIO config) and potential for GPU passthrough heating.
3. Lighting: RGB on versus off versus the non-RGB version of the same DIMMs.

I like the CL and tREFI testing, but it's unclear from the current text what active cooler was used and what tRFC was set to. A related difficulty is that stress tools (including y-cruncher FFTv4 and Prime95 long) lack a benchmark component. So a common miss is that all this work we do for stability and thermals rarely gets tied back to the question of whether it's actually worth it functionally, as opposed to just for highmarking numbers. IMO, y-cruncher timings or other memory-intensive benches would be good data towards articulating a value proposition for CL24, 65k+ tREFI, and such.

I'm not set up to probe low CL, but FWIW it's been my experience that extending DDR5's default 3.9 μs tREFI has little effect on real-world compute throughput once tRFC's tightened. I'm looking mainly at runtime shifts in working apps that max out dual-channel DDR for like eight hours solid. But y-cruncher picks up on this too.

ir_cow said:
With a limited sample size, it is unclear whether this behavior is exclusive to SK Hynix Rev A-Die, a flawed testing methodology or is an expected outcome.
This is interesting. I've pushed M-die tRFC to values low enough that I've backed it off after black screens and OCCT errors, but not to a clear breaking point, and not in single-variable testing where instability could be unambiguously attributed to tRFC. I need the rig up for probably somewhere in the range of 42-56 hours of compute this weekend, but I'll try leaning on tRFC more if some slack time opens up.
 
My previous Trident Z Neo dual-rank B-die needs 1.61 V for 4600 MT/s, but it requires sub-33°C (room temp 25°C) to be fully stable, which is not possible with water cooling that shares a loop with a 10900KF and 3080 Ti. So I decided to create a separate loop just for the RAM, with a small DDC pump+tank combo and a single slim 120 mm radiator. I think DDR5 is still sensitive to heat, especially with max tREFI and low tRFC, so I'll continue with this method for the upcoming 9800X3D build.
 
The main future suggestion I'd make is that an open bench, custom loop test with a single 1R DIMM departs from common use.
You're saying move away from a single DIMM setup? This was done with a single DIMM to keep the variables limited.

More application-oriented data, more useful to most readers, seems likely to come from testing in a typical ATX case (Lancool 207, 216, II, North, Torrent Compact, something like that) with denser DIMM configs such as 2x48 and 4x48 GB. Other variables I regularly see come up for DDR thermals are:

1. CPU cooler: AIO with fanless block, AIO with a block fan, setback dual tower (Fuma 3, Royal Knight 120) or single tower with DIMMs in front of the fan, and dual tower with DIMMs under the front fan.
2. Crossflow: top intake fan cooling (if it's not a top exhaust AIO config) and potential for GPU passthrough heating.
3. Lighting: RGB on versus off versus the non-RGB version of the same DIMMs.
Good ideas. #3 is the easiest. Case airflow is a complicated one, though. I've gotten my memory to error out before just from an NVIDIA FE card because it blows directly onto the memory.

I like the CL and tREFI testing, but it's unclear from the current text what active cooler was used and what tRFC was set to. A related difficulty is that stress tools (including y-cruncher FFTv4 and Prime95 long) lack a benchmark component. So a common miss is that all this work we do for stability and thermals rarely gets tied back to the question of whether it's actually worth it functionally, as opposed to just for highmarking numbers. IMO, y-cruncher timings or other memory-intensive benches would be good data towards articulating a value proposition for CL24, 65k+ tREFI, and such.
Active cooling is just a fan - I'll update the article to mention that. I also don't see the point in using y-cruncher or Prime95 over a strictly memory stress test. It's yet another factor introduced by putting the CPU into the mix, and it can be offset by just lowering the CPU frequency, negating the "stress" it would add.

The tests in the article were designed / set up to explore the characteristics of the memory itself, not the platform it's used with. That's partially why a lower frequency was primarily used: by not pushing the limits of the IMC, any errors that came out were likely a memory-related problem. There are still lots of things that can be explored, like all the other secondaries, which I know change based on the CPU and motherboard.

I'm not set up to probe low CL, but FWIW it's been my experience that extending DDR5's default 3.9 μs tREFI has little effect on real-world compute throughput once tRFC's tightened. I'm looking mainly at runtime shifts in working apps that max out dual-channel DDR for like eight hours solid. But y-cruncher picks up on this too.


This is interesting. I've pushed M-die tRFC to values low enough that I've backed it off after black screens and OCCT errors, but not to a clear breaking point, and not in single-variable testing where instability could be unambiguously attributed to tRFC. I need the rig up for probably somewhere in the range of 42-56 hours of compute this weekend, but I'll try leaning on tRFC more if some slack time opens up.
I was at tRFC2 376 / tRFCSB 270 for DDR5-5600. Could not trigger an error even at 1.6 V. It didn't seem to matter if it was 1.25 V or higher; that was the lowest it would go and still boot. Changing it in Windows below this would instantly BSOD or freeze outright.

Still haven't fully explored other factors. But knowing the lowest tRFC is tied to frequency, it can still be played with. Higher CAS needs less voltage, and to an extent frequency and CAS are linked together.

So, inconclusive. All I found out is that 376-270 is the lowest it could be stable at for 5600, regardless of the voltage and the corresponding CAS linked to that voltage, for two different DIMMs using this specific SK Hynix A-die. A larger sample is needed to narrow down whether this is abnormal.
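Since "the lowest tRFC is tied to frequency" is really about the die needing a roughly fixed amount of time per refresh, here's a minimal conversion sketch (assuming the memory clock is half the data rate; the 376/270 figures are just the clock values from the post above, used as an example):

```python
import math

# Hedged sketch: converting tRFC between memory clock cycles and nanoseconds.
# Assumes the memory clock (MCLK) is half the data rate (e.g. DDR5-5600 -> 2800 MHz).
# The 376 / 270 clock values below are just the example numbers from this post.

def mclk_mhz(data_rate_mts: float) -> float:
    """Memory clock in MHz for a given DDR data rate in MT/s."""
    return data_rate_mts / 2

def clocks_to_ns(clocks: int, data_rate_mts: float) -> float:
    """Convert a timing from memory clocks to nanoseconds."""
    return clocks / mclk_mhz(data_rate_mts) * 1000

def ns_to_clocks(ns: float, data_rate_mts: float) -> int:
    """Convert a timing from nanoseconds to (rounded-up) memory clocks."""
    return math.ceil(ns * mclk_mhz(data_rate_mts) / 1000)

for clocks in (376, 270):
    ns = clocks_to_ns(clocks, 5600)
    print(f"{clocks} clocks at DDR5-5600 ~= {ns:.0f} ns "
          f"-> ~{ns_to_clocks(ns, 6400)} clocks for the same time at DDR5-6400")
```

Same absolute time, more clocks at a higher data rate, which is why the minimum stable clock count moves with frequency.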

UPDATE:
I had some nice in-person feedback from a data analyst. He pointed out that my graphs are labeled incorrectly because they aren't titled based on X & Y. This does not affect the data, which can still be read as is.

Secondly, it was assumed that when a graph flatlines at the end, it is understood to be showing equilibrium, i.e. the temperature will not rise further because the heat dissipated by the heat spreader keeps up with the thermal output of the memory.

Starting temperature is not the reason one frequency or voltage ends up above or below another. To prove this, I will need to make another chart where the temperature starts out at 60+°C (using a hair dryer) and plot the decline to the same equilibrium as previously shown.

Both will be done after I return from vacation.
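For what it's worth, this is the standard first-order picture behind that claim. A minimal sketch (the equilibrium temperature, time constant and start temperatures are made-up illustration values, not measurements from the article) showing that a pre-heated start decays to the same equilibrium a cold start rises to:

```python
import math

# Hedged sketch: exponential approach to thermal equilibrium (Newton's law of cooling).
# T(t) = T_eq + (T_0 - T_eq) * exp(-t / tau)
# T_eq is set by heat output vs. dissipation, so different starting temperatures T_0
# converge to the same T_eq. tau, T_eq and the start temps below are made-up values.

def temp_at(t_s: float, t_start: float, t_eq: float = 52.0, tau_s: float = 180.0) -> float:
    """Temperature (deg C) after t_s seconds, starting from t_start."""
    return t_eq + (t_start - t_eq) * math.exp(-t_s / tau_s)

for t_start in (25.0, 65.0):  # normal start vs. pre-heated with a hair dryer
    trace = [round(temp_at(t, t_start), 1) for t in (0, 180, 360, 720, 1440)]
    print(f"start {t_start} C -> {trace}")
```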
 
Anything over 1.4 V adds a lot of stress; 1.35 V seems to be the sweet spot.
It's not voltage that contributes that much to DIMM temperatures, it's tREFI and tRFC. Basically these two control how often (tREFI) and for how long (tRFC) the refresh pauses happen. By increasing tREFI and lowering tRFC you basically don't give the DIMMs much time to cool down. Voltage has much less of an impact unless you start pushing something crazy like 1.6+ volts.
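A rough sketch of the refresh-overhead arithmetic behind this (example clock values only, loosely based on the defaults mentioned in this thread): the share of time a rank spends refreshing is roughly tRFC / tREFI.

```python
# Hedged sketch: rough share of time a rank spends refreshing, ~ tRFC / tREFI.
# Both timings in memory clocks; the example values are illustrative, not measured.

def refresh_fraction(trfc_clocks: int, trefi_clocks: int) -> float:
    return trfc_clocks / trefi_clocks

examples = {
    "loose (default-ish)": (900, 7_000),    # long tRFC, short tREFI
    "tuned":               (450, 65_000),   # tightened tRFC, extended tREFI
}

for name, (trfc, trefi) in examples.items():
    print(f"{name:>20}: ~{refresh_fraction(trfc, trefi) * 100:.1f}% of time refreshing")
```

This only counts refresh time itself; whether less refresh downtime actually shows up as higher DIMM temperature is what the back-and-forth below ends up testing.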
 
It's not voltage that contributes that much to DIMM temperatures, it's tREFI and tRFC.
Hmm, this would be good to test and explore more. Though from my limited testing, 65k, 132k and 256k tREFI all had similar temperatures at 1.5 V versus the default 7k.

Same goes for when I tried 1.25 V: those four were all hitting the same temperatures at 1.25 V. This would suggest that, at least at 5600, what you're saying about tREFI is not true and voltage is the driver of temperature in this example.

What I'm seeing is that higher versus lower tREFI will error out once it passes a threshold. For example, I could sustain 256k at 1.25 V, but not at any higher voltage - because the voltage is lower, so is the temperature.

Can't comment on tRFC for another week, as I'm not home to check the raw data that I didn't make graphs from.
 
Hmm, this would be good to test and explore more. Though from my limited testing, 65k, 132k and 256k tREFI all had similar temperatures at 1.5 V versus the default 7k.

Same goes for when I tried 1.25 V: those four were all hitting the same temperatures at 1.25 V. This would suggest that, at least at 5600, what you're saying about tREFI is not true and voltage is the driver of temperature in this example.

What I'm seeing is that higher versus lower tREFI will error out once it passes a threshold. For example, I could sustain 256k at 1.25 V, but not at any higher voltage.

Can't comment on tRFC for another week, as I'm not home to check the raw data that I didn't make graphs from.
I would like to know more about it.
 
"In mid 2024 JEDEC finalized the DDR5-8800 standard"

Does JEDEC still use 1.100 V all the way up to this frequency for the standard?
 
"In mid 2024 JEDEC finalized the DDR5-8800 standard"

Does JEDEC still use 1.100 V all the way up to this frequency for the standard?
I believe so, but I'm not certain. 8800 CL62 is quite high otherwise.

 
Hmm, this would be good to test and explore more. Though from my limited testing, 65k, 132k and 256k tREFI all had similar temperatures at 1.5 V versus the default 7k.

Same goes for when I tried 1.25 V: those four were all hitting the same temperatures at 1.25 V. This would suggest that, at least at 5600, what you're saying about tREFI is not true and voltage is the driver of temperature in this example.

What I'm seeing is that higher versus lower tREFI will error out once it passes a threshold. For example, I could sustain 256k at 1.25 V, but not at any higher voltage - because the voltage is lower, so is the temperature.

Can't comment on tRFC for another week, as I'm not home to check the raw data that I didn't make graphs from.
That's because you went into diminishing-returns territory. Try default tRFC (900 or whatever it is) and then do 10k vs 65k tREFI. Then add in a tightened tRFC with 65k tREFI. There should be a huge increase in temperature. Preferably do all that without active cooling.
 
That's because you went into diminishing-returns territory. Try default tRFC (900 or whatever it is) and then do 10k vs 65k tREFI. Then add in a tightened tRFC with 65k tREFI. There should be a huge increase in temperature. Preferably do all that without active cooling.
It will be a good follow-up for sure. Though I'm willing to bet the results will be disappointing for one of us, since we're at odds here.
 
1.5 volts? Damn, that feels high even for DDR4.
 
I wonder if Igor considered the reflection factor for optical thermal measurements.

I trust thermocouples more than some thermal camera. Proven stuff. As long as the measurement electronics are decent, thermocouples should be very accurate.

-- Very interesting article
If you measure the black matte plastic surface of a chip package and choose an emissivity factor of 0.90 while ignoring reflections, how far off can you be? I don't know how to calculate an estimate, but a ~20°C error at just ~20°C above ambient seems huge here. The shiny PCB surface is trickier and the heatsink even more so, but you can't use a formula to account for reflections; you must minimise them.

This article by FLIR says that most flat-finish paints have an emissivity around 0.90. Also, for higher-emissivity objects, reflected temperature has less influence. For highly reflective surfaces (the heatsink and probably the PCB too) it advises placing a piece of black tape over the surface, then measuring the temperature at that point.
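As a rough back-of-the-envelope answer to "how far off can you be": thermal cameras solve something like measured radiance ≈ ε·σ·T_obj⁴ + (1−ε)·σ·T_refl⁴. Here's a minimal greybody sketch (the surface and room temperatures and the emissivity values are made up; real cameras also correct for atmosphere and optics):

```python
# Hedged sketch: apparent-temperature error from a simplified radiometric model.
# Measured radiance ~ eps*sigma*T_obj^4 + (1 - eps)*sigma*T_refl^4 (greybody, no atmosphere).
# Example temperatures and emissivities are illustrative only.

SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W/m^2/K^4

def c_to_k(c: float) -> float:
    return c + 273.15

def apparent_temp_c(t_obj_c: float, true_eps: float, t_refl_c: float, assumed_eps: float) -> float:
    """Temperature the camera reports if the surface emissivity is true_eps
    but the camera is set to assumed_eps, with reflected temperature t_refl_c."""
    radiance = (true_eps * SIGMA * c_to_k(t_obj_c) ** 4
                + (1 - true_eps) * SIGMA * c_to_k(t_refl_c) ** 4)
    t_obj4 = (radiance - (1 - assumed_eps) * SIGMA * c_to_k(t_refl_c) ** 4) / (assumed_eps * SIGMA)
    return t_obj4 ** 0.25 - 273.15

# Matte chip package (true eps ~0.95) read with the camera set to 0.90, 25 C surroundings:
print(round(apparent_temp_c(45.0, 0.95, 25.0, 0.90), 1))   # ~46 C, about 1 C off
# Bare aluminium heatsink (true eps ~0.10) read with the camera still set to 0.90:
print(round(apparent_temp_c(45.0, 0.10, 25.0, 0.90), 1))   # ~27 C, nearly 20 C off
```

With these made-up numbers the matte package reads within about a degree, while the bare heatsink reads close to room temperature, which fits the "black tape on shiny surfaces" advice.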

@ir_cow How did you attach the thermocouples to the surface? Did you use a goop of TIM or temporary glue? If you're using only tape to make the sensor touch the surface, I'd say it's insufficient.

"In mid 2024 JEDEC finalized the DDR5-8800 standard"

Does JEDEC still use 1.100 V all the way up to this frequency for the standard?
I'd also say yes, because JEDEC is meant for serious and conservative stuff (read: servers). 8800 MT/s at 1.1 V will probably become possible with MRDIMMs, where the memory chips operate at 4400 MT/s and only the multiplexer works at full data rate.
 
You're saying move away from a single DIMM setup?
Depends what you want to cover. If the data I have is anything to go by, a single 1R DIMM will be the coolest and thus the easiest to highmark with. But if performance in memory-hungry apps is important, a dual-DIMM setup is needed to utilize both DDR channels. And if 2x48 isn't enough, then 4x32 or 4x48 is necessary. I see the highest temperatures and the least airflow response in 2DPC 2R, which is unsurprising as it's the densest config.

For any app, including a memory stress test, the CPU's in the mix. I understand wanting to minimize its effects, but I'm not sure that's helpful for understanding the thermal requirements of a build. I don't have an Arrow Lake to test on as yet, but Intel's hitting ~120 GB/s of DDR bandwidth for performance broadly comparable to what Zen 5 does at ~70 GB/s. It seems plausible either departure from the 13900K's ~100 GB/s influences DIMM temperatures.
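For context on those bandwidth figures, a quick sketch of theoretical peak dual-channel DDR5 bandwidth (data rate × 8 bytes per 64-bit channel × 2 channels; the achieved numbers quoted above are naturally lower):

```python
# Hedged sketch: theoretical peak bandwidth for dual-channel DDR5.
# Peak GB/s = data rate (MT/s) * 8 bytes per 64-bit channel * number of channels / 1000.

def peak_gbps(data_rate_mts: float, channels: int = 2, bytes_per_channel: int = 8) -> float:
    return data_rate_mts * bytes_per_channel * channels / 1000

for rate in (5600, 6000, 6400, 8000):
    print(f"DDR5-{rate}: ~{peak_gbps(rate):.1f} GB/s theoretical peak")
```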

It will be a good follow-up for sure.
I'd suggest including auto refresh settings as a control alongside the tightened values, as that's what folks running EXPO/XMP or just turning up clocks and voltage are going to be using. For example, I tightened the 2x48GB M-die I'm working with from
  • 3.90 μs tREFI, tRFC-tRFC2-tRFCsb 1145-615-531 to
  • 5.85 μs, 480-288-244
The tRFC changes bench several percent higher. The longer tREFI barely increases bandwidth, hardly reduces latency, improves benchmarks almost negligibly, and lowers active power by 2%. It does reduce idle power by ~13% with the tightened tRFC-tRFC2-tRFCsb. I can't measure any difference putting tREFI above 5.85 μs, so there doesn't appear to be functional value in doing the cooling needed for 20+ μs.

Also, if I tighten tRFC-tRFC2-tRFCsb to 448-244-200, the M-die boots but instantly BSODs in OCCT.
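As a rough read on what that before/after change means for refresh overhead, here's a minimal sketch; I'm assuming a ~3000 MHz memory clock (about DDR5-6000) to turn the microsecond tREFI values into clocks, since the kit's data rate isn't stated here:

```python
# Hedged sketch: refresh overhead (~ tRFC / tREFI) for the before/after settings above.
# Assumes a 3000 MHz memory clock (roughly DDR5-6000) to convert tREFI from microseconds
# to clocks; the actual kit speed isn't stated, so treat the percentages as illustrative.

MCLK_MHZ = 3000  # assumed memory clock

def us_to_clocks(us: float) -> int:
    return int(us * MCLK_MHZ)

settings = {
    "before": {"trefi_us": 3.90, "trfc_clocks": 1145},
    "after":  {"trefi_us": 5.85, "trfc_clocks": 480},
}

for name, s in settings.items():
    trefi_clocks = us_to_clocks(s["trefi_us"])
    overhead_pct = s["trfc_clocks"] / trefi_clocks * 100
    print(f"{name}: tREFI ~{trefi_clocks} clocks, tRFC {s['trfc_clocks']} clocks, "
          f"~{overhead_pct:.1f}% of time refreshing")
```

Under that assumption the tightened tRFC does most of the work cutting refresh overhead, which lines up with the observation that pushing tREFI further adds little.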
 
Interesting! I like that you tested tREFI and temp!
 