
Workstation DDR4 memory benchmarks: ECC vs. non-ECC, 16 GB vs. 32 GB, single vs. dual vs. quad channel, overclocked vs. default timings

Joined
Feb 22, 2009
Messages
762 (0.14/day)
System Name Lenovo 17IMH05H
Processor Core i7 10750H
Video Card(s) GTX 1660 Ti
Audio Device(s) SSL2
Software Windows 10 Pro 22H2
Benchmark Scores i've got a shitload of them in 15 years of TPU membership
It's been quite a while since i posted my last benchmarks. It's nice not only to post random things here on TPU from time to time, but to return with something really BIG. This also marks my 10th anniversary both here on TPU and since i first started benchmarking in general. Grab some popcorn, because these RAM benchmarks will be long and boring... ;D

Since i am sitting on Intel X99, i thought it would be a nice idea to present all the possible memory performance benefits for this platform in different configurations, as most of the ''revelations'' ''revealed'' here will also be ''relevant'' for Intel X299 systems, as well as for X79 systems, as they both support quad channel and ECC memory. For this memory comparison specifically i am venturing into the world of workstation and productivity tasks. Folks who work with video conversion, financial, mathematical and office calculations, compiling, archiving, 3D modeling, rendering and product & technological design will find their memory sweet spot.

Best performance for the least cash and time investment is the key aspect of any process design, and the same criterion applies to high-end Intel workstation tasks.

OBJECTIVES


1. Do you need ECC memory or just standard memory? ECC memory is not required for workstations, but if the price and performance are adequate, it is highly recommended, as it increases system stability and makes the system more fault tolerant. I will test the difference in performance between ECC DDR4 and non-ECC DDR4 modules of the same speed, latencies and capacity, and leave the pricing to you.

2. How much benefit would quad channel memory bring over dual channel and single channel? There is an undeniable benefit in increased bandwidth, but not all programs will benefit. Thus the second task is to find out which workstation tasks benefit the most.

3. Quantity of needed memory is the most straightforward question you can imagine. I won't even bother with 8 GB as a starting point and i am sure you will agree. 16 GB vs. 32 GB will be the focus here, and while i also tested 48 GB of RAM, i can tell you straight away, that not a single tested program benefited going from 32 GB quad channel to 48 GB quad channel (1 % margin of error), therefore i won't bother posting 48 GB results just to save time. That does not mean, however, that 48 GB are not needed. Adobe After Effects is the prime example when usage exceeds 32 GB, and there are more examples, but not in my test.

4. Finally, for those who want to use non-ECC memory, the most important thing is overclocking. Aggressive memory timings vs. the JEDEC standard might justify the pricing of faster RAM, and that definitely has to be checked.

My focus for this test is DDR4 ECC memory performance analysis (and comparison with non-ECC DDR4 memory), thus no Core i7 CPU would suffice here. I also did not want to test a CPU with hyper-threading, as virtual cores sometimes mess up thread priorities in some programs. To avoid that, a pure 10 core Haswell-EP Xeon E5-2663 V3 with 25 MB L3 cache (Genuine Intel CPU) will be the workhorse. All 10 cores work at 3.1 GHz in turbo mode. By CPU specification the supported memory is from 1600 MHz to 2133 MHz only, thus a 3000 MHz DDR4 kit, which the X99 platform itself supports, will still run at 2133 MHz max! That's why the only way to overclock the memory is to lower the memory latencies! I found the multi-threaded performance of this Xeon E5-2663 V3 to be practically identical to the Core i5 8600K overclocked to 5.0 GHz, which i tested before, so you get the picture. The motherboard is an ASRock X99 Extreme4 with BIOS version P3.80.

TESTED RAM CONFIGURATIONS (all memory is unbuffered and double sided)

1. Crucial 16 GB SINGLE CHANNEL DDR4 ECC 2133 MHz CL15-15-15-36-300
2. Crucial 16 GB DUAL CHANNEL DDR4 ECC 2133 MHz CL15-15-15-36-300
3. Crucial 32 GB DUAL CHANNEL DDR4 ECC 2133 MHz CL15-15-15-36-300
4. Crucial 32 GB QUAD CHANNEL DDR4 ECC 2133 MHz CL15-15-15-36-300
5. Crucial 48 GB QUAD CHANNEL DDR4 ECC 2133 MHz CL15-15-15-36-300 (will not be included)
6. G.Skill 16 GB DUAL CHANNEL DDR4 non-ECC 2133 MHz CL15-15-15-36-300
7. G.Skill 16 GB DUAL CHANNEL DDR4 non-ECC 2133 MHz CL11-12-12-30-300 overclocked
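
As a rough guide to what the CL numbers above mean in absolute time, the first-word CAS latency can be estimated from the CL value and the data rate. This is only a minimal Python sketch: the function name is made up for illustration, it uses the nominal 2133 MT/s figure, and it ignores every timing other than CL.

```python
def cas_latency_ns(cl_cycles: int, data_rate_mts: float) -> float:
    """Estimate first-word CAS latency in nanoseconds.

    DDR transfers data twice per clock, so the memory clock in MHz
    is half the data rate in MT/s.
    """
    clock_mhz = data_rate_mts / 2.0
    return cl_cycles / clock_mhz * 1000.0


stock = cas_latency_ns(15, 2133)   # CL15 kits from the list above
tight = cas_latency_ns(11, 2133)   # overclocked CL11 config

print(f"CL15 @ 2133: {stock:.2f} ns")                   # 14.06 ns
print(f"CL11 @ 2133: {tight:.2f} ns")                   # 10.31 ns
print(f"reduction:   {(1 - tight / stock) * 100:.1f} %")  # 26.7 %
```

So at a fixed 2133 MHz, going from CL15 to CL11 cuts the CAS part of the access time by roughly a quarter, which is the only "overclock" this platform allows.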

All the tested programs, as well as their temporary files and output paths, are located on a Samsung 860 Pro 1 TB SATA SSD. Windows 10 v1903 is installed. I tested synthetic, semi-synthetic and custom programs, including the full Passmark 9 CPU and RAM tests separately, and the full SPEC Workstation CPU tests separately.

VIDEO PRESENTATION

-----------------------------------------------------------------------------------------------------
Cinebench R20




The values have been rounded to the nearest ten, as the results only ranged from 2780 to 2820 points, which is within the margin of error.
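
To put that range in perspective, here is a minimal Python sketch of the spread relative to the midpoint. The function name is made up for illustration; only the 2780 and 2820 figures come from the runs above.

```python
def relative_spread_percent(lowest: float, highest: float) -> float:
    """Spread between the lowest and highest score, as a percentage of their midpoint."""
    midpoint = (lowest + highest) / 2.0
    return (highest - lowest) / midpoint * 100.0


spread = relative_spread_percent(2780, 2820)
print(f"Cinebench R20 spread: {spread:.2f} %")  # 1.43 %
```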

V-ray 4.10.6



With V-ray we see something interesting. In 5 runs for each tested config, the non-ECC value RAM seems to perform the worst, even though the timings and speed are the same. Some runs were as low as 8100... Even the overclocked non-ECC RAM only seems to be catching up with DDR4 ECC here.

Luxmark C++



LuxMark's C++ mode (rendering on the CPU through native C++ code rather than OpenCL) seems to benefit from non-ECC RAM with the fastest timings, but any RAM config will do.

3DPM 2.1



A 3D movement algorithm simulation shows no preference for any RAM config. Calculated in millions of operations per second.

7-ZIP 19.0 compression at 16 GB RAM allocation



When it comes to file compression, everything matters - quantity, bandwidth, timings... Calculated in millions of instructions per second.

7-ZIP 19.0 decompression at 16 GB RAM allocation



But it ain't so true in file decompression. Here, timings play the most important role. Calculated in millions of instructions per second.

Corona 1.3



Even a single channel 16 GB DIMM does not suffer in rendering.

Blender benchmark 1.0 beta2 quick CPU



And the same goes for Blender. Granted, i've only tested the recently released beta quick CPU test, so i can not guarantee the same result for independent custom demos. However, and this is important, the 5 most popular demos (BMW, CLASSROOM, BARCELONA PAVILION, SPLASH, and COSMOS LAUNDROMAT DEMO) showed only a minor difference between 16 GB and 32 GB RAM, nothing significant. On the other hand, i can assure you that 3DS Max will significantly benefit from 32 GB RAM over 16 GB. This was confirmed in my previous CPU benchmark, in which i did not talk about memory advantages, but i mention it now so that you can take it into consideration.

Handbrake 1.2.2



For this test i am using a custom 8.04 GB raw AVI video file at 3820X2160 resolution, which i convert to H.265, MKV, 8 bit, CFR, 25 FPS, at 64 Mp/s, single pass, auto encoder profile, slower preset. With that being said, even though 4K conversion fully utilizes multiple cores, there is no difference between memory configs.

Y-cruncher 0.7.7.9501 11.8 GB RAM allocation Pi numbers multi-threaded



Well, whatever this Pi calculation deals with, it surely likes bandwidth. Difference between quad and dual channel is not that big, but single channel memory users would suffer greatly. Very interesting is the fact that even the overclocked non-ECC memory has got nothing on ECC memory, and overclocking timings bring very little performance boost. The test is very accurate and multiple repeats show the same result to the second.

Microsoft Excel Pro Plus 2016



Custom sheet calculation. All RAM configs actually calculate the bench within 0.5 second difference, with single channel DDR4 ECC RAM at near 51 s, and overclocked non-ECC RAM at near 50 s.
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Before going into anything, my Xeon E5-2663 V3 at 3.2 GHz OC scored 15500 points in Passmark CPUmark, which is about the same result as a stock Core i7 8700K. This is the reason why the general Passmark score SUCKS, as it reveals absolutely no difference between processors (Core i7 8700K is faster, but how would you know from Passmark?). Dissecting Passmark into separate tests is the only way to find out what is what, and this applies to memory too.

Passmark 9 CPU integer math



Looks ok so far.

Passmark 9 CPU prime numbers



Do these prime numbers have some ECC affinity or what? Whenever i see ECC RAM beating standard RAM i redo the test at least 10 times, but eventually get the same results. Whoever said that ECC RAM is inferior in performance to non-ECC RAM was so so wrong, as this is not the first time we see this bizarre occurrence. Some 20 repeated tests for each RAM config were done to confirm this.

Passmark 9 CPU physics



A major win for quad channel memory and ECC RAM in general. Each retest gives results that differ by 5 to 10 %, so the best score for each RAM config out of 20+ runs was taken.

Passmark 9 CPU floating point math



One of the most accurate tests in Passmark shows variation of less than 0.5 % with each retest.

Passmark 9 CPU extended instructions



The weirdest result ever. The 704 points for each ECC RAM config would repeat forever, and then drop to like 703 once... For non-ECC RAM it's the same way, with 585 points at max. There is no difference between single and quad channel, no difference in quantity, no difference in memory timings, just this absolutely bizarre, radical ECC vs. non-ECC RAM differentiation. Can anybody offer some insight into WTF is going on here?

Passmark 9 CPU encryption



Another stable test.

Passmark 9 CPU sorting



Go ahead and ''sort'' how the overclocked RAM lost to value RAM.

Passmark 9 RAM database operations



Previously we evaluated how different RAM configs would alter the CPU performance, now we look into the raw RAM performance itself.

Passmark 9 RAM read cached



Big performance jump due to improved memory timings in non-ECC RAM is noticed. And yes, single channel seems to be the fastest config???

Passmark 9 RAM read uncached



No matter how many times i repeated the 32 GB RAM quad channel test, the result would only be around 11000 points, while the 32 GB RAM dual channel result would always stay at 12000 points. Perhaps it is very sensitive to latencies?

Passmark 9 RAM write



Writing is another thing. Single channel RAM suffers a huge performance penalty.

Passmark 9 RAM threaded



And now we have the absolutely best result for quad channel memory users.

Passmark 9 RAM latency



While the primary RAM timings are clearly defined, there is so much more to that. This is the proper test to see how really fast the RAM response is. So even though 32 GB quad channel RAM has the same timings as 32 GB dual channel RAM, quad channel config brings up a slightly increased latency.
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

SPEC, the Standard Performance Evaluation Corporation, provides a wide variety of apps to test the performance of computer parts by simulating the most popular and demanding third party programs in the world. I am using the SPEC Workstation benchmark, which has CPU, GPU and storage tests. I will be using all of the CPU tests except 7-ZIP, Handbrake, Blender and Luxmark, as those have been tested independently. This time around i won't comment on anything. These SPEC tests take an incredibly long time to run, but they are quite accurate, with the exception of FSI, which i excluded because the variations in results were too great.

SPEC Workstation CalculiX



SPEC Workstation WPCcfd



SPEC Workstation rodiniaCFD



SPEC Workstation lammps



SPEC Workstation namd



SPEC Workstation rodiniaLifeSci



SPEC Workstation Convolution



SPEC Workstation WWTF



SPEC Workstation Kirchhoff



SPEC Workstation poisson



SPEC Workstation srmp



SPEC Workstation octave



SPEC Workstation python36


----------------------------------------------------------------------------------------------------------------------


As i said earlier, the difference between 32 GB quad channel and 48 GB quad channel was too small to showcase anything.
In many tests DDR4 ECC managed to outperform the standard non-ECC DDR4 RAM. Out of interest i will run additional tests with a different 2X8 GB non-ECC DDR4 kit and will report back if anything interesting turns up.
In the future i will make productivity benchmarks 2.0, which will be the follow-up to the popular test done previously. This has been one of the most brain-squashing benchmarks i have ever done here on TPU. Please don't demand ''do that and that''. Just be patient.

So now you know what you need. Even if you are still using a Xeon with X79 and DDR3, many things should nicely apply for your system. X299 users will additionally benefit from higher frequency DDR4.
 

eidairaman1

The Exiled Airman
Joined
Jul 2, 2007
Messages
40,435 (6.58/day)
Location
Republic of Texas (True Patriot)
System Name PCGOD
Processor AMD FX 8350@ 5.0GHz
Motherboard Asus TUF 990FX Sabertooth R2 2901 Bios
Cooling Scythe Ashura, 2×BitFenix 230mm Spectre Pro LED (Blue,Green), 2x BitFenix 140mm Spectre Pro LED
Memory 16 GB Gskill Ripjaws X 2133 (2400 OC, 10-10-12-20-20, 1T, 1.65V)
Video Card(s) AMD Radeon 290 Sapphire Vapor-X
Storage Samsung 840 Pro 256GB, WD Velociraptor 1TB
Display(s) NEC Multisync LCD 1700V (Display Port Adapter)
Case AeroCool Xpredator Evil Blue Edition
Audio Device(s) Creative Labs Sound Blaster ZxR
Power Supply Seasonic 1250 XM2 Series (XP3)
Mouse Roccat Kone XTD
Keyboard Roccat Ryos MK Pro
Software Windows 7 Pro 64
Just remember ram channel doubling from single to dual and from dual to quad is maximum theoretical bandwidth, there is resistance, reactance and impedance in all conductors. No Conductor is 100% efficient.
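
For reference, that theoretical maximum is easy to put in numbers. A minimal Python sketch, assuming the nominal DDR4-2133 data rate and the standard 64-bit (8-byte) data path per channel (the extra 8 bits on ECC modules carry check bits, not data); the function name is made up for illustration, and real-world throughput will land below these figures:

```python
def peak_bandwidth_gbs(data_rate_mts: float, channels: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_second = data_rate_mts * 1e6 * bus_bytes * channels
    return bytes_per_second / 1e9


for name, channels in (("single", 1), ("dual", 2), ("quad", 4)):
    print(f"{name:>6} channel DDR4-2133: {peak_bandwidth_gbs(2133, channels):.1f} GB/s")
# single: 17.1 GB/s, dual: 34.1 GB/s, quad: 68.3 GB/s
```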
 
Joined
Aug 18, 2017
Messages
341 (0.14/day)
I couldn't find what exact kind of memory is being tested.
Are both ECC and nonECC memory UDIMMs?
What motherboard do you use? Do you verify in any way that the error correction is working?

I've seen many people incorrectly say "ECC" when they meant server memory types (registered, load reduced...).
While ECC UDIMMs mostly work on typical PCs (except that ECC itself is not functional), registered types don't.
 
Joined
Feb 22, 2009
Messages
762 (0.14/day)
System Name Lenovo 17IMH05H
Processor Core i7 10750H
Video Card(s) GTX 1660 Ti
Audio Device(s) SSL2
Software Windows 10 Pro 22H2
Benchmark Scores i've got a shitload of them in 15 years of TPU membership
I couldn't find what exact kind of memory is being tested.
Are both ECC and nonECC memory UDIMMs?
What motherboard do you use? Do you verify in any way that the error correction is working?

I've seen many people incorrectly say "ECC" when they meant server memory types (registered, load reduced...).
While ECC UDIMMs mostly work on typical PCs (except that ECC itself is not functional), registered types don't.

Actually, great question about ECC. Indeed, folks, how do you know if ECC is working or not? Because if the CPU does not support ECC, the RAM will work in non-ECC mode. So this is the way you check:



These codes imply that ECC is working correctly. First, the number 6 means multi-bit ECC. Second, the 72 bit total width (64 data bits plus 8 check bits) shows that the ECC bits are present; had the width been 64 bit only, it would mean non-ECC.
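
If the values are read from WMI (the MemoryErrorCorrection property of Win32_PhysicalMemoryArray, and TotalWidth / DataWidth of Win32_PhysicalMemory), they can be interpreted roughly as below. This is only a sketch under stated assumptions: the helper names are hypothetical, the sample values are typed in by hand (6 and 72 as described in this post, 64 assumed for the data width) rather than queried live, and the code only decodes the numbers.

```python
# MemoryErrorCorrection codes as documented for Win32_PhysicalMemoryArray.
ERROR_CORRECTION = {
    0: "Reserved",
    1: "Other",
    2: "Unknown",
    3: "None",
    4: "Parity",
    5: "Single-bit ECC",
    6: "Multi-bit ECC",
    7: "CRC",
}


def describe_error_correction(code: int) -> str:
    """Translate the numeric MemoryErrorCorrection code into its name."""
    return ERROR_CORRECTION.get(code, "Unrecognized")


def has_check_bits(total_width: int, data_width: int) -> bool:
    """TotalWidth includes check bits, DataWidth does not: 72 vs 64 means ECC bits exist."""
    return total_width > data_width


def ecc_looks_active(code: int, total_width: int, data_width: int) -> bool:
    """Both signals must agree: an ECC mode is reported and the module carries check bits."""
    return code in (5, 6) and has_check_bits(total_width, data_width)


# Hand-entered sample values (assumption, not a live query).
print(describe_error_correction(6))   # Multi-bit ECC
print(ecc_looks_active(6, 72, 64))    # True
print(ecc_looks_active(3, 64, 64))    # False
```

Even when both values look right, this only shows that ECC is reported as enabled; it does not prove that errors are actually being corrected.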

Also, there was so much stuff to write, that i forgot to fill in some of the details, i've edited that now in the tested ram configurations section. All memory is unbuffered (UDIMM). With that being said, does registered RAM even work on X99? Anyone know? Or is it RAM manufacturer specific and sensitive?
 
Joined
Aug 18, 2017
Messages
341 (0.14/day)
Afaik some people managed to make reg ram work on x99, as long as the cpu is a xeon. One example validation here
 
Joined
Jul 25, 2006
Messages
12,147 (1.87/day)
Location
Nebraska, USA
System Name Brightworks Systems BWS-6 E-IV
Processor Intel Core i5-6600 @ 3.9GHz
Motherboard Gigabyte GA-Z170-HD3 Rev 1.0
Cooling Quality case, 2 x Fractal Design 140mm fans, stock CPU HSF
Memory 32GB (4 x 8GB) DDR4 3000 Corsair Vengeance
Video Card(s) EVGA GEForce GTX 1050Ti 4Gb GDDR5
Storage Samsung 850 Pro 256GB SSD, Samsung 860 Evo 500GB SSD
Display(s) Samsung S24E650BW LED x 2
Case Fractal Design Define R4
Power Supply EVGA Supernova 550W G2 Gold
Mouse Logitech M190
Keyboard Microsoft Wireless Comfort 5050
Software W10 Pro 64-bit
Just remember ram channel doubling from single to dual and from dual to quad is maximum theoretical bandwidth
Right! With "theoretical" being the key word there. What we see in theory (and benchmarks) is rarely what we see in the real world. I would rather have more RAM in single channel than less RAM in dual or quad.
 
Joined
Feb 22, 2009
Messages
762 (0.14/day)
System Name Lenovo 17IMH05H
Processor Core i7 10750H
Video Card(s) GTX 1660 Ti
Audio Device(s) SSL2
Software Windows 10 Pro 22H2
Benchmark Scores i've got a shitload of them in 15 years of TPU membership
Thanks to sneaky, i've updated the thread with a video presentation!

Honestly, i have not looked into what these synthetic SPEC and PASSMARK tests represent in real life. You can see how quad channel RAM at the same size/speed/timings massively outperforms dual channel RAM in SPEC WWTF, SPEC rodiniaLifeSci, SPEC lammps, SPEC WPCcfd and Passmark RAM threaded, but what do these tests represent? Perhaps someone knows and someone might actually benefit from this... I feel like most people here would benefit more from ''yet another gaming benchmark'', of which there are plenty in internet media, but honestly, i have not seen anyone do a deep dive into testing RAM for workstation apps like i did here. I myself only cared about Handbrake, and given the results i made the right choice to switch to the mainstream X370 Ryzen platform.
 
Joined
Feb 22, 2009
Messages
762 (0.14/day)
System Name Lenovo 17IMH05H
Processor Core i7 10750H
Video Card(s) GTX 1660 Ti
Audio Device(s) SSL2
Software Windows 10 Pro 22H2
Benchmark Scores i've got a shitload of them in 15 years of TPU membership
I've rerun some tests with another kit of common DDR4: Kingston HyperX 2X8 GB 2666 MHz CL16, down-clocked to 2133 MHz CL15-15-15-36 to match the exact settings of the test bed.

Amazingly, in the majority of the WS programs it was slightly to notably faster than my G.Skill kit at the same frequency and timings!

In other words, one kit of 2X8 GB DDR4 2133 MHz CL15-15-15-36 can be faster than another kit of 2X8 GB DDR4 2133 MHz CL15-15-15-36.

This does explain why my Crucial ECC RAM for whatever reason managed to outperform the non-ECC G.Skill RAM... The key here must be secondary timings; i see no other explanation. I guess the G.Skill kit, originally targeted at 3600 MHz at CL19, just sucks when set to 2133 MHz CL15, as it retains high secondary timings, which i did not play with, while the Kingston kit had lower secondary timings by default. I never realized that secondary timings can impact performance so notably.
 