It's been quite a while since I posted my last benchmarks. It's nice not only to post random things here on TPU from time to time, but to return with something really BIG. This also marks my 10th anniversary both here on TPU and since I first started benchmarking in general. Grab some popcorn, because these RAM benchmarks will be long and boring... ;D
Since I am sitting on Intel X99, I thought it would be a nice idea to present all the possible memory performance benefits for this platform in different configurations, as most of the ''revelations'' ''revealed'' here will also be ''relevant'' for Intel X299 systems, as well as for X79 systems, as they both support quad channel and ECC memory. For this memory comparison specifically I am venturing into the world of workstation and productivity tasks. Folks who work with video conversion, financial, mathematical and office calculations, compiling, archiving, 3D modeling, rendering and product & technological design will find their memory sweet spot.
Best performance for the least cash and time investment is the key aspect of any process design, and the same criterion applies to high-end Intel workstation tasks.
OBJECTIVES
1. Do you need ECC memory or just standard memory? ECC memory is not required for workstations, but if the price and performance are adequate, it is highly recommended, as it will increase system stability and make the system more fault tolerant. I will test the difference in performance between ECC DDR4 and non-ECC DDR4 modules of the same speed, latencies and capacity, and leave the pricing to you.
2. How much benefit would quad channel memory bring over dual channel and single channel? There is an undeniable benefit in increased bandwidth, but not all programs will benefit. Thus the second task is to find out which workstation tasks benefit the most.
3. The amount of memory needed is the most straightforward question you can imagine. I won't even bother with 8 GB as a starting point and I am sure you will agree. 16 GB vs. 32 GB will be the focus here. I also tested 48 GB of RAM, but I can tell you straight away that not a single tested program benefited from going from 32 GB quad channel to 48 GB quad channel (differences were within a 1 % margin of error), so I won't post the 48 GB results, just to save time. That does not mean, however, that 48 GB is never needed. Adobe After Effects is the prime example of usage exceeding 32 GB, and there are more examples, but not in my test.
4. Finally, for those who want to use non-ECC memory, the most important thing is overclocking. Aggressive memory timings vs. the JEDEC standard might justify the pricing of faster RAM, and that definitely has to be checked.
My focus for this test is DDR4 ECC memory performance analysis (and comparison with non-ECC DDR4 memory), thus no Core i7 CPU would suffice here. I also did not want to test a CPU with hyper-threading, as virtual cores sometimes mess up thread priorities in some programs. To avoid that, a pure 10 core Haswell-EP Xeon E5-2663 V3 with 25 MB L3 cache (Genuine Intel CPU) will be the workhorse. All 10 cores work at 3.1 GHz in turbo mode. By CPU specification the supported memory speed is from 1600 MHz to 2133 MHz only, so even a 3000 MHz DDR4 kit, which the X99 platform itself supports, will still run at 2133 MHz max! That's why the only way to overclock the memory here is to lower the memory latencies! I found the multi-threaded performance of this Xeon E5-2663 V3 to be practically identical to the Core i5 8600K overclocked to 5.0 GHz, which I tested before, so you get the picture. The motherboard is an Asrock X99 Extreme4, BIOS version P3.80.
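To put the "lower the latencies" point into numbers, here is a minimal Python sketch that converts CAS latency from clock cycles to nanoseconds with the usual first-word formula (cycles x 2000 / data rate). The function name is a placeholder of mine, and it assumes the "2133 MHz" label is the data rate in MT/s, as is common for DDR4 kits. It only covers the CL value, not the other timings.

```python
def cas_latency_ns(cl_cycles: int, data_rate_mts: float) -> float:
    """First-word CAS latency in nanoseconds.

    DDR memory transfers twice per clock, so one clock period in ns
    is 2000 / data_rate (data rate in MT/s, assumed to be the kit's
    advertised "MHz" figure).
    """
    return cl_cycles * 2000.0 / data_rate_mts


if __name__ == "__main__":
    stock = cas_latency_ns(15, 2133)  # CL15 at 2133
    tuned = cas_latency_ns(11, 2133)  # CL11 at 2133 (tightened timings)
    print(f"CL15 @ 2133: {stock:.2f} ns")
    print(f"CL11 @ 2133: {tuned:.2f} ns")
    print(f"CAS latency reduction: {(1 - tuned / stock) * 100:.1f} %")
```

At a fixed 2133 data rate this works out to roughly 14.06 ns for CL15 and 10.31 ns for CL11, so tightening CL alone cuts the CAS part of the access time by about 27 %. How much of that shows up in a real program is exactly what the tests have to answer.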
TESTED RAM CONFIGURATIONS (all memory is unbuffered and double sided)
1. Crucial 16 GB SINGLE CHANNEL DDR4 ECC 2133 MHz CL15-15-15-36-300
2. Crucial 16 GB DUAL CHANNEL DDR4 ECC 2133 MHz CL15-15-15-36-300
3. Crucial 32 GB DUAL CHANNEL DDR4 ECC 2133 MHz CL15-15-15-36-300
4. Crucial 32 GB QUAD CHANNEL DDR4 ECC 2133 MHz CL15-15-15-36-300
5. Crucial 48 GB QUAD CHANNEL DDR4 ECC 2133 MHz CL15-15-15-36-300 (will not be included)
6. G.Skill 16 GB DUAL CHANNEL DDR4 non-ECC 2133 MHz CL15-15-15-36-300
7. G.Skill 16 GB DUAL CHANNEL DDR4 non-ECC 2133 MHz CL11-12-12-30-300 overclocked
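For orientation, the theoretical peak bandwidth of the single, dual and quad channel configs can be sketched as below. This is a back-of-the-envelope calculation, not a measurement: it assumes a 64-bit (8 byte) data path per channel and treats 2133 as the data rate in MT/s. The extra ECC check bits carry no payload, so ECC and non-ECC modules get the same theoretical figure. The function name is made up for this example.

```python
BYTES_PER_TRANSFER = 8  # assumed 64-bit data bus per channel (ECC bits excluded)


def peak_bandwidth_gbs(channels: int, data_rate_mts: float) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes)."""
    return channels * BYTES_PER_TRANSFER * data_rate_mts / 1000.0


if __name__ == "__main__":
    for name, channels in (("single", 1), ("dual", 2), ("quad", 4)):
        print(f"{name:>6} channel @ 2133: {peak_bandwidth_gbs(channels, 2133):.1f} GB/s")
```

Real programs never reach these ceilings, which is why the interesting question is which workloads scale with channel count at all.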
All the tested programs, as well as their temporary files and output paths, are located on a Samsung 860 Pro 1 TB SATA SSD. The OS is Windows 10 v1903. I tested synthetic, semi-synthetic and custom programs, including all the separate Passmark 9 CPU and RAM tests and all the separate SPEC Workstation CPU tests.
VIDEO PRESENTATION
-----------------------------------------------------------------------------------------------------
Cinebench R20
The values have been rounded to the nearest ten, as the results only ranged from 2780 to 2820 points, which is within the margin of error.
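As a rough illustration of why that range counts as noise, the sketch below expresses the 2780 to 2820 spread as a percentage of its midpoint and shows the rounding to the nearest ten. Both helper names are placeholders, and the single score passed to the rounding helper is a hypothetical value, not one of the measured results.

```python
def relative_spread_pct(low: float, high: float) -> float:
    """Spread between the lowest and highest score, in % of their midpoint."""
    midpoint = (low + high) / 2.0
    return (high - low) / midpoint * 100.0


def round_to_nearest_ten(score: int) -> int:
    """Round an integer score to the nearest ten (Python rounds exact halves to even)."""
    return round(score, -1)


if __name__ == "__main__":
    print(f"Spread of 2780..2820: {relative_spread_pct(2780, 2820):.2f} %")
    # 2793 is a hypothetical score used only to show the rounding.
    print(f"2793 rounded: {round_to_nearest_ten(2793)}")
```

The whole range is about 1.4 % wide, so differences between RAM configs in this test are not worth reading into.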
V-ray 4.10.6
With V-ray we see something interesting. In 5 runs for each tested config, the non-ECC value RAM seems to perform the worst, even though the timings and speed are the same. Some runs were as low as 8100... Even the overclocked non-ECC RAM seems to be only catching up to DDR4 ECC here.
Luxmark C++
Luxmark's native C++ mode (CPU rendering without OpenCL, not code compiling) seems to benefit from non-ECC RAM with the fastest timings, but any RAM config will do.
3DPM 2.1
The 3D particle movement simulation shows no preference for any RAM config. Scores are in millions of operations per second.
7-ZIP 19.0 compression at 16 GB RAM allocation
When it comes to file compression, everything matters: quantity, bandwidth, timings... Scores are in millions of instructions per second.
7-ZIP 19.0 decompression at 16 GB RAM allocation
But it ain't so true for file decompression. Here, timings play the most important role. Scores are in millions of instructions per second.
Corona 1.3
Even a single channel 16 GB DIMM does not hold rendering back.
Blender benchmark 1.0 beta2 quick CPU
And the same goes for Blender. Granted, I've only tested the recently released beta quick CPU test, so I cannot guarantee the same result for independent custom demos. However, and this is important, the 5 most popular demos (BMW, CLASSROOM, BARCELONA PAVILION, SPLASH and COSMOS LAUNDROMAT DEMO) showed only a minor difference between 16 GB and 32 GB RAM, nothing significant. I can assure you, though, that 3DS Max will benefit significantly from 32 GB RAM over 16 GB. This was confirmed in my previous CPU benchmark, in which I did not talk about memory advantages, but I mention it now so that you can take it into consideration.
Handbrake 1.2.2
For this test I am using a custom 8.04 GB raw AVI video file at 3840x2160 resolution, which I convert to H.265 in an MKV container: 8 bit, CFR, 25 FPS, 64 Mbps, single pass, auto encoder profile, slower preset. With that being said, even though 4K conversion fully utilizes multiple cores, there is no difference between memory configs.
Y-cruncher 0.7.7.9501 11.8 GB RAM allocation Pi numbers multi-threaded
Well, whatever this Pi calculation deals with, it surely likes bandwidth. The difference between quad and dual channel is not that big, but single channel memory users would suffer greatly. Very interesting is the fact that even the overclocked non-ECC memory has nothing on ECC memory, and tighter timings bring very little performance boost. The test is very accurate, and multiple repeats show the same result to the second.
Microsoft Excel Pro Plus 2016
Custom sheet calculation. All RAM configs finish the bench within about 0.5 s of each other, with single channel DDR4 ECC RAM at close to 51 s and overclocked non-ECC RAM at close to 50 s.
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Before going into anything, my Xeon E5-2663 V3 at 3.2 GHz OC scored 15500 points in Passmark CPUmark, which is about the same result as a stock Core i7 8700K. This is the reason why the overall Passmark score SUCKS, as it reveals absolutely no difference between the processors (the Core i7 8700K is faster, but how would you know from Passmark?). Dissecting Passmark into separate tests is the only way to find out what is what, and this applies to memory too.
Passmark 9 CPU integer math
Looks ok so far.
Passmark 9 CPU prime numbers
Do these prime numbers have some ECC affinity or what? Whenever I see ECC RAM beating standard RAM I redo the test at least 10 times, but eventually get the same results. Whoever said that ECC RAM is inferior in performance to non-ECC RAM was so, so wrong, as this is not the first time we see this bizarre occurrence. Some 20 repeat tests for each RAM config were done to confirm this.
Passmark 9 CPU physics
A major win for quad channel memory and ECC RAM in general. Each retest gives 5 to 10 % different results, so the best score for each RAM config out of 20+ runs was taken.
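With that much run-to-run variation, it helps to look at more than one number per config. The following is a minimal sketch of how a batch of repeated runs could be summarized as best score, median and spread. The helper name and the example scores are hypothetical placeholders, not data from this test.

```python
from statistics import median


def summarize_runs(scores):
    """Return best score, median and run-to-run spread (in % of the best run)."""
    if not scores:
        raise ValueError("need at least one run")
    best = max(scores)
    worst = min(scores)
    return {
        "runs": len(scores),
        "best": best,
        "median": median(scores),
        "spread_pct": (best - worst) / best * 100.0,
    }


if __name__ == "__main__":
    # Hypothetical placeholder scores, NOT results from this review.
    example_runs = [1000, 960, 985, 930, 1010, 975]
    summary = summarize_runs(example_runs)
    print(f"best of {summary['runs']} runs: {summary['best']}")
    print(f"median: {summary['median']}")
    print(f"spread: {summary['spread_pct']:.1f} %")
```

Taking the best run shows what each config can do at its peak, while the median and spread show how repeatable that peak is.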
Passmark 9 CPU floating point math
One of the most accurate tests in Passmark shows variation of less than 0.5 % with each retest.
Passmark 9 CPU extended instructions
The weirdest result ever. The 704 points for each ECC RAM config would repeat forever, and then drop to something like 703 once... For non-ECC RAM it's the same story, with 585 points at most. There is no difference between single and quad channel, no difference in quantity, no difference in memory timings, just this absolutely bizarre, radical ECC vs. non-ECC RAM split. Can anybody offer some insight into WTF is going on here?
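To size up that gap, here is a small sketch that expresses the 704 vs. 585 point result as a relative difference. The function name is a placeholder. The two scores are the ones quoted above.

```python
def percent_gain(score: float, baseline: float) -> float:
    """Relative difference of score over baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (score - baseline) / baseline * 100.0


if __name__ == "__main__":
    ecc, non_ecc = 704, 585  # scores quoted in the text
    print(f"ECC over non-ECC: {percent_gain(ecc, non_ecc):+.1f} %")
```

That is a lead of roughly 20 % for the ECC configs, far beyond the 1 % noise seen in that same test.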
Passmark 9 CPU encryption
Another stable test.
Passmark 9 CPU sorting
Go ahead and ''sort'' how the overclocked RAM lost to value RAM.
Passmark 9 RAM database operations
Previously we evaluated how different RAM configs alter the CPU performance; now we look into the raw RAM performance itself.
Passmark 9 RAM read cached
A big performance jump from the improved memory timings of the non-ECC RAM is noticeable. And yes, single channel seems to be the fastest config???
Passmark 9 RAM read uncached
No matter how many times I tried to repeat the 32 GB RAM quad channel result, it would only be around 11000 points, while the 32 GB RAM dual channel result would always stay at 12000 points. Perhaps this test is very sensitive to latencies?
Passmark 9 RAM write
Writing is another thing. Single channel RAM suffers a huge performance penalty.
Passmark 9 RAM threaded
And now we have the absolutely best result for quad channel memory users.
Passmark 9 RAM latency
While the primary RAM timings are clearly defined, there is much more to it than that. This is the proper test to see how fast the RAM really responds. So even though 32 GB quad channel RAM has the same timings as 32 GB dual channel RAM, the quad channel config brings slightly increased latency.
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SPEC (Standard Performance Evaluation Corporation) provides a wide variety of apps to test the performance of computer parts by simulating the most popular and demanding third party programs in the world. I am using the SPEC Workstation benchmark, which has CPU, GPU and storage tests. I will be using all of the CPU tests except 7-ZIP, Handbrake, Blender and Luxmark, as those have been tested independently. This time around I won't comment on anything. These SPEC tests take an incredibly long time to run, but they are quite accurate, with the exception of FSI, which I excluded because the variations in results were too great.
SPEC Workstation CalculiX
SPEC Workstation WPCcfd
SPEC Workstation rodiniaCFD
SPEC Workstation lammps
SPEC Workstation namd
SPEC Workstation rodiniaLifeSci
SPEC Workstation Convolution
SPEC Workstation FFTW
SPEC Workstation Kirchhoff
SPEC Workstation poisson
SPEC Workstation srmp
SPEC Workstation octave
SPEC Workstation python36
----------------------------------------------------------------------------------------------------------------------
As I said earlier, the difference between 32 GB quad channel and 48 GB quad channel was too small to showcase anything.
In many tests DDR4 ECC managed to outperform the standard non-ECC DDR4 RAM. Out of interest I will run additional tests with a different 2X8 GB non-ECC DDR4 kit and will report back if anything interesting turns up.
In the future I will make productivity benchmarks 2.0, which will be the follow-up to the popular test I did previously. This has been one of the most brain-squashing benchmarks I have ever done here on TPU. Please don't demand ''do that and that''. Just be patient.
So now you know what you need. Even if you are still using a Xeon with X79 and DDR3, many of these findings should apply nicely to your system. X299 users will additionally benefit from higher frequency DDR4.