Workstation benchmarks: dual channel vs. quad channel RAM

How many of you with HEDT platforms run your memory in a quad channel layout not because you need it, but because it is ''cool'' to fill up all your slots? Most folks doing everyday tasks do not actually need quad channel RAM - basic rendering, video encoding and gaming don't really benefit from the extra bandwidth and will see minimal to no gains. This is why the Ryzen 9 3950X & 5950X, released for mainstream platforms, were so successful in the first place - most people who were building workstations, or anything ''sort of'' like one, would mostly do rendering and video production, where the Ryzen chips suffered no penalty from their dual channel memory layout and made the LGA2066 competition look very bad price/performance wise, just as the Intel processors did not benefit from quad channel memory in the applications that were selected for the reviews. However, rest assured that there are plenty of less common but important applications that benefit massively from a quad channel layout, and i ain't just talking server stuff. This time around i am not going to rant about how the reviewers of the Ryzen 9 3950X did not include enough proper workstation apps in their benchmarks, the kind that would actually benefit from quad channel systems - i will leave that for another thread. For now it is a pure RAM vs. RAM fight, and a benchmark for Techpowerup, which i have not done in a while.
That being said, let's move on to the test, which is based on:

Windows 10
Core i9 9960X 16 cores/32 threads (4.0 GHz all core turbo)
Asrock X299 Taichi
2X16 GB DDR4 3200 MHz CL16-18-18-36-2N SINGLE RANK 1Rx8 RAM in dual channel mode
4X8 GB DDR4 3200 MHz CL16-16-16-36-2N SINGLE RANK 1Rx8 RAM in quad channel mode

It is very important that both memory configurations differ only in their total bus width (128 bits vs 256 bits), and therefore in peak bandwidth, and nothing more, meaning that memory size, memory ranks and memory timings have to be identical. The timings for all sticks (primary, secondary and tertiary) are: 16-18-18-36-2-24-560-7-7-4-12-34-16-12480-8-4-7-4-7-1-4-1-1-5-6-6-6-1-1-1-3-5-3-3.
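
For reference, the theoretical peak bandwidth of the two layouts follows directly from the bus width and the transfer rate. A minimal sketch, assuming 64 bits per channel and 3200 MT/s (DDR4-3200); the function name is made up for illustration, and real sustained bandwidth will be lower than these peak numbers:

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_sec, channel_width_bits=64):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    # Each transfer moves the full bus width across all populated channels.
    bytes_per_transfer = channels * channel_width_bits // 8
    return bytes_per_transfer * mega_transfers_per_sec * 1_000_000 / 1e9


dual = peak_bandwidth_gbs(channels=2, mega_transfers_per_sec=3200)  # 128-bit bus
quad = peak_bandwidth_gbs(channels=4, mega_transfers_per_sec=3200)  # 256-bit bus

print(f"dual channel: {dual:.1f} GB/s")
print(f"quad channel: {quad:.1f} GB/s")
print(f"ratio: {quad / dual:.1f}x")
```

That works out to 51.2 GB/s for dual channel and 102.4 GB/s for quad channel, so 2x is the upper limit for any gain in the charts that follow.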


3D PARTICLE MOVEMENT 2.1 {higher score = better}
3DPM 2.1.jpg

Six repeated tests, from which the best runs were selected, show no real variation: each run was within about 1 % of the previous one.


7-ZIP 19.00 {higher score = better}
7-zip 19.00 compression.jpg

Across 5 to 10 selected runs, with results varying within 1 to 2 %, quad channel RAM shows massive performance gains in compression tasks.
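
A quick way to sanity-check a claim like this is to compare the gain between the best runs against the run-to-run spread. A minimal sketch, assuming higher-is-better scores; the function names and the numbers are made up for illustration and are NOT the results from the charts in this thread:

```python
def summarize_runs(scores):
    """Return (best score, run-to-run spread in percent) for higher-is-better scores."""
    best = max(scores)
    spread_percent = (best - min(scores)) / best * 100
    return best, spread_percent


def gain_vs_noise(dual_scores, quad_scores):
    """Compare best-of-N scores and report whether the gain exceeds the spread."""
    dual_best, dual_spread = summarize_runs(dual_scores)
    quad_best, quad_spread = summarize_runs(quad_scores)
    gain_percent = (quad_best - dual_best) / dual_best * 100
    noise_percent = max(dual_spread, quad_spread)
    return gain_percent, noise_percent, gain_percent > noise_percent


# Hypothetical scores, only here to show the calculation.
dual_runs = [100.0, 99.0, 98.5]
quad_runs = [130.0, 129.0, 128.0]

gain, noise, significant = gain_vs_noise(dual_runs, quad_runs)
print(f"gain: {gain:.1f} %, noise: {noise:.1f} %, beyond noise: {significant}")
```

A gain that is many times larger than the 1 to 2 % spread is real; a 2 % gain against a 2 % spread is not something to draw conclusions from.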


7-ZIP 19.00 {higher score = better}
7-zip 19.00 decompression.jpg

In decompression tasks there is a constant but small difference (up to 2 %) between the dual and quad channel configurations, slightly favoring quad channel RAM.


BLENDER 3.5.0 {higher score = better}
Blender 3.5.0 classroom.jpg

The Blender Benchmark 3.1 launcher offers 3 scenes for either CPU or GPU rendering and can run them with older or newer Blender versions. Using the current 3.5.0 version, all 3 scenes show absolutely no difference between dual and quad channel RAM, with consistently equal results, so i won't even bother showing the other two.


V-RAY 4.10.7 {higher score = better}
V-ray 4.10.7.jpg

V-Ray, as the name suggests, is a ray tracing renderer that can run on the CPU, the GPU, or both (heterogeneous). This time around, using the CPU, the results are within the margin of variation - less than 1 %.


LUXMARK 3.1 {higher score = better}
Luxmark 3.1 hotel.jpg

Another renderer, this one built on the open-source LuxRender engine and run with the help of C++ libraries, tells the same story. The ''ball'' scene, obviously, is there too, but i won't bother showing it as nothing will change.


HANDBRAKE 1.6.1 {lower score = better}
Handbrake 1.6.1 2160p.jpg

In this case i am encoding a custom RAW 8 GB 3840x2160 video to an output file using H.265, 8-bit, 3840x2160, 64 Mbps compression settings. I love Handbrake as it is simple to use and provides incredibly consistent results with no margin of error - if the first run finishes at the 840 second mark, the following runs end at the same time! This means the quad channel RAM config really did do the job 5 seconds faster here, but such a small difference clearly does not mean anything in practice.
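
To put those 5 seconds into perspective, here is the arithmetic as a tiny sketch. The 840 s figure is the example from the paragraph above, and the 835 s figure is an assumption (840 s minus the 5 second difference); the function name is made up for illustration:

```python
def time_saved_percent(baseline_seconds, candidate_seconds):
    """Percent of the baseline time saved by the candidate (lower time = better)."""
    return (baseline_seconds - candidate_seconds) / baseline_seconds * 100


dual_seconds = 840.0  # example run time mentioned in the post
quad_seconds = 835.0  # assumed: 5 seconds faster than the example
saved = time_saved_percent(dual_seconds, quad_seconds)

print(f"quad channel saves {saved:.2f} % of the encode time")
```

With these numbers the saving is roughly 0.6 %, which is why it can be ignored even though it is repeatable.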


Passmark 11 {higher score = better}
Passmark 11 CPU extended instructions.jpg

I am selecting some of Passmark's tests that are not duplicated in other benchmarks. After many runs, the best CPU extended instructions results settled at a little over 2 % apart, a rather negligible difference in favor of quad channel RAM.


Passmark 11 {higher score = better}
Passmark 11 CPU physics.jpg

Now, i am not sure what kind of physics we are talking about here, but clearly this is important, and quad channel RAM users would see an almost twofold benefit.


Passmark 11 {higher score = better}
Passmark 11 CPU prime numbers.jpg

I don't really have any comment on this, but whoever is the ''big time'' mathematician should know what's best for him.


Passmark 11 {higher score = better}
Passmark 11 CPU sorting.jpg

After repeated sorting runs the top scores settled at around 2.5 % in favor of quad channel RAM.


Sandra Lite 2021 {higher score = better}
Sandra Lite 2021 CPU arithmetic.jpg

Another synthetic benchmark doing some CPU arithmetic - pretty consistent results after several repeats and no difference in performance.


Sandra Lite 2021 {higher score = better}
Sandra Lite 2021 CPU financial analysis.jpg

This is a case where floating point operations are not influenced by the increased memory bandwidth, but more to come.


Sandra Lite 2021 {higher score = better}
Sandra Lite 2021 CPU scientific analysis.jpg

And this is the case where floating point operations are heavily influenced by the increased memory bandwidth.


Sandra Lite 2021 {higher score = better}
Sandra Lite 2021 CPU cryptography.jpg

Encrypting/decrypting files is not exactly a productivity workstation task; it is rather one of the most important server features. ''Glad we have other Intel systems that support Xeon CPU and ECC RAM''...


Y-cruncher 0.7.7.9501 {lower score = better}
Y-cruncher 0.7.7.jpg

I wonder how the Ryzen 9 processors would perform here with their dual channel mainstream motherboards? Another reason for HEDT.



We are now moving on to the most important benchmarks in this thread - the SPEC Workstation standard, the proper tool to distinguish the HEDT platform from the mainstream. I am not going to comment under the pictures and will let the results speak for themselves. After all, i do not possess the competency needed to explain what each SPEC benchmark does. I did exclude some of SPEC's tests, as they were redundant with the previous programs. Each test was run 3 times, with the first run always ignored: during the first run data is still being loaded into cache, so the following runs are always closer to each other and faster too - i've noticed this behavior with the majority of SPEC's tests, and that kinda makes sense in a way.
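
The "throw away the first run" approach can be sketched like this. Averaging the remaining runs is an assumption, since the post only says the first run is ignored; the function name and the scores are made up for illustration:

```python
from statistics import mean


def steady_state_score(run_scores, warmup_runs=1):
    """Average the scores after dropping the first `warmup_runs` warm-up runs."""
    if len(run_scores) <= warmup_runs:
        raise ValueError("need at least one run after the warm-up runs")
    return mean(run_scores[warmup_runs:])


# Hypothetical scores for one test: the first (cold) run is noticeably lower.
runs = [2.10, 2.40, 2.44]
score = steady_state_score(runs)

print(f"score without the warm-up run: {score:.2f}")
```

Dropping the cold run keeps a one-off loading cost from being counted as a difference between the two memory layouts.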

SPEC WORKSTATION 3.02 {higher score = better}
SPEC 3.02 CalculiX.jpg


SPEC 3.02 Convolution.jpg


SPEC 3.02 FFTW.jpg


SPEC 3.02 FSI.jpg


SPEC 3.02 Kirchhoff.jpg


SPEC 3.02 lammps.jpg


SPEC 3.02 namd.jpg


SPEC 3.02 octave.jpg


SPEC 3.02 poisson.jpg


SPEC 3.02 python36.jpg


SPEC 3.02 rodiniaCFD.jpg


SPEC 3.02 rodiniaLifeSci.jpg


SPEC 3.02 srmp.jpg


SPEC 3.02 WPCcfd.jpg



The benefit of having quad channel RAM in your computer cannot be denied, particularly looking at the SPEC Workstation results, where all the tests showed improvements over the dual channel RAM layout - from minor to major. But such systems are not for everyone. If you are a casual photo & video editor, or a 3D animator, you don't need to look at these HEDT platforms - they will offer no real advantage. But if you are doing anything that the SPEC standard covers in its benchmark suite, heck, if you are archiving and encrypting/decrypting massive chunks of data, or working with whatever the hell those Pi, Prime and floating point numbers mean - it's a no-brainer to assemble a quad channel HEDT platform.

To be edited...
 
If a CPU supports quad-channel natively, why not take advantage of it? In some ways 2x dual rank is similar to 4x single rank because the total number of ranks is the same. For ranks that share the same traces, the memory controller has to perform fly-by routing, which can / will increase latency and thus be slower.
 
If a CPU supports quad-channel natively, why not take advantage of it?
This is what I say. But of course, the budget is a factor.
 
In some ways 2x dual rank is similar to 4x single rank because the total number of ranks is the same. For ranks that share the same traces, the memory controller has to perform fly-by routing, which can / will increase latency and thus be slower.

Yes, good point that i have missed! For personal interest (seeing as my benchmark has not got even a single fucking thank you) i will simulate that scenario for myself as well.
 
Yes, good point that i have missed! For personal interest (seeing as my benchmark has not got even a single fucking thank you) i will simulate that scenario for myself as well.

Amazing stuff. The lack of thank yous was cause I missed your thread. It must have gotten buried :(

DDR5 partially addresses your concern for high-core-count mainstream CPUs, with its two 32-bit subchannels per stick on top of the significantly improved bandwidth. I had it installed so I ran it quick; this is DDR5-6400 at a tight CL30, although this CPU is ehhh... particularly fast to begin with, so I've no idea how that affects 7-zip in particular:

7zip.png


I used to be an avid HEDT buyer, even with the eventual drawbacks: solid platforms that always performed well... my last was X99 with Haswell-EP4S E5-4669 v3, used it for 4 years and I would have kept it if the IPC hadn't fallen so far behind. Sadly, we won't be seeing the return of HEDT any time soon. It's not a profitable venture for AMD or Intel right now. They'd rather reserve its strengths for the professional market, and keep us from buying them on the cheap down the road (killing the homelab market).
 
Wow, that's some serious testing goin on there, I thank you for the data, which, in summary, seems to indicate that quad channel is way better for some things, a little better for others, and about the same as dual channel for other things....

Benchmarks are great for establishing/confirming specs & performance claims, but probably the main/real question is how this translates (or not) into actual, every day REAL world usage scenarios..

And as already stated, if your machine supports quad channel, why not go for it & just be happy to know that overall, your stuff is runnin a tad faster than everybody else's :D
 
I think the main take-away from these results is that dual channel vs. quad channel RAM performance is highly situational. If someone is going to be doing a task that needs the extra bandwidth, then quad, hexa and octa channel platforms can be very useful. For most people, dual channel will be all they need.
 
(seeing as my benchmark has not got even single fucking thank you)
Okay lol. Interesting way to ask for praise.

Besides this being data for an outdated platform, the principle is the same, but other factors like CPU cache and the application matter too. It will be interesting to see what happens when X3D cache comes to Threadripper.
 
Thanks a lot for the comparison. A very interesting read!
 
Thank you!

-been pining for >dual-channel for years; I even have the parts on hand for a Nehalem-refresh dual-CPU triple-channel memory 'retro workstation'.

I've always 'felt' that more memory bandwidth gives longer legs to a platform's utility; even if 'gaming' performance is unaffected.
I'd say @Artas1984 here, has provided evidence supporting 'my feeling(s)'.
 
I think the only thing I would have done differently is lock the CPU frequency. Otherwise it will vary based on the workload and temps. On these older Intel CPUs, the Turbo isn't infinite.
 
There are also two reasons for going for million-channel memory.

Reason one is basically slow RAM sticks. Say your RAM is obsolete rubbish from 2015 and it can't run faster than 2400 MHz. You will benefit from it being quad channel even more than with the ~4 GHz sticks usual these days.

Reason two is the iGPU. Despite never having seen a quad channel iGPU-fueled system in my life, I truly approve of the whole idea. iGPUs, especially the Vegas in the latest Ryzen APUs, are really suffering from low RAM bandwidth. They could be more competitive had they 4 channel support.

And yeah, props for your efforts, even though this info will probably never be useful for me. Your "colleagues" will benefit from the data for sure.
 
Reason two is the iGPU. Despite never having seen a quad channel iGPU-fueled system in my life, I truly approve of the whole idea. iGPUs, especially the Vegas in the latest Ryzen APUs, are really suffering from low RAM bandwidth. They could be more competitive had they 4 channel support.
100% agree that IGPs would benefit greatly.
I had hoped we'd get to see a ThreadRipper APU; sadly, no.
 
100% agree that IGPs would benefit greatly.
I had hoped we'd get to see a ThreadRipper APU; sadly, no.

DDR5 systems, as mentioned above, would be sort of a quad channel iGPU implementation; AMD just needs to release 7000G series APUs on desktop for that to start spreading. It will perform admirably well, this much I can tell.
 
Didn't see an attempt to run multiple VMs with at least 16 GB of RAM each. Didn't see a lot of tests one would do on a true workstation. If you are saying quad channel is pointless for a gaming system, I agree.
 
Okay lol. Interesting way to ask for praise.
I didn't take it that way. I think he was expecting more people to take interest. I too am surprised by the lack of interest here.

Seriously, think about it, how many times have people argued to death the single rank vs dual rank, 2 stick vs 4 sticks, lower speed vs higher speed RAM specs?
This subject goes right along with those topics. So what the hell?...

-been pining for >dual-channel for years
X79/X99/X299?!? Of course it has been a while..
 
I think the problem is that the OP filled pages and pages with benchmarking tool results (a lot of work!) but without the correct synthesis, and possibly with a wrong conclusion. If the workloads are CPU limited, then quad channel doesn't help. If the memory accesses can mostly be kept within the cache, then quad channel doesn't help. If you have little memory and use the pagefile a lot, quad channel doesn't help. But if you have memory intensive activities and a lot of RAM without using the pagefile, then quad channel can double performance if memory is the bottleneck.
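
The bottleneck argument above can be sketched with a roofline-style model. To be clear, this is a textbook simplification brought in for illustration, not something the OP measured: the peak compute figure is a made-up assumption, the bandwidth figures are the theoretical DDR4-3200 peaks (64 bits per channel), and cache and pagefile effects are left out entirely.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline model: performance is capped by compute or by memory traffic."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


PEAK_GFLOPS = 1000.0  # hypothetical CPU compute ceiling, not a measured value
DUAL_GBS = 51.2       # theoretical peak, 2 channels of DDR4-3200
QUAD_GBS = 102.4      # theoretical peak, 4 channels of DDR4-3200


def quad_speedup(flops_per_byte):
    """Speedup of quad over dual channel for a given arithmetic intensity."""
    dual = attainable_gflops(PEAK_GFLOPS, DUAL_GBS, flops_per_byte)
    quad = attainable_gflops(PEAK_GFLOPS, QUAD_GBS, flops_per_byte)
    return quad / dual


memory_bound = quad_speedup(0.5)    # little math per byte fetched from RAM
compute_bound = quad_speedup(50.0)  # lots of math per byte fetched from RAM

print(f"memory-bound workload:  {memory_bound:.1f}x")
print(f"compute-bound workload: {compute_bound:.1f}x")
```

In this model the memory-bound case gets the full 2x from quad channel, while the compute-bound case gets exactly nothing, which matches the split between the rendering tests and the scientific tests in this thread.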

For video editing and photo work, quad channel can make the machine much snappier. But not if you are CPU constrained by H.265 encoding. If you are using NVENC and holding the video clip in memory rather than on disk, then quad channel will improve performance greatly.
 