Post your Tiny Memory Benchmark results

acraft (New Member, joined Oct 22, 2020, 5 messages)

Hi,

I recently bought an AMD Ryzen 9 3950X system with 32 GB of RAM and found that some of my applications run 2x slower on it than on an Intel i7-8750H (6 cores, 32 GB). The 3950X's benchmark results are consistent with what I see on the website; for example, the AIDA64 cache and memory results match. I tried to figure out why the 3950X performs 2x slower and found that some memory operations are much slower than on Intel.

I used Tiny Memory Benchmark (tinymembench) to figure out what is going on. The results on the 6-core i7-8750H with 32 GB RAM are much higher than on the 3950X. I sometimes wonder why the AMD 3950X is so much cheaper than Intel, and it looks like they cut the throughput of some instructions to reduce the cost.

Here is how you can compile and run Tiny Memory Benchmark under Linux (it also works with WSL2 under Windows):

$ sudo apt update
$ sudo apt install clang make git
$ mkdir tmb
$ cd tmb
$ git clone https://github.com/ssvb/tinymembench .
$ CC=clang CFLAGS="-no-integrated-as" make
$ ./tinymembench


AMD Ryzen 9 3950x - 32 GB Ram

AMD 3950x Results.png


Here are the Intel i7-8750H - 32 GB RAM results:

Intel i7-8750H Results.png


Can you please run it and post your results here?

AMD Ryzen 9 3950X System Spec:


AMD-ROG-1.png

AMD-ROG-2.png

AMD-ROG-3.png

AMD-ROG-4.png

AMD-ROG-5.png



INTEL i7-8750H System Spec:


Intel-ROG-1.jpg

Intel-ROG-2.jpg

Intel-ROG-3.jpg

Intel-ROG-4.jpg

Intel-ROG-5.jpg


Thanks!
 
If you provide the memory settings (speed and timings) for both systems, I will consider it. In general, no post concerning performance will be answered unless you fill in your system specification.
 
It does seem that AMD's 3000 series wasn't the best at handling memory timings etc., so the results aren't much of a shock. Intel's CPU clock speed pushes them into first place for that...

You'll need to give a little more information about the cooling, RAM, and motherboards you use, because there are so many variables that the test would be pointless unless we know what your timings etc. were. Could you post the CPU-Z CPU, memory, and motherboard tabs along with the results for better clarity?
 
3600XT + 32 GB @ 3753 MHz CL16 ... Reads will be full speed, but all write operations should be half of a 3900X because of the cut-off CCX.
tinymem.JPG
 
This is to be expected: each CCX can read 32 bytes/clock but write only 16 bytes/clock. In total, the two CCXs can achieve the same write performance as an Intel CPU, just not from a single thread, because then the instructions are issued from just one CCX and are therefore limited to 16 bytes/clock. I suspect AIDA64 is multi-threaded and this benchmark isn't; you need multiple threads (scheduled on different CCXs) to get the full throughput. It wasn't really a cost-saving measure, it's just how the I/O was configured.
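
As a rough illustration of the bytes/clock figures above, the ceiling for one CCX is just bytes per clock times the fabric clock. The sketch below assumes a 1600 MHz fabric clock, which is a hypothetical value and not something posted in this thread, and `peak_gbps` is a helper name made up for the example.

```shell
#!/bin/sh
# Peak bandwidth in GB/s = bytes per clock * clock in MHz / 1000.
# peak_gbps is a hypothetical helper written for this sketch.
peak_gbps() {
    awk -v b="$1" -v mhz="$2" 'BEGIN { printf "%.1f\n", b * mhz / 1000 }'
}

# Assumed fabric clock; check your own BIOS or monitoring tool.
FCLK_MHZ=1600

echo "read  ceiling per CCX: $(peak_gbps 32 "$FCLK_MHZ") GB/s"
echo "write ceiling per CCX: $(peak_gbps 16 "$FCLK_MHZ") GB/s"
```

With that assumed clock the write ceiling comes out at half the read ceiling, which is the 2:1 ratio described above. Real measured numbers will be lower than these theoretical peaks.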
 
Try this!
Capturetrythid.PNG
Capture5900x.PNG
 

Attachment: CaptureGOTYALL.PNG (915.3 KB)

The application I used (not Tiny Memory Benchmark) is multi-threaded. I tried it on an Intel 6-core system and an Intel 32-core system, and it scales as expected on both without any issues. The application is 100% scalable: no disk operations, no waiting on synchronization, etc. Even so, the i7-8750H runs it 2x faster than the AMD 3950X.

Here is the perf stat result, which could be helpful:
Screen Shot 2020-10-22 at 4.23.20 PM.png


Just look at how bad the IPC is on AMD.
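
For anyone who wants to compare the same number on their own machine: IPC is instructions retired divided by CPU cycles, both of which `perf stat` counts. A minimal sketch follows; `ipc` is a helper name made up here, `./your_app` is a placeholder, and the counts passed in are invented for illustration, not the values from the screenshot.

```shell
#!/bin/sh
# Counters can be collected with, for example:
#   perf stat -e cycles,instructions -- ./your_app
# (./your_app is a placeholder for the program under test)

# IPC = instructions / cycles. ipc is a hypothetical helper.
ipc() {
    awk -v insn="$1" -v cyc="$2" 'BEGIN { printf "%.2f\n", insn / cyc }'
}

# Made-up counts, only to show the calculation.
echo "IPC: $(ipc 1500000000 1000000000)"
```

Comparing this value between two CPUs only makes sense when both ran the same binary on the same input.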
 
You should try to compare your memory using the same timings (ideally the same sticks). The Intel system is running tighter timings, which could explain at least part of the difference...
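
To put timings at different speeds on a common scale, CAS latency can be converted to nanoseconds as CL * 2000 / data rate. The sketch below is only an illustration: `cas_ns` is a made-up helper, the 3753 figure quoted earlier in the thread is assumed to be the data rate in MT/s, and the CL18 @ 3600 kit is invented for contrast. The actual timings of the two systems are in the screenshots and are not reproduced here.

```shell
#!/bin/sh
# First-word CAS latency in ns = CL * 2000 / data rate in MT/s.
# cas_ns is a hypothetical helper used only for this sketch.
cas_ns() {
    awk -v cl="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", cl * 2000 / rate }'
}

# CL16 at 3753, as quoted earlier in the thread (assumed to be MT/s).
echo "CL16 @ 3753: $(cas_ns 16 3753) ns"
# CL18 at 3600 is an invented second kit, only there for contrast.
echo "CL18 @ 3600: $(cas_ns 18 3600) ns"
```

CAS latency is only one of several timings, so a lower number here does not by itself settle which configuration is faster.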
 
The IPC isn't bad on 3000 series Ryzen at all. It seems more likely that this particular benchmark uses an instruction that Ryzen doesn't have, and that's what makes all the difference here.
 