Tuesday, December 15th 2020

128-Core 2P AMD EPYC "Milan" System Benchmarked in Cinebench R23, Outputs Insane Numbers

AMD is preparing to launch its next generation of EPYC processors, codenamed Milan. Based on the company's latest Zen 3 cores, the new EPYC generation is expected to deliver a massive IPC boost spread across many cores. Models are supposed to range from 16 to 64 cores to cover demanding server workloads. Today, thanks to a leak from ExecutableFix on Twitter, we have the first benchmark of a system containing two 64-core, 128-thread Zen 3-based EPYC Milan processors. Running in the 2P configuration, the processors achieved a maximum boost clock of 3.7 GHz, which is very high for a server CPU with that many cores.

The system was able to produce a Cinebench R23 score of an insane 87,878 points. With that many cores, the result is no surprise; however, we need to look at how it fares against the competition. For comparison, the Intel Xeon Platinum 8280L processor, with its 28 cores and 56 threads that boost to 4.0 GHz, can score up to 49,876 points. Of course, scaling to that many cores may not work very well in this particular application, so we have to wait and see how it performs in other workloads before jumping to any conclusions. The launch date for these processors is unknown, so we will report as more information appears.
Sources: ExecutableFix on Twitter, ServeTheHome (Picture)

25 Comments on 128-Core 2P AMD EPYC "Milan" System Benchmarked in Cinebench R23, Outputs Insane Numbers

#1
FinneousPJ
Looks like the scaling's dropped off quite significantly.
#2
Vayra86
'Dual die madness'

Ha. Ha. Ha. "Oh hey Intel, you still dabbling in the sub-100 core counts?" Tsk
FinneousPJ
Looks like the scaling's dropped off quite significantly.
I could imagine there's a teenie weenie bit of a RAM bottleneck here. Or maybe Cinebench went crazy and just produced the number 87878.
#3
FinneousPJ
Vayra86
'Dual die madness'

Ha. Ha. Ha. "Oh hey Intel, you still dabbling in the sub-100 core counts?" Tsk



I could imagine there's a teenie weenie bit of a RAM bottleneck here. Or maybe Cinebench went crazy and just produced the number 87878.
I wouldn't jump to that conclusion, but it could be a RAM bottleneck sure.
#4
geon2k2
FinneousPJ
Looks like the scaling's dropped off quite significantly.
It isn't very smart to test 2*128 threads with R23.
Why? Because you need each thread to be able to work, or a high number of threads will be idle.

I'm not sure how many tiles there are in the R23 render scene, but I would imagine the number is quite low, as it is not designed for such monsters.
Let's say the number of tiles is 240. Since that system can allocate 256 threads, 16 threads will have nothing to do, which results in a lower score.
On top of this, some tiles are more complex than others. You have probably noticed that the tiles in the center take much longer to render than the tiles on the sides.
This means that the threads working on the sides finish early and then wait for the few threads in the middle, which again results in very poor scalability and very low CPU utilization.

To test such a system reliably, you would probably need to render the scene in 2500 smaller tiles, and then the scaling would be much better.
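A rough way to put numbers on the tile argument is to simulate it. The sketch below is only an illustration under made-up assumptions: the function name `simulate_render`, the 240-tile count (borrowed from the "let's say" in the comment, not from Cinebench itself), and the tile costs are all hypothetical, and the scheduler is a simple greedy "next free thread takes the next tile" model rather than whatever R23 actually does.

```python
import heapq


def simulate_render(tile_costs, n_threads):
    """Greedy dynamic scheduling: the next free thread takes the next tile.

    Returns (makespan, utilization), where utilization is the fraction of
    total thread-time that was spent doing useful work.
    """
    # Min-heap holding the time at which each thread becomes free.
    free_at = [0.0] * n_threads
    heapq.heapify(free_at)
    for cost in tile_costs:
        earliest = heapq.heappop(free_at)
        heapq.heappush(free_at, earliest + cost)
    makespan = max(free_at)
    utilization = sum(tile_costs) / (makespan * n_threads)
    return makespan, utilization


THREADS = 256  # thread count of the 2P system discussed in the thread

# Hypothetical scene: 240 tiles, 16 "center" tiles costing 4x the others.
coarse = [4.0] * 16 + [1.0] * 224
# Same total work, with every tile cut into 10 equal pieces (2400 tiles).
fine = [cost / 10 for cost in coarse for _ in range(10)]

coarse_time, coarse_util = simulate_render(coarse, THREADS)
fine_time, fine_util = simulate_render(fine, THREADS)

print(f"240 tiles:  time {coarse_time:.2f}, utilization {coarse_util:.0%}")
print(f"2400 tiles: time {fine_time:.2f}, utilization {fine_util:.0%}")
```

With the coarse split, every tile gets its own thread, 16 threads never receive work, and the whole run waits on the few expensive tiles. Cutting the same work into smaller tiles lets idle threads pick up the leftovers, which is the effect the comment describes.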
#5
KarymidoN
geon2k2
It isn't very smart to test 2*128 threads with R23.
Why? Because you need each thread to be able to work, or a high number of threads will be idle.

I'm not sure how many tiles there are in the R23 render scene, but I would imagine the number is quite low, as it is not designed for such monsters.
Let's say the number of tiles is 240. Since that system can allocate 256 threads, 16 threads will have nothing to do, which results in a lower score.
On top of this, some tiles are more complex than others. You have probably noticed that the tiles in the center take much longer to render than the tiles on the sides.
This means that the threads working on the sides finish early and then wait for the few threads in the middle, which again results in very poor scalability and very low CPU utilization.

To test such a system reliably, you would probably need to render the scene in 2500 smaller tiles, and then the scaling would be much better.
R23 is really a better tool for that.
#6
geon2k2
KarymidoN
R23 is really a better tool for that.
This kind of tool is great indeed; however, you need a version that scales to such high core counts, otherwise you will get low CPU utilization and pretty useless results.
#7
cueman
ok,right.

7nm cpu and 128- core

let see after year how many core intel make its own 7nm cpu.

amds is so easy to be Elvis with it..erhh. or TSMC's 7nm core...still 14nm rocket lake beat easily all 7nm amd ryzens for gaming.


but....wait.. let's wait for the 10nm hybrid CPU, Alder Lake, June 2021.
#8
londiste
The system was able to produce a Cinebench R23 score of an insane 87,878 points. With that many cores, the result is no surprise; however, we need to look at how it fares against the competition. For comparison, the Intel Xeon Platinum 8280L processor, with its 28 cores and 56 threads that boost to 4.0 GHz, can score up to 49,876 points.
What kind of comparison is that? A 43% lower score with 100 fewer cores (the AMD system has almost 5x as many)? :D
Besides, I bet that score is for 2x28 core Intel.
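The arithmetic behind that objection is easy to check. The snippet below is a minimal sketch using only the two scores and core counts quoted in the article. The helper name `points_per_core` is made up for illustration, and the 2x28-core reading of the Xeon score is the commenter's guess, not something the article confirms.

```python
def points_per_core(score, cores):
    """Multi-core score divided by the number of physical cores."""
    return score / cores


EPYC_SCORE, EPYC_CORES = 87878, 2 * 64  # 2P Milan system from the article
XEON_SCORE = 49876                      # Xeon Platinum 8280L figure from the article

epyc = points_per_core(EPYC_SCORE, EPYC_CORES)
xeon_1p = points_per_core(XEON_SCORE, 28)      # if the score is for a single CPU
xeon_2p = points_per_core(XEON_SCORE, 2 * 28)  # if it is for a 2P system (a guess)

# How much lower the Xeon score is, as a fraction of the EPYC score.
deficit = 1 - XEON_SCORE / EPYC_SCORE

print(f"EPYC 2P:          {epyc:7.1f} points per core")
print(f"Xeon (1 x 28):    {xeon_1p:7.1f} points per core")
print(f"Xeon (2 x 28):    {xeon_2p:7.1f} points per core")
print(f"Xeon score is {deficit:.0%} lower with {EPYC_CORES / 28:.1f}x fewer cores (1P reading)")
```

Under either reading, the Xeon figure works out to more points per core than the EPYC result, which fits the suspicion that R23 is not scaling well at this core count.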
#9
roberto888
londiste
What kind of comparison is that? A 43% lower score with 100 fewer cores (the AMD system has almost 5x as many)? :D
Besides, I bet that score is for 2x28 core Intel.
For comparison, the Threadripper 3990X does 1262 SC / 75671 MC.
#11
londiste
roberto888
For comparison, the Threadripper 3990X does 1262 SC / 75671 MC.
Looks like these are way beyond the point of normal scaling for R23 :)
#13
zlobby
cueman
ok,right.

7nm cpu and 128- core

let see after year how many core intel make its own 7nm cpu.

amds is so easy to be Elvis with it..erhh. or TSMC's 7nm core...still 14nm rocket lake beat easily all 7nm amd ryzens for gaming.


but....wait.. let's wait for the 10nm hybrid CPU, Alder Lake, June 2021.
I don't remember anyone forbidding intel to make their own 7nm or to use someone else's.
#14
TumbleGeorge
Punkenjoy
Well, that is in line with Amdahl's law.

en.wikipedia.org/wiki/Amdahl%27s_law

Multi-threading can't scale linearly forever. At 256 threads, you start to be in the very-low-gain region.
We are too close to realize 6th generation of computers. Maybe neural training will run into Amdahl's law?
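For reference, Amdahl's law gives the speedup on n threads as 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel. The sketch below simply evaluates that formula. The value p = 0.99 is a hypothetical chosen for illustration and is not a measured property of Cinebench R23.

```python
def amdahl_speedup(parallel_fraction, n_threads):
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_threads)


P = 0.99  # hypothetical parallel fraction, for illustration only

for n in (16, 64, 128, 256):
    print(f"{n:4d} threads: {amdahl_speedup(P, n):6.1f}x speedup")

# Upper bound no matter how many threads are added: 1 / (1 - p).
print(f"limit: {1.0 / (1.0 - P):.0f}x")
```

Under that assumption, doubling from 128 to 256 threads yields well under double the speedup, and no thread count can push past the 1 / (1 - p) ceiling.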
#15
wiak
With that many fast cores, maybe Cinebench finishes before all the cores get to work?
It did that in some of LTT's EPYC/Threadripper videos.
#16
Gruffalo.Soldier
I'm the only one
Intel's 50k score with a shit ton fewer cores is not bad at all, so AMD's score with 100 more cores is not that impressive in comparison. But AMD is king and Intel a turd on this forum now anyway.
#17
KarymidoN
geon2k2
This kind of tool is great indeed; however, you need a version that scales to such high core counts, otherwise you will get low CPU utilization and pretty useless results.

Even the Windows version can make a big difference.
#18
deu
tigger
Intel's 50k score with a shit ton fewer cores is not bad at all, so AMD's score with 100 more cores is not that impressive in comparison. But AMD is king and Intel a turd on this forum now anyway.
If you don't see performance as a metric worth being impressed by, then no, nothing is impressive...
#19
KarymidoN
tigger
Intel's 50k score with a shit ton fewer cores is not bad at all, so AMD's score with 100 more cores is not that impressive in comparison. But AMD is king and Intel a turd on this forum now anyway.
You have to understand the power consumption aspect. The Intel system with fewer cores was pulling more power from the wall than the AMD system with more cores and more performance. That matters a lot when you have a lot of servers running 24/7, with cooling and datacenter maintenance to pay for.
#20
Gruffalo.Soldier
I'm the only one
KarymidoN
You have to understand the power consumption aspect. The Intel system with fewer cores was pulling more power from the wall than the AMD system with more cores and more performance. That matters a lot when you have a lot of servers running 24/7, with cooling and datacenter maintenance to pay for.
Oh, I would have thought that when you're running 1000+ machines it doesn't really matter. But hey ho, AMD rules on TPU now. Where did it mention power use?
#21
KarymidoN
tigger
Oh, I would have thought that when you're running 1000+ machines it doesn't really matter. But hey ho, AMD rules on TPU now. Where did it mention power use?
What??? You're joking, right? If a server pulls 500 W from the wall and you have 100 servers, that's 50,000 watts of power draw.
Now imagine you could switch to EPYC servers and cut those 500 W to 450 W. Just 50 W per server would mean 5,000 watts less; that's real money in your pocket.
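The same back-of-the-envelope calculation, written out as a small sketch. The 100 servers and the 500 W and 450 W figures are the commenter's hypothetical numbers, not measurements of any real system; the electricity price `PRICE_PER_KWH` and the helper name `fleet_power_watts` are additional assumptions made here for illustration.

```python
HOURS_PER_YEAR = 24 * 365
PRICE_PER_KWH = 0.10  # hypothetical electricity price, in currency units per kWh


def fleet_power_watts(servers, watts_per_server):
    """Total wall power drawn by a fleet of identical servers."""
    return servers * watts_per_server


before = fleet_power_watts(100, 500)  # commenter's example: 100 servers at 500 W
after = fleet_power_watts(100, 450)   # the same fleet at 450 W per server

saved_watts = before - after
saved_kwh_per_year = saved_watts * HOURS_PER_YEAR / 1000
saved_cost_per_year = saved_kwh_per_year * PRICE_PER_KWH
extra_servers = saved_watts // 500    # 500 W servers that fit in the saved budget

print(f"Power saved:      {saved_watts} W")
print(f"Energy per year:  {saved_kwh_per_year:.0f} kWh")
print(f"Cost per year:    {saved_cost_per_year:.2f} (at the assumed price)")
print(f"Extra 500 W servers within the old budget: {extra_servers}")
```

Cooling overhead is left out of this sketch, so the real saving for a datacenter running 24/7 would likely be somewhat larger than the raw wall-power figure.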
#22
Gruffalo.Soldier
I'm the only one
KarymidoN
What??? You're joking, right? If a server pulls 500 W from the wall and you have 100 servers, that's 50,000 watts of power draw.
Now imagine you could switch to EPYC servers and cut those 500 W to 450 W. Just 50 W per server would mean 5,000 watts less; that's real money in your pocket.
So they're using 45k watts instead of 50k. Celebration time.
#24
KarymidoN
tigger
So they're using 45k watts instead of 50k. Celebration time.
The point is, they could fit another 10 servers pulling 500 W each just with the power they're saving.
#25
R0H1T
tigger
So they're using 45k watts instead of 50k. Celebration time.
The difference is actually much bigger in real-life scenarios, but you also need good software, heavily optimized in some cases, to make proper use of such high core counts. There are obviously ways of getting around that, like running multiple instances of the application (mostly impractical if you're working on the same set of data), but that's not something convenient on servers.