
Linux Performance of AMD Rome vs Intel Cascade Lake, 1 Year On

Raevenlord

News Editor
Staff member
Joined
Aug 12, 2016
Messages
3,676 (1.85/day)
Location
Portugal
System Name The Ryzening
Processor AMD Ryzen 9 5900X
Motherboard MSI X570 MAG TOMAHAWK
Cooling Lian Li Galahad 360mm AIO
Memory 32 GB G.Skill Trident Z F4-3733 (4x 8 GB)
Video Card(s) Gigabyte RTX 3070 Ti
Storage Boot: Transcend MTE220S 2TB, Kingston A2000 1TB, Seagate Firewolf Pro 14 TB
Display(s) Acer Nitro VG270UP (1440p 144 Hz IPS)
Case Lian Li O11DX Dynamic White
Audio Device(s) iFi Audio Zen DAC
Power Supply Seasonic Focus+ 750 W
Mouse Cooler Master Masterkeys Lite L
Keyboard Cooler Master Masterkeys Lite L
Software Windows 10 x64
Michael Larabel over at Phoronix posted an extremely comprehensive analysis of the performance differential between AMD's Rome-based EPYC and Intel's Cascade Lake Xeons one year after release. The battery of tests, comprising more than 116 benchmark results, pits a Xeon Platinum 8280 2P system against an EPYC 7742 2P one. Both systems were benchmarked under the Ubuntu 19.04 release, which was chosen as the "one year ago" baseline, and again under the newer Linux software stack (Ubuntu 20.10 daily + GCC 10 + Linux 5.8).

The benchmark conclusions are interesting. For one, Intel gained more ground than AMD over the course of the year, with the Xeon platform gaining 6% performance across releases, while AMD's EPYC gained just 4% over the same period of time. Even so, AMD's system is still an average of 14% faster than the Intel platform across all tests, which speaks to AMD's silicon superiority. Check some benchmark results below, but follow the source link for the full rundown.
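As a rough back-of-the-envelope check on those figures, the sketch below estimates what AMD's lead would have been on the older software stack. It assumes the quoted percentages are geometric-mean ratios that compose multiplicatively, which the article does not spell out, and the function name is made up for illustration.

```python
def lead_a_year_ago(lead_now, gain_leader, gain_trailer):
    """Return the leader's relative advantage before both platforms sped up.

    All arguments are fractions (0.14 means 14%). Assumes the gains compose
    multiplicatively, which holds for geometric means but is an assumption here.
    """
    ratio_now = 1.0 + lead_now
    # Undo each platform's year-over-year gain to recover the old ratio.
    ratio_then = ratio_now * (1.0 + gain_trailer) / (1.0 + gain_leader)
    return ratio_then - 1.0


# Rounded figures from the article: AMD +4%, Intel +6%, AMD ahead by 14% now.
old_lead = lead_a_year_ago(lead_now=0.14, gain_leader=0.04, gain_trailer=0.06)
print(f"Estimated AMD lead on the older stack: {old_lead:.1%}")
```

With the rounded numbers above this comes out at roughly 16%, so under these assumptions Intel clawed back about two points over the year.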



 
Joined
Mar 21, 2016
Messages
1,289 (0.61/day)
Hasn't Intel traditionally had better compiler support on the software side, or at least more widely used compilers? This seems to be what I'd expect, though there is only so much extra leeway they'll be able to gain from a compiler advantage alone.
 
Joined
Jun 19, 2010
Messages
330 (0.08/day)
Location
North-Rhine-Westphalia
Processor Ryzen 2700 (0.819 VSoC)
Motherboard B450
Cooling Thermalright ARO
Memory 2x 8GB DDR4-3000 CL16 XMP
Video Card(s) GTX 1650 boost:1859 vram:2500
Storage 1TB PCIe4.0 NVMe Samsung PM9A1
Display(s) 4K UHD 40" HDR TV
Case Sharkoon AM5 Window red
Audio Device(s) Headset
Power Supply beQuiet PurePower10 400W
Software Win10
Comparing Intel to AMD over the years: in a scenario where AMD is pitted against an Intel chip that gets good code while AMD gets inferior code, yeah, the gap is bigger.
Intel's marketing and Intel-friendly software/hardware companies try to fool people who don't keep their glasses as clean as they should.
Phoronix's Michael Larabel does a great job every time, keeping his benchmarks as realistic as possible.
 
Joined
Mar 18, 2008
Messages
5,712 (1.13/day)
System Name Virtual Reality / Bioinformatics
Processor Undead CPU
Motherboard Undead TUF X99
Cooling Noctua NH-D15
Memory GSkill 128GB DDR4-3000
Video Card(s) EVGA RTX 3090 FTW3 Ultra
Storage Samsung 960 Pro 1TB + 860 EVO 2TB + WD Black 5TB
Display(s) 32'' 4K Dell
Case Fractal Design R5
Audio Device(s) BOSE 2.0
Power Supply Seasonic 850watt
Mouse Logitech Master MX
Keyboard Corsair K70 Cherry MX Blue
VR HMD HTC Vive + Oculus Quest 2
Software Windows 10 P
For programs that can leverage AVX-512, Intel chips still reign supreme.
 
Joined
Feb 21, 2006
Messages
1,327 (0.23/day)
Location
Toronto, Ontario
System Name The Expanse
Processor AMD Ryzen 7 5800X
Motherboard Asus Prime X570-Pro BIOS 4021 AM4 AGESA V2 PI 1.2.0.3 Patch C
Cooling Corsair H150i Pro
Memory 32GB Gskill Trident RGB DDR4-3200 14-14-14-34-1T
Video Card(s) AMD Radeon RX 6800 XT 16GB
Storage Corsair MP600 1TB PCIe 4 / Samsung 860Evo 1TB x2 Raid 0 / Asus NAS AS1004T V2 14TB
Display(s) LG 34GP83A-B 34 Inch 21: 9 UltraGear Curved QHD (3440 x 1440) 1ms Nano IPS 160Hz
Case Fractal Design Meshify S2
Audio Device(s) Creative X-Fi + Logitech Z-5500
Power Supply Corsair AX850 Titanium
Mouse Corsair Dark Core RGB
Keyboard Logitech G810
Software Windows 10 Pro x64 21H2
Benchmark Scores 3800X https://valid.x86.fr/1zr4a5 5800X https://valid.x86.fr/2dey9c
For programs that can leverage AVX-512, Intel chips still reign supreme.

like the 10 pieces of software that actually use AVX512 sure :)
 
Joined
Oct 2, 2015
Messages
2,820 (1.23/day)
Location
Argentina
System Name Ciel / Yukino
Processor AMD Ryzen R5 5600X / Intel Core i3 5005U
Motherboard Asus Tuf Gaming B550 Plus / HP 240 G5
Cooling ID-Cooling 224-XT Basic
Memory 2x 8GB Geil Orion AMD Edition 3600MHz@3800MHz
Video Card(s) Dell 1660 SUPER + Sentey RX 550 2GB
Storage SSD ADATA FALCON 512GB PCIe3.0 + HDD WD 4TB
Display(s) Samsung S22F350
Case Cougar MX410 Mesh-G
Audio Device(s) Realtek ALC S1200A
Power Supply Aerocool KCAS-500W
Mouse Logitech G203
Keyboard VSG Alnilam
Software Windows 10 x64 / Manjaro x64
I can't wait for AMD to finally standardize AVX-512, it seems Intel needs a decade to do so.
 
Joined
Mar 23, 2016
Messages
4,830 (2.27/day)
Processor Ryzen 7 3800X
Motherboard MSI B450 Tomahawk ATX
Cooling Cooler Master Hyper 212 Black Edition
Memory PNY Anarchy-X XLR8 Red DDR4-3200 16GB kit & PNY Anarchy-X XLR8 Red DDR4-2666 16GB kit
Video Card(s) MSI GeForce RTX 2060 GAMING Z 6G
Storage Samsung 970 EVO NVMe M.2 500 GB, Samsung 870 QVO 1 TB
Display(s) Samsung SyncMaster C27H711
Case Phantek Eclipse P400S (PH-EC416PS)
Audio Device(s) EVGA NU Audio
Power Supply EVGA 850 BQ
Mouse SteelSeries Rival 310
Keyboard Logitech G G413 Silver
Software Windows 10 Professional 64-bit v21H1
I can't wait for AMD to finally standardize AVX-512, it seems Intel needs a decade to do so.
Yeah, Intel is all over the place with their product segmentation strategy across all the CPU categories.
 

tabascosauz

Moderator
Supporter
Staff member
Joined
Jun 24, 2015
Messages
3,696 (1.54/day)
Location
Western Canada
Processor R9 5900X
Motherboard ROG X570 Impact (3601)
Cooling Optimus Foundation AM4 - Bitspower Brizo VGA FE - 280mm XT45 v.2 - DDC4.2
Memory 2x16GB 3800CL15 1.42V
Video Card(s) RTX 2060 Super FE (2050/1975 1.00V)
Case Cerberus X
Keyboard RAMA U80-A SEQ2 / Noxary X60 R
I can't wait for AMD to finally standardize AVX-512, it seems Intel needs a decade to do so.

They won't, because AVX-512 exists for Intel, who wants to push its products in specific areas like AI. Instead of actually standardizing the entire instruction family, they just pull out single instructions under the AVX-512 banner whenever the marketing team needs it, e.g. VNNI when Intel needs to market itself to deep learning.

Take a look at the horrendously fragmented list of products supporting scattered bits and pieces of AVX-512 and you'll see why it's not even remotely worth AMD's time right now.
 
Joined
Aug 20, 2007
Messages
16,821 (3.19/day)
System Name Pioneer
Processor Ryzen R9 5950X
Motherboard GIGABYTE X570 Aorus Elite
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory G.SKILL TridentZ 32GB (4 x 8GB) @ DDR4-3600 (Samsung B-Die)
Video Card(s) EVGA GeForce RTX 3070 FTW3 (non-LHR)
Storage 2x Mushkin Pilot-E 2TB PCIe 3.0 NVMe SSDs
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) VGA HDMI->LG B9 OLED/Schiit Modi MB/Asgard 2 DAC/Amp to AKG Pro K712 Headphones
Power Supply EVGA SuperNova T2 Titanium 850W
Mouse Razer Deathadder v2
Keyboard WASD CODE v3 w/ Cherry Green Keyswitches & aftermarket PBT keycaps
Software Windows 11 Enterprise (yes, it's legit)
Hasn't Intel traditionally had better compiler support on the software side, or at least more widely used compilers? This seems to be what I'd expect, though there is only so much extra leeway they'll be able to gain from a compiler advantage alone.

Michael Larabel deals in Linux. There is no compiler advantage there. Heck, I don't even think anyone there USES ICC... I know you can't even build the kernel with it, for starters.
 
Joined
Oct 2, 2015
Messages
2,820 (1.23/day)
Location
Argentina
System Name Ciel / Yukino
Processor AMD Ryzen R5 5600X / Intel Core i3 5005U
Motherboard Asus Tuf Gaming B550 Plus / HP 240 G5
Cooling ID-Cooling 224-XT Basic
Memory 2x 8GB Geil Orion AMD Edition 3600MHz@3800MHz
Video Card(s) Dell 1660 SUPER + Sentey RX 550 2GB
Storage SSD ADATA FALCON 512GB PCIe3.0 + HDD WD 4TB
Display(s) Samsung S22F350
Case Cougar MX410 Mesh-G
Audio Device(s) Realtek ALC S1200A
Power Supply Aerocool KCAS-500W
Mouse Logitech G203
Keyboard VSG Alnilam
Software Windows 10 x64 / Manjaro x64
They won't, because AVX-512 exists for Intel, who wants to push its products in specific areas like AI. Instead of actually standardizing the entire instruction family, they just pull out single instructions under the AVX-512 banner whenever the marketing team needs it, e.g. VNNI when Intel needs to market itself to deep learning.

Take a look at the horrendously fragmented list of products supporting scattered bits and pieces of AVX-512 and you'll see why it's not even remotely worth AMD's time right now.
Sales numbers can and will change that. They are still in 14nm hell, and it seems like they will be for at least another 6 months. Stupid decisions like these are costing them their credibility.
 
Joined
Mar 23, 2016
Messages
4,830 (2.27/day)
Processor Ryzen 7 3800X
Motherboard MSI B450 Tomahawk ATX
Cooling Cooler Master Hyper 212 Black Edition
Memory PNY Anarchy-X XLR8 Red DDR4-3200 16GB kit & PNY Anarchy-X XLR8 Red DDR4-2666 16GB kit
Video Card(s) MSI GeForce RTX 2060 GAMING Z 6G
Storage Samsung 970 EVO NVMe M.2 500 GB, Samsung 870 QVO 1 TB
Display(s) Samsung SyncMaster C27H711
Case Phantek Eclipse P400S (PH-EC416PS)
Audio Device(s) EVGA NU Audio
Power Supply EVGA 850 BQ
Mouse SteelSeries Rival 310
Keyboard Logitech G G413 Silver
Software Windows 10 Professional 64-bit v21H1
Have any of the Celerons picked up HyperThreading, or are they still limited to dual cores without HT? With Comet Lake they should be two cores, four threads.
 
Joined
Jan 2, 2018
Messages
188 (0.13/day)
The C++ application I am programming takes 35 seconds to compile using 8 threads on a 9900K (stock).
And it takes 18 seconds to compile using 8 threads on a 3700X (stock).

Now that's what I call a productive CPU.

And it takes 25 minutes to compile using 2 threads on an Allwinner A20 ARM CPU lol
 
Joined
Feb 3, 2017
Messages
3,273 (1.81/day)
Processor R5 5600X
Motherboard ASUS ROG STRIX B550-I GAMING
Cooling Alpenföhn Black Ridge
Memory 2*16GB DDR4-2666 VLP @3800
Video Card(s) EVGA Geforce RTX 3080 XC3
Storage 1TB Samsung 970 Pro, 2TB Intel 660p
Display(s) ASUS PG279Q, Eizo EV2736W
Case Dan Cases A4-SFX
Power Supply Corsair SF600
Mouse Corsair Ironclaw Wireless RGB
Keyboard Corsair K60
VR HMD HTC Vive
For these over 100 tests run, the AMD EPYC 7742 2P on the latest Linux software packages yielded 14% better performance over Intel's top-end non-AP Xeon Platinum 8280 dual socket server.
What is kind of weird is that the difference in the geomean of test results is only 14%. I guess there are just too many tests that do not rely on many threads. The systems are 128 vs 56 cores, after all.
 
Joined
Jan 6, 2013
Messages
306 (0.09/day)
So, 128 cores AMD vs 56 cores Intel and AMD wins by 14%????
LE: Now I see it. The tests are a mix of ST and lightly/heavily MT scenarios. In any case, with well-multithreaded software you'll see a bigger difference, but I guess given these are the current workloads in the server space, Intel is not that far off.
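To make that concrete, here is a small sketch of how a geometric mean over a mixed test suite can land far below the core-count ratio. The speed ratios are hypothetical and chosen only for illustration; they are not taken from the Phoronix results.

```python
import math


def geomean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical AMD/Intel speed ratios, NOT real benchmark data:
# many lightly threaded tests close to parity, a few that scale with cores.
ratios = [0.95] * 6 + [1.05] * 6 + [2.0] * 2

overall = geomean(ratios)
print(f"Geomean advantage across {len(ratios)} tests: {overall - 1:.0%}")
```

Because the geomean weights every test equally, a couple of tests with a 2x win barely move the overall figure when most of the suite is near parity.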
 
Joined
Jun 10, 2014
Messages
2,553 (0.92/day)
Processor AMD Ryzen 9 5900X | Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR | Asus P9X79 WS
Cooling Noctua NH-U14S | Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz | Corsair 8 x 8 GB 1600 MHz
Video Card(s) MSI GTX 1060 3GB | MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB | Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" | Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
Hasn't Intel traditionally had better compiler support on the software side, or at least more widely used compilers? This seems to be what I'd expect, though there is only so much extra leeway they'll be able to gain from a compiler advantage alone.
This is only a myth.
Pretty much all software today is compiled with either GCC, LLVM or MSVC, none of which is biased.
Of all the compiler optimizations that GCC and LLVM offer, most are generic. There are a few exceptions, like if you target Zen 2 vs. Skylake, but those are minimal and the majority of optimizations are all the same.

We can't optimize for the underlying microarchitectures, as the CPUs share a common ISA. The CPUs from Intel and AMD also behave very similarly, so in order to optimize significantly for either one, we would need some significant ISA differences. As of right now, Skylake and Zen 2 are very comparable in ISA features (while Skylake-X and Ice Lake have some new features like AVX-512 and a few other instructions). So when the ISA and general behavior are the same, the possibility of targeted optimizations to favor one of them is pretty much non-existent. So whenever you hear people claim games are "Skylake optimized" etc., that's 100% BS, they have no idea what they're talking about.

They won't, because AVX-512 exists for Intel, who wants to push its products in specific areas like AI. Instead of actually standardizing the entire instruction family, they just pull out single instructions under the AVX-512 banner whenever the marketing team needs it, e.g. VNNI when Intel needs to market itself to deep learning.
You are clearly way off base here.
The core functionality of AVX-512 is known as AVX-512F, the others are optional extensions.
The various "AI" features are marketed as AVX-512 because they use the AVX-512 vector units, unlike other single instructions running through the integer units.
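For anyone who wants to see which pieces their own CPU exposes, here is a minimal sketch that picks the AVX-512 feature flags out of `/proc/cpuinfo`-style text on Linux. The flag names are the ones the Linux kernel reports; the function name, the short list of subsets, and the sample line are only for illustration.

```python
# A few AVX-512 flags as reported by Linux; the list is not exhaustive.
AVX512_FLAGS = {
    "avx512f": "Foundation (the core subset)",
    "avx512cd": "Conflict Detection",
    "avx512bw": "Byte and Word",
    "avx512dq": "Doubleword and Quadword",
    "avx512vl": "Vector Length extensions",
    "avx512_vnni": "Vector Neural Network Instructions",
}


def avx512_subsets(cpuinfo_text):
    """Return the known AVX-512 flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            return sorted(name for name in AVX512_FLAGS if name in flags)
    return []


# Made-up sample line; on a real Linux box pass open("/proc/cpuinfo").read().
sample = "flags\t\t: fpu sse2 avx avx2 avx512f avx512dq avx512cd avx512bw avx512vl avx512_vnni"
found = avx512_subsets(sample)
for name in found:
    print(f"{name}: {AVX512_FLAGS[name]}")
```

A chip without AVX-512 simply yields an empty list, and one with only part of the family shows up as a shorter list that still contains `avx512f`.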

As an additional note:
I'm not a fan of application-specific instructions. Those never get widespread use and quickly become obsolete, and software relying on them is no longer forward-compatible.
 
Joined
Jul 19, 2017
Messages
70 (0.04/day)
This is only a myth.
...
So whenever you hear people claim games are "Skylake optimized" etc., that's 100% BS, they have no idea what they're talking about.
Isn't that all too common? People (myself included) carry around plenty of old and irrelevant information, mainly because it's almost impossible to keep up to date with it all.
 
Joined
Feb 19, 2009
Messages
1,129 (0.24/day)
Location
I live in Norway
System Name 3 sys spec separated by "|"
Processor R9 3900x| R7 1700 @3.75 | 4800H
Motherboard Asrock X570M | AB350M Pro 4 | Asus Tuf A15
Cooling Air | Air | duh laptop
Memory 64gb G.skill SniperX @3600 CL16 | 64GB | 32GB
Video Card(s) XFX RX 6800 Speedster |V64\Quadro P4000 | RTX2060M
Storage MP510 2TB, 660P 2TB, 2x860 evo 1tb | 960 500gb Intel 660P 1tb PM871 4x256gb ++| 1TB 660+ 1tb A1000
Display(s) AOC 28" 4K something + 1440p AOC 144hz something.
Case Phanteks EvolvX M-Atx
Power Supply Corsair RM850
Mouse g502 Lightspeed
Keyboard G915
Software win10,unraid,Manjaro
Benchmark Scores 30000FS, 16300 TS. Lappy, 7000 TS.
So, 128 cores AMD vs 56 cores Intel and AMD wins by 14%????
LE: Now I see it. The tests are a mix of ST and lightly/heavily MT scenarios. In any case, with well-multithreaded software you'll see a bigger difference, but I guess given these are the current workloads in the server space, Intel is not that far off.

In datacenter loads, MT mostly rules, because you don't run just one application on a server.
You run virtualized workloads, Docker, and so on.
 
Joined
Oct 31, 2013
Messages
167 (0.06/day)
For modern supercomputers and AI, don't you just use a GPU for highly parallelized stuff like AVX-512?

AMD had the HSA stuff, but it never got adopted with the APUs.
 
Joined
Jun 10, 2014
Messages
2,553 (0.92/day)
Processor AMD Ryzen 9 5900X | Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR | Asus P9X79 WS
Cooling Noctua NH-U14S | Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz | Corsair 8 x 8 GB 1600 MHz
Video Card(s) MSI GTX 1060 3GB | MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB | Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" | Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
For modern supercomputers and AI, don't you just use a GPU for highly parallelized stuff like AVX-512?

AMD had the HSA stuff, but it never got adopted with the APUs.
You raise a very valid question which many in here might be wondering, and there is an explanation.
AVX, multithreading and GPU acceleration are all different types of parallelism, but they work on different scopes.
  • AVX works mixed in with other instructions and has a negligible overhead cost. AVX is primarily parallelization on the data level, not the logic level, which means repeated logic can be eliminated. One AVX operation costs the same as a single FP operation, so with AVX-512 you can process 16 32-bit floats at the same cost as a single float. And the only "cost" is the normal transfer between CPU registers. So this is parallelization on the finest level, typically a few lines of code or the inside of a loop.
  • Multithreading is on a coarser level than AVX. When using multiple threads, there are much higher synchronization costs, ranging from sending simple signals to sending larger pieces of data. Data hazards can also very quickly lead to stalls and inefficiency, so the proper way to scale with threads is to divide the workload into independent work chunks, one given to each worker thread. Multiple threads also have to deal with the OS scheduler, which can cause latencies of several ms. Work chunks for threads generally range from ms to seconds, while AVX works in the nanosecond range.
  • GPU acceleration has even larger synchronization costs than multithreading, but the GPU also has more computational power, so if the balance is right, GPU acceleration makes sense. The GPU is very good at computational density, while current GPUs still rely on the CPU to control the workflow on a higher level.
It's worth mentioning that many productive applications use two or all three types of parallelization, as they complement each other.
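The "independent work chunks" idea for threads can be sketched roughly as below. This is only an illustration of the structure: the function names are made up, and in CPython the GIL means pure-Python arithmetic like this will not actually run faster with more threads; a compiled language would also vectorize the inner loop with SIMD.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, n_chunks):
    """Split data into n_chunks contiguous, non-overlapping slices."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    # Fine-grained, per-element work: the part SIMD would handle
    # (AVX-512 registers hold 512 / 32 = 16 single-precision floats).
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    chunks = split_into_chunks(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker owns one chunk, so no data is shared while it runs.
        partials = list(pool.map(sum_of_squares, chunks))
    # The only synchronization point: combining one partial result per worker.
    return sum(partials)


total = parallel_sum_of_squares(list(range(1000)))
print(total)
```

The point of the layout is that workers never touch each other's data, so the synchronization cost is paid once per chunk rather than once per element.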

But when it comes to "AI" for supercomputers, this will soon be accelerated by ASICs. I see no reason why general purpose CPUs should include such features.
 
Joined
Feb 3, 2017
Messages
3,273 (1.81/day)
Processor R5 5600X
Motherboard ASUS ROG STRIX B550-I GAMING
Cooling Alpenföhn Black Ridge
Memory 2*16GB DDR4-2666 VLP @3800
Video Card(s) EVGA Geforce RTX 3080 XC3
Storage 1TB Samsung 970 Pro, 2TB Intel 660p
Display(s) ASUS PG279Q, Eizo EV2736W
Case Dan Cases A4-SFX
Power Supply Corsair SF600
Mouse Corsair Ironclaw Wireless RGB
Keyboard Corsair K60
VR HMD HTC Vive
Joined
Jun 10, 2014
Messages
2,553 (0.92/day)
Processor AMD Ryzen 9 5900X | Intel Core i7-3930K
Motherboard ASUS ProArt B550-CREATOR | Asus P9X79 WS
Cooling Noctua NH-U14S | Be Quiet Pure Rock
Memory Crucial 2 x 16 GB 3200 MHz | Corsair 8 x 8 GB 1600 MHz
Video Card(s) MSI GTX 1060 3GB | MSI GTX 680 4GB
Storage Samsung 970 PRO 512 GB + 1 TB | Intel 545s 512 GB + 256 GB
Display(s) Asus ROG Swift PG278QR 27" | Eizo EV2416W 24"
Case Fractal Design Define 7 XL x 2
Audio Device(s) Cambridge Audio DacMagic Plus
Power Supply Seasonic Focus PX-850 x 2
Mouse Razer Abyssus
Keyboard CM Storm QuickFire XT
Software Ubuntu
I know of one person that doesn't like avx-512 very much

We already have a discussion about that, you are welcome to join it here: https://www.techpowerup.com/forums/...mmick-to-invent-and-win-at-benchmarks.269770/

- He dislikes FP in general. This may or may not be a reasonable stance.
FP is used a lot, in video, rendering, photo editing, games etc.
And AVX can do integers too, which is why I often refer to them as vector units, since they can do both integers and floats. Integer AVX is used heavily in things like file compression.
 
Joined
Jan 8, 2017
Messages
7,076 (3.85/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Deepcool Gammaxx L240 V2
Memory 16GB - Corsair Vengeance LPX - 3333 Mhz CL16
Video Card(s) OEM Dell GTX 1080 with Kraken G12 + Water 3.0 Performer C
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Deepcool Matrexx 70
Power Supply GPS-750C
I can't wait for AMD to finally standardize AVX-512, it seems Intel needs a decade to do so.

I hope not, very wide SIMD is a fallacy in modern computer architecture design. SIMD was introduced in the days when other massively parallel compute hardware didn't exist and everyone thought frequency and transistor counts would just scale forever with ever lower power consumption. That didn't hold up; the tension created by simultaneously trying to make a CPU with the fastest possible single-core performance while adding more and more cores and wider SIMD is too great. GPUs make CPU SIMD redundant. I can't think of a single application that couldn't be scaled up from x86 AVX to CUDA/OpenCL; in fact, the latter are way more robust anyway.
 