
MATLAB MKL Codepath Tweak Boosts AMD Ryzen MKL Performance Significantly

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
37,793 (8.50/day)
Location
Hyderabad, India
Processor AMD Ryzen 7 2700X
Motherboard ASUS ROG Strix B450-E Gaming
Cooling AMD Wraith Prism
Memory 2x 16GB Corsair Vengeance LPX DDR4-3000
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) Samsung U28D590 28-inch 4K UHD
Case Corsair Carbide 100R
Audio Device(s) Creative Sound Blaster Recon3D PCIe
Power Supply Cooler Master MWE Gold 650W
Mouse Razer Abyssus
Keyboard Microsoft Sidewinder X4
Software Windows 10 Pro
MATLAB is a popular math computing environment used by engineering firms, universities, and other research institutes. Some of its operations can be made to leverage Intel MKL (Math Kernel Library), which is poorly optimized for, and notoriously slow on, AMD Ryzen processors. Reddit user Nedflanders1976 devised a way to recover between 20 and 300 percent performance on Ryzen and Ryzen Threadripper processors by forcing MATLAB to use advanced instruction sets such as AVX2. By default, MKL queries your processor's vendor ID string, and if it sees anything other than "GenuineIntel...," it falls back to SSE, putting "AuthenticAMD" Ryzen processors, which have a full IA SSE4, AVX, and AVX2 implementation, at a significant performance disadvantage.
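The dispatch behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not MKL's actual code: the `Cpu` class, the function names, and the code-path labels are all hypothetical. It contrasts a vendor-gated choice (non-Intel CPUs get the SSE path) with a feature-based choice (the best path the CPU reports is used, whoever made it).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cpu:
    """Hypothetical model of a CPU: a vendor string plus reported features."""
    vendor: str
    features: frozenset = field(default_factory=frozenset)


def pick_by_vendor(cpu):
    """Vendor-gated dispatch as the article describes it: non-Intel gets SSE."""
    if cpu.vendor != "GenuineIntel":
        return "sse"
    return "avx2" if "avx2" in cpu.features else "sse"


def pick_by_feature(cpu):
    """Feature-based dispatch: take the best path the CPU says it supports."""
    for path in ("avx2", "avx", "sse"):
        if path in cpu.features:
            return path
    return "generic"


if __name__ == "__main__":
    ryzen = Cpu("AuthenticAMD", frozenset({"sse", "avx", "avx2"}))
    print("vendor-gated :", pick_by_vendor(ryzen))
    print("feature-based:", pick_by_feature(ryzen))
```

With the same feature set, the two functions disagree only when the vendor string is not "GenuineIntel", which is the gap the tweak works around.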

The tweak, meant to be applied manually by AMD Ryzen users, forces MKL to use AVX2 regardless of the CPU vendor ID query result. It is as simple as it is powerful: a 4-line Windows batch file with a set of arguments starts MKL in AVX2 mode. You can also make the tweak "permanent" by creating a system environment variable, which applies to all instances of MATLAB and not just those spawned by the batch file. Nedflanders1976 also posted a benchmark script that highlights the performance impact of AVX2; however, you can use your own scripts and post results.
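As a rough sketch of the per-process variant, the snippet below starts a program with the override set only for that child process, leaving the system environment untouched. The article does not name the variable; `MKL_DEBUG_CPU_TYPE=5` is the setting widely reported for this tweak, so treat it as an assumption here. The helper names are hypothetical, and the path to the MATLAB executable is left to the reader.

```python
import os
import subprocess

# Assumption: the article does not name the variable. MKL_DEBUG_CPU_TYPE=5 is
# the setting widely reported for this tweak.
MKL_OVERRIDE_VAR = "MKL_DEBUG_CPU_TYPE"
MKL_OVERRIDE_VALUE = "5"


def env_with_mkl_override(base_env=None):
    """Return a copy of the environment with the MKL override added."""
    env = dict(os.environ if base_env is None else base_env)
    env[MKL_OVERRIDE_VAR] = MKL_OVERRIDE_VALUE
    return env


def launch_with_override(command):
    """Start `command` (a list of arguments) with the override set for that
    process only, much like the batch file described above."""
    return subprocess.Popen(command, env=env_with_mkl_override())


if __name__ == "__main__":
    # Show the value a child process would see; no program is launched here.
    print(env_with_mkl_override({})[MKL_OVERRIDE_VAR])
```

Calling `launch_with_override([...])` with the real MATLAB path would mirror the batch-file approach; a system-wide environment variable is the "permanent" alternative the article mentions.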



 
Joined
Jan 8, 2017
Messages
4,608 (4.31/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Scythe Katana 4 - 3x 120mm case fans
Memory 16GB - Corsair Vengeance LPX
Video Card(s) OEM Dell GTX 1080
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Zalman R1
Power Supply 500W
Both Intel and MATLAB developers should be punished for this
Good luck with that, the FTC settlement about Intel's compiler tricks is perhaps one of the stupidest things I have ever read. It basically forces Intel to disclose that their compilers are biased only to the developer using them and not to the end user. In other words, it's completely worthless: Intel is still free to do whatever they want and distribute software that intentionally cripples performance on the user's end.
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
19,880 (3.49/day)
Processor Core i7-4790K
Memory 16 GB
Video Card(s) GTX 1080
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 7
Anyone using Matlab here? Would love to get some real-life scenario data for my CPU reviews
 
Joined
Jan 8, 2017
Messages
4,608 (4.31/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Scythe Katana 4 - 3x 120mm case fans
Memory 16GB - Corsair Vengeance LPX
Video Card(s) OEM Dell GTX 1080
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Zalman R1
Power Supply 500W
Anyone using Matlab here? Would love to get some real-life scenario data for my CPU reviews
I do, not extensively though.
 
Joined
Aug 20, 2007
Messages
12,099 (2.69/day)
System Name Pioneer
Processor Intel i9 9900k
Motherboard ASRock Z390 Taichi
Cooling Noctua NH-D15 + A whole lotta Sunon and Corsair Maglev blower fans...
Memory G.SKILL TridentZ Series 32GB (4 x 8GB) DDR4-3200 @ 13-13-13-33-2T
Video Card(s) EVGA GTX 1080 FTW2
Storage Mushkin Pilot-E 2TB NVMe SSD w/ EKWB M.2 Heatsink
Display(s) LG 32GK850G-B 1440p 32" AMVA Panel G-Sync 144hz Display
Case Thermaltake Core X31
Audio Device(s) Onboard TOSLINK to Schiit Modi MB to Schiit Asgard 2 Amp to AKG K7XX Ruby Red Massdrop Headphones
Power Supply EVGA SuperNova T2 850W 80Plus Titanium
Mouse ROCCAT Kone EMP
Keyboard WASD CODE 104-Key w/ Cherry MX Green Keyswitches, Doubleshot Vortex PBT White Transluscent Keycaps
Software Windows 10 x64 Enterprise... yes, it's legit.
Is MATLAB even built with the Intel compiler, or is this just dumb programming?

I suspect ICC was used, but I can't rule out dev stupidity either...
 
Joined
May 31, 2016
Messages
1,347 (1.04/day)
System Name Bro2
Processor Ryzen 2700X
Motherboard MSI X470 Gaming Carbon
Cooling Corsair h115i pro rgb
Memory G.Skill Flare X 3200 CL14
Video Card(s) RX Vega 64 Red Devil
Storage M.2 Samsung Evo 970 250MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD
Case Fractal Design G
Audio Device(s) realtec 5.1
Power Supply Corsair AXi 760W
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
Good luck with that, the FTC settlement about Intel's compiler tricks is perhaps one of the stupidest things I have ever read. It basically forces Intel to disclose that their compilers are biased only to the developer using them and not to the end user. In other words, it's completely worthless: Intel is still free to do whatever they want and distribute software that intentionally cripples performance on the user's end.
Normally I always agree with what you write, but are you sure about this? Sooner or later this subject will be brought to light. In an era when Intel was superior in performance this might have been let go, but now things are changing. It is karma, and sooner or later Intel will have to answer for its scams and tricks.
 
Joined
Jun 28, 2016
Messages
2,875 (2.28/day)
Why do we even get Matlab news here? WTF?

As for MKL - it's used by a lot of computing software. Why? Because it makes stuff run faster on Intel CPUs. Why would it not be used? This is how computing works. Intel has given developers an API to speed up their programs. Why is Intel attacked for this on this forum? It should be praised.

AMD is also allowed to offer an API optimized for Zen. And I'm sure software developers will gladly implement it as AMD CPUs gain popularity.

For a decade there was really no reason to optimize software for AMD.
 
Joined
Mar 8, 2019
Messages
6 (0.02/day)
Location
Italy, the land of madness
System Name The ugly cube
Processor i7 4770k 4.20 GHz
Motherboard Asrock Z97 Xtreme 4
Cooling Thermalright Macho Rev.B
Memory 4 x 4Gb G.Skill RipJawZ 2133 Mhz DDR3 F3-2133C10Q-16GZM
Video Card(s) SAPPHIRE NITRO+ Radeon™ RX 480 8G D5 OC
Storage System: HDD 1Tb seagate Barracuda + Gaming: Crucial MX500 512Gb
Display(s) i don't remember
Case Cooler Master HAF XB EVO
Audio Device(s) Integrated soundcard, DIY gainclone amplifier, speaker Sonus Faber Minima (badly aged, to repair)
Power Supply EVGA SuperNOVA 650 G3
Mouse Logitech G402
Keyboard Ozone Strike Pro
Software Windows 7 Pro 64 Bit,
Benchmark Scores ------ To do D:
@notb it has been established that this is not a case of Intel speeding up its own processors, but of artificially crippling the competition (there was a check for the CPU's vendor ID [Intel, AMD, VIA]). I remember an old article which showed proof of this, but I can't find the link. Does someone have it? It was really interesting to read.

Edit: I found the article! Here it is: https://www.agner.org/optimize/blog/read.php?i=49#49
 
Joined
May 31, 2016
Messages
1,347 (1.04/day)
System Name Bro2
Processor Ryzen 2700X
Motherboard MSI X470 Gaming Carbon
Cooling Corsair h115i pro rgb
Memory G.Skill Flare X 3200 CL14
Video Card(s) RX Vega 64 Red Devil
Storage M.2 Samsung Evo 970 250MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD
Case Fractal Design G
Audio Device(s) realtec 5.1
Power Supply Corsair AXi 760W
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
Why do we even get Matlab news here? WTF?

As for MKL - it's used by a lot of computing software. Why? Because it makes stuff run faster on Intel CPUs. Why would it not be used? This is how computing works. Intel has given developers an API to speed up their programs. Why is Intel attacked for this on this forum? It should be praised.

AMD is also allowed to offer an API optimized for Zen. And I'm sure software developers will gladly implement it as AMD CPUs gain popularity.

For a decade there was really no reason to optimize software for AMD.
Crippling other companies' products is not speeding your own product up, although it makes it look better in comparison.
The article the OP is referring to shows that you can work around the crippling procedure Intel has implemented for AMD processors.
 
Joined
Jul 4, 2018
Messages
169 (0.32/day)
Why do we even get Matlab news here? WTF?

As for MKL - it's used by a lot of computing software. Why? Because it makes stuff run faster on Intel CPUs. Why would it not be used? This is how computing works. Intel has given developers an API to speed up their programs. Why is Intel attacked for this on this forum? It should be praised.

AMD is also allowed to offer an API optimized for Zen. And I'm sure software developers will gladly implement it as AMD CPUs gain popularity.

For a decade there was really no reason to optimize software for AMD.
The fact (or problem) is that, as demonstrated by this article, you don't actually need an AMD-provided API to achieve better performance.
 
Joined
Jan 8, 2017
Messages
4,608 (4.31/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Scythe Katana 4 - 3x 120mm case fans
Memory 16GB - Corsair Vengeance LPX
Video Card(s) OEM Dell GTX 1080
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Zalman R1
Power Supply 500W
Sooner or later this subject will be brought to light.
But that's the thing, it was brought to attention.


"disclose to software developers that Intel computer compilers discriminate between Intel chips and non-Intel chips, and that they may not register all the features of non-Intel chips. Intel also will have to reimburse all software vendors who want to recompile their software using a non-Intel compiler. "

Aka, "carry on".

I wouldn't even call it a slap on the wrist, that'd be too much. This is all too well documented and went on for dozens of years at this point so, yes, I would say nothing is ever going to change this. Technically this was already settled and no one is going to go back to it.
 
Joined
Feb 11, 2009
Messages
2,290 (0.58/day)
System Name Cyberline
Processor Intel Core i7 2600k
Motherboard Asus P8P67 LE Rev 3.0
Cooling Tuniq Tower 120
Memory Corsair (4x2) 8gb 1600mhz
Video Card(s) AMD RX480
Storage Samsung 750 Evo 250gb SSD + WD 1tb x 2 + WD 2tb
Display(s) Philips 32inch LPF5605H (television)
Case antec 600
Audio Device(s) Focusrite 2i4 (USB)
Power Supply Seasonic 620watt 80+ Platinum
Mouse Elecom EX-G
Keyboard Rapoo V700
Software Windows 10 Pro 64bit
95% of the market in your hands, and you still turn to this kind of stuff. For shame, Intel, pathetic.
 
Joined
May 31, 2016
Messages
1,347 (1.04/day)
System Name Bro2
Processor Ryzen 2700X
Motherboard MSI X470 Gaming Carbon
Cooling Corsair h115i pro rgb
Memory G.Skill Flare X 3200 CL14
Video Card(s) RX Vega 64 Red Devil
Storage M.2 Samsung Evo 970 250MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD
Case Fractal Design G
Audio Device(s) realtec 5.1
Power Supply Corsair AXi 760W
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
But that's the thing, it was brought to attention.


"disclose to software developers that Intel computer compilers discriminate between Intel chips and non-Intel chips, and that they may not register all the features of non-Intel chips. Intel also will have to reimburse all software vendors who want to recompile their software using a non-Intel compiler. "

Aka, "carry on".

I wouldn't even call it a slap on the wrist, that'd be too much. This is all too well documented and went on for dozens of years at this point so, yes, I would say nothing is ever going to change this. Technically this was already settled and no one is going to go back to it.
Well I'm a believer so .... :)
 
Joined
Jul 16, 2014
Messages
3,223 (1.63/day)
Location
SE Michigan
System Name Dumbass
Processor AMD-9370BE @4.6
Motherboard ASUS SABERTOOTH 990FX R2.0 +SB950
Cooling CM Nepton 280L
Memory G.Skill Sniper 16gb DDR3 2400
Video Card(s) GreenTeam 1080 Gaming X 8GB
Storage C:\SSD (240GB), D:\Seagate (2TB), E:\Western Digital (1TB)
Display(s) 1x Nixeus NX_EDG27, 2x Dell S2440L (16:9)
Case Phanteks Enthoo Primo w/8 140mm SP Fans
Audio Device(s) onboard (realtek?) SPKRS:Logitech Z623 200w 2.1
Power Supply Corsair HX1000i
Mouse Logitech G700s
Keyboard Logitech G910 Orion Spark
Software windows 10
Benchmark Scores https://i.imgur.com/aoz3vWY.jpg?2
It's taken 10 years or so, and thanks to a new instruction set, developers can crawl out from under Intel's thumb. What this shows is that this workaround, or whatever you want to call it, can now be used to tweak other "Intel Only" software.
 
Joined
Aug 23, 2013
Messages
114 (0.05/day)
Stuff like this is going to be a problem for a long time. AMD doesn't have as many resources to spend on software optimization as Intel has. Intel is using that as a way to slow down AMD optimization, by paying for optimization time for their stuff, so the devs will not have time to work on optimizing for AMD.
 
Joined
May 31, 2016
Messages
1,347 (1.04/day)
System Name Bro2
Processor Ryzen 2700X
Motherboard MSI X470 Gaming Carbon
Cooling Corsair h115i pro rgb
Memory G.Skill Flare X 3200 CL14
Video Card(s) RX Vega 64 Red Devil
Storage M.2 Samsung Evo 970 250MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD
Case Fractal Design G
Audio Device(s) realtec 5.1
Power Supply Corsair AXi 760W
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
Stuff like this is going to be a problem for a long time. AMD doesn't have as many resources to spend on software optimization as Intel has. Intel is using that as a way to slow down AMD optimization, by paying for optimization time for their stuff, so the devs will not have time to work on optimizing for AMD.
You have slightly missed the point. Intel didn't spend money to optimize for Intel's CPUs, but to make competitors' processors use a different code path that cripples their performance. What it means is: if you are on Intel, you go the faster way (which the competition could take as well, but it is kept exclusive); if you are not, you are stuck with code that is slow as molasses.
 
Joined
Jun 4, 2004
Messages
419 (0.07/day)
System Name Octopussy
Processor 1x Intel Core-i7 3930K @ 4.8GHz
Motherboard ASUS Rampage IV Extreme
Cooling Full water cooling, mostly Aqua Computer and EKWB stuff!
Memory 4x 4GB Corsair Dominator GT DDR3-2133 9-11-10-27-1T
Video Card(s) 2x Gigabyte GTX 980Ti G1 Gaming in SLI
Storage 8x Hitachi 5K3000 3TB RAID-6 (Adaptec 5805); 1x 512GB Samsung 950Pro (Angelbird Wings PX1)
Display(s) PLP: Dell 2007FBb 20" TFT + Dell U3011 30" TFT + Dell 2007FBb 20" TFT
Case CaseLabs TH10A
Audio Device(s) ASUS Xonar Phoebus
Power Supply SeaSonic SS-1000XP
Mouse Logitech MX Master
Keyboard SteelSeries Apex M800
Software MS Windows 10 Pro x64
Benchmark Scores A lot.
I'm using a lot of MATLAB at work and we are in the process of migrating most of the newer code over to Python now using numpy. Reading this I wonder if there are similar gains to be had for the MKL version of numpy? Hmmm...

Anyhow, this environment variable seems like something Intel has built into the MKL for debugging purposes. If that is making such a difference, I'm sure Intel will "fix" this in a future release of the MKL! :kookoo:
 
Joined
Jun 29, 2018
Messages
41 (0.08/day)
I'm using a lot of MATLAB at work and we are in the process of migrating most of the newer code over to Python now using numpy. Reading this I wonder if there are similar gains to be had for the MKL version of numpy? Hmmm...
Yes, Anaconda has mkl and numpy/scipy support integrated.
 
Joined
Jun 4, 2004
Messages
419 (0.07/day)
System Name Octopussy
Processor 1x Intel Core-i7 3930K @ 4.8GHz
Motherboard ASUS Rampage IV Extreme
Cooling Full water cooling, mostly Aqua Computer and EKWB stuff!
Memory 4x 4GB Corsair Dominator GT DDR3-2133 9-11-10-27-1T
Video Card(s) 2x Gigabyte GTX 980Ti G1 Gaming in SLI
Storage 8x Hitachi 5K3000 3TB RAID-6 (Adaptec 5805); 1x 512GB Samsung 950Pro (Angelbird Wings PX1)
Display(s) PLP: Dell 2007FBb 20" TFT + Dell U3011 30" TFT + Dell 2007FBb 20" TFT
Case CaseLabs TH10A
Audio Device(s) ASUS Xonar Phoebus
Power Supply SeaSonic SS-1000XP
Mouse Logitech MX Master
Keyboard SteelSeries Apex M800
Software MS Windows 10 Pro x64
Benchmark Scores A lot.
Yes, Anaconda has mkl and numpy/scipy support integrated.
Unfortunately, I don't have an AMD system here to test something on. :ohwell:
 
Joined
Jun 29, 2018
Messages
41 (0.08/day)
Unfortunately, I don't have an AMD system here to test something on. :ohwell:
I meant the difference between standard numpy and MKL-powered one. Depending on the operation type the difference can be huge.
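For anyone who wants to put a number on such a difference, here is a minimal timing sketch using only the Python standard library. The helper names are hypothetical and the workload is left to the reader (for example, the same matrix operation run under two numpy builds, or with and without the tweak). `percent_gain` expresses the result the way the article does, so a run that takes half the time counts as a 100 percent gain.

```python
import time


def best_time(func, repeats=5):
    """Run `func` several times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - start)
    return best


def percent_gain(baseline_seconds, tweaked_seconds):
    """Throughput gain in percent: 100.0 means the tweaked run is twice as fast."""
    return (baseline_seconds / tweaked_seconds - 1.0) * 100.0


if __name__ == "__main__":
    # Placeholder workload; replace it with the operation you care about.
    seconds = best_time(lambda: sum(i * i for i in range(100_000)))
    print(f"best of 5: {seconds:.6f} s")
```

Taking the best of several runs reduces noise from background activity, which matters when the two configurations are measured in separate sessions.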
 
Joined
Apr 30, 2011
Messages
1,463 (0.46/day)
Location
Greece
Processor AMD FX-8350 4GHz@1.3V
Motherboard Gigabyte GA-970A UD3 Rev3.0
Cooling Zalman CNPS9X Optima
Memory 4*4GB DDR3 1600MHz CL9
Video Card(s) Sapphire Radeon RX 5700 Pulse 8GB
Storage Sandisk SSD 120GB, Samsung F1 1TB, Hitachi HUS724040ALE640 4TB
Display(s) LG IPS235
Case Zalman Neo Z9 Black
Audio Device(s) Via 7.1 onboard
Power Supply Be Quiet Pure Power 11 600W
Mouse Sharkoon SHARK Force Black
Keyboard Trust GXT280
Software Win 7 sp1 64bit/Win 10 pro 64bit
Benchmark Scores CB R15 64bit: single core 99p, multicore 647p WPrime 1.55 (8 cores): 9.0 secs
A small and free history lesson for anyone failing or not willing to understand how we ended up here with the CPU market.

For over a decade now, Intel has tried to bribe any dev or OEM reseller to gain market share against AMD. AMD didn't have the money back then to oppose those tactics and went under. They tried with the Bulldozer arch to fight at least on the server front, and they lost soundly because of the efficiency of Intel's Core arch back then. The Zen arch came along in 2017 and closed that gap in efficiency. Zen 2 on 7 nm with the chiplet design made AMD a clear leader in efficiency, and it will continue to be so for at least 2 more years.

As for the MATLAB fiasco, in which a user fixed the devs' code to let Zen CPUs run much better by using the instruction sets they already support: companies work to make money. So AMD should be willing to fight back against those tactics by approaching devs and getting them to treat its products fairly, or by calling them out in public if they refuse to do so. Negative publicity is a bad thing for software devs, although some say there is no such thing.
 
Joined
Jun 4, 2004
Messages
419 (0.07/day)
System Name Octopussy
Processor 1x Intel Core-i7 3930K @ 4.8GHz
Motherboard ASUS Rampage IV Extreme
Cooling Full water cooling, mostly Aqua Computer and EKWB stuff!
Memory 4x 4GB Corsair Dominator GT DDR3-2133 9-11-10-27-1T
Video Card(s) 2x Gigabyte GTX 980Ti G1 Gaming in SLI
Storage 8x Hitachi 5K3000 3TB RAID-6 (Adaptec 5805); 1x 512GB Samsung 950Pro (Angelbird Wings PX1)
Display(s) PLP: Dell 2007FBb 20" TFT + Dell U3011 30" TFT + Dell 2007FBb 20" TFT
Case CaseLabs TH10A
Audio Device(s) ASUS Xonar Phoebus
Power Supply SeaSonic SS-1000XP
Mouse Logitech MX Master
Keyboard SteelSeries Apex M800
Software MS Windows 10 Pro x64
Benchmark Scores A lot.
The problem here as I see it is a bit different: for compiled code, only checking whether a certain type of CPU is installed, rather than checking whether the installed CPU has certain features and choosing the code path based on that, is problematic. The end user cannot (and should not have to) decide which instructions the program in front of him uses to get the job done. Also, the user in this case often does not have the choice to use another set of compiled binaries for his preferred CPU. In the end, a developer who is developing applications for a broader audience should clearly pass on Intel's compiler and use something more appropriate (or deliver different sets of compiled code for different CPUs, like some devs already do). A user, on the other hand, simply doesn't have a choice. He has to use the tools available to him.
Also, I don't blame AMD for this. Developing a highly optimized compiler is really hard work and may cost a ton of money, lots of good developers, and time. Intel has a clear lead here.

I guess we should ask more questions, maybe in the end that's also where Intel's lead in gaming performance comes from after all? :D
 
Joined
Jul 16, 2014
Messages
3,223 (1.63/day)
Location
SE Michigan
System Name Dumbass
Processor AMD-9370BE @4.6
Motherboard ASUS SABERTOOTH 990FX R2.0 +SB950
Cooling CM Nepton 280L
Memory G.Skill Sniper 16gb DDR3 2400
Video Card(s) GreenTeam 1080 Gaming X 8GB
Storage C:\SSD (240GB), D:\Seagate (2TB), E:\Western Digital (1TB)
Display(s) 1x Nixeus NX_EDG27, 2x Dell S2440L (16:9)
Case Phanteks Enthoo Primo w/8 140mm SP Fans
Audio Device(s) onboard (realtek?) SPKRS:Logitech Z623 200w 2.1
Power Supply Corsair HX1000i
Mouse Logitech G700s
Keyboard Logitech G910 Orion Spark
Software windows 10
Benchmark Scores https://i.imgur.com/aoz3vWY.jpg?2
A small and free history lesson for anyone failing or not willing to understand how we ended up here with the CPU market.

For over a decade now, Intel has tried to bribe any dev or OEM reseller to gain market share against AMD. AMD didn't have the money back then to oppose those tactics and went under. They tried with the Bulldozer arch to fight at least on the server front, and they lost soundly because of the efficiency of Intel's Core arch back then. The Zen arch came along in 2017 and closed that gap in efficiency. Zen 2 on 7 nm with the chiplet design made AMD a clear leader in efficiency, and it will continue to be so for at least 2 more years.

As for the MATLAB fiasco, in which a user fixed the devs' code to let Zen CPUs run much better by using the instruction sets they already support: companies work to make money. So AMD should be willing to fight back against those tactics by approaching devs and getting them to treat its products fairly, or by calling them out in public if they refuse to do so. Negative publicity is a bad thing for software devs, although some say there is no such thing.
With this, remember the transition from x32 to x64, and how often applications and games had two different executables to use, which, shockingly, depended on the CPU. Since a script is an easy fix, I don't see the need for a separate executable. I have seen executables tagged separately for Intel and AMD in the past, though it's been so long I can't remember when or what exactly, but I think it was during the XP/Vista days.
 