Monday, November 18th 2019

MATLAB MKL Codepath Tweak Boosts AMD Ryzen MKL Performance Significantly

MATLAB is a popular math computing environment in use by engineering firms, universities, and other research institutes. Some of its operations can be made to leverage Intel MKL (Math Kernel Library), which is poorly optimized for, and notoriously slow on, AMD Ryzen processors. Reddit user Nedflanders1976 devised a way to restore anywhere from 20 to 300 percent performance on Ryzen and Ryzen Threadripper processors by forcing MATLAB to use advanced instruction sets such as AVX2. By default, MKL queries your processor's vendor ID string, and if it sees anything other than "GenuineIntel...," it falls back to SSE, putting "AuthenticAMD" Ryzen processors, which have a full IA SSE4, AVX, and AVX2 implementation, at a significant performance disadvantage.
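To make the difference concrete, here is a minimal Python sketch contrasting vendor-gated dispatch with feature-based dispatch. It is illustrative only: the function names and the feature set are hypothetical, and this is not MKL's actual internal logic, just the behavior described above.

```python
# Illustrative only: hypothetical names, not MKL's real dispatch code.

def pick_codepath_by_features(features):
    """Feature-based dispatch: use the widest instruction set the CPU reports."""
    for isa in ("avx2", "avx", "sse4"):
        if isa in features:
            return isa
    return "sse"


def pick_codepath_by_vendor(vendor, features):
    """Vendor-gated dispatch: any non-Intel vendor string falls back to SSE."""
    if vendor != "GenuineIntel":
        return "sse"
    return pick_codepath_by_features(features)


# Assumed feature set for a Ryzen CPU, for the sake of the example.
ryzen_features = {"sse", "sse4", "avx", "avx2"}

print(pick_codepath_by_vendor("AuthenticAMD", ryzen_features))  # sse
print(pick_codepath_by_features(ryzen_features))                # avx2
```

The same CPU ends up on a different code path depending only on which check is used, which is the gap the tweak works around.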

The tweak, meant to be applied manually by AMD Ryzen users, forces MKL to use AVX2 regardless of the CPU vendor ID query result. It is as simple as it is powerful: a 4-line Windows batch file with a set of arguments starts MATLAB with MKL in AVX2 mode. You can also make the tweak "permanent" by creating a system environment variable, which will apply to all instances of MATLAB and not just those spawned by the batch file. Nedflanders1976 also posted a benchmark script that highlights the performance impact of AVX2; however, you can use your own scripts and post results.
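The batch file itself is not reproduced in this article. As a rough Python equivalent, the sketch below starts a program with the override set for that process only. It assumes the override is the undocumented MKL_DEBUG_CPU_TYPE=5 setting described in the Reddit post; the function names and the example command are illustrative, and newer MKL releases may not honor the variable.

```python
import os
import subprocess
import sys


def env_with_mkl_override(base_env=None):
    """Return a copy of the environment with the MKL override added."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: MKL_DEBUG_CPU_TYPE=5 is the undocumented setting reported
    # to make MKL take its AVX2 code path regardless of CPU vendor.
    env["MKL_DEBUG_CPU_TYPE"] = "5"
    return env


def launch(command):
    """Start `command` (a list of arguments) with the override applied.

    Only the child process sees the variable; the parent environment
    and other running programs are left untouched.
    """
    return subprocess.Popen(command, env=env_with_mkl_override())


if __name__ == "__main__":
    # Example (assumed command name): python launch_avx2.py matlab
    if len(sys.argv) > 1:
        launch(sys.argv[1:]).wait()
```

Setting the variable per process mirrors the batch-file approach; the system-wide environment variable is the "permanent" alternative and affects every MKL-based program started afterwards.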
Source: Nedflanders1976 (Reddit)

67 Comments on MATLAB MKL Codepath Tweak Boosts AMD Ryzen MKL Performance Significantly

#1
ncrs
This is nothing new. Intel has been playing dirty for a long time.

MKL can be used to accelerate a lot of Python stuff, and it is the default math library in Microsoft R Open. Basically, academic users benefit a lot from it. This "fix" also works on all of those.
#2
Xaled
A tweak? Actually, this is a workaround for Intel's cheats. Both Intel and MATLAB developers should be punished for this.
#3
Vya Domus
Xaled
Both Intel and MATLAB developers should be punished for this.
Good luck with that. The FTC settlement about Intel's compiler tricks is perhaps one of the stupidest things I have ever read. It basically forces Intel to disclose that its compilers are biased only to the developers using them, and not to the end user. In other words, it's completely worthless: Intel is still free to do whatever it wants and distribute software that intentionally cripples performance on the user's end.
#4
W1zzard
Anyone using Matlab here? Would love to get some real-life scenario data for my CPU reviews
#5
Vya Domus
W1zzard
Anyone using Matlab here? Would love to get some real-life scenario data for my CPU reviews
I do, not extensively though.
#6
R-T-B
Is matlab even built with the Intel compiler, or is this just dumb programming?

I suspect ICC was used, but I can't rule out dev stupidity either...
#7
ratirt
Vya Domus
Good luck with that. The FTC settlement about Intel's compiler tricks is perhaps one of the stupidest things I have ever read. It basically forces Intel to disclose that its compilers are biased only to the developers using them, and not to the end user. In other words, it's completely worthless: Intel is still free to do whatever it wants and distribute software that intentionally cripples performance on the user's end.
Normally, I always agree with what you write, but are you sure about this? Sooner or later this subject will be brought to light. In an era when Intel was superior in performance, that might have been let go, but now things are changing. It is karma, and sooner or later Intel will have to answer for its scams and tricks.
#8
notb
Why do we even get Matlab news here? WTF?

As for MKL - it's used by a lot of computing software. Why? Because it makes stuff run faster on Intel CPUs. Why would it not be used? This is how computing works. Intel has given developers an API to speed up their programs. Why is Intel attacked for this on this forum? It should be praised.

AMD is also allowed to offer an API optimized for Zen. And I'm sure software developers will gladly implement it as AMD CPUs gain popularity.

For a decade there was really no reason to optimize software for AMD.
#9
C1ff0
@notb It has been established that this is not a case of Intel speeding up its own processors, but of artificially crippling the competition (there was a check for the CPU's vendor ID [Intel, AMD, VIA]). I remember an old article in which proof of this was shown, but I can't find the link. Does someone have it? It was really interesting to read.

Edit: I found the article! Here it is: https://www.agner.org/optimize/blog/read.php?i=49#49
#10
ratirt
notb
Why do we even get Matlab news here? WTF?

As for MKL - it's used by a lot of computing software. Why? Because it makes stuff run faster on Intel CPUs. Why would it not be used? This is how computing works. Intel has given developers an API to speed up their programs. Why is Intel attacked for this on this forum? It should be praised.

AMD is also allowed to offer an API optimized for Zen. And I'm sure software developers will gladly implement it as AMD CPUs gain popularity.

For a decade there was really no reason to optimize software for AMD.
Crippling other companies' products is not speeding your own product up, although it makes it look better in comparison.
The article the OP is referring to proves that you can work around the crippling Intel has implemented against AMD processors.
#11
PanicLake
notb
Why do we even get Matlab news here? WTF?

As for MKL - it's used by a lot of computing software. Why? Because it makes stuff run faster on Intel CPUs. Why would it not be used? This is how computing works. Intel has given developers an API to speed up their programs. Why is Intel attacked for this on this forum? It should be praised.

AMD is also allowed to offer an API optimized for Zen. And I'm sure software developers will gladly implement it as AMD CPUs gain popularity.

For a decade there was really no reason to optimize software for AMD.
The fact (or problem) is that, as demonstrated by this article, you don't actually need an AMD-provided API to achieve better performance.
#12
Vya Domus
ratirt
Sooner or later this subject will be brought to light.
But that's the thing, it was brought to attention.

https://www.ftc.gov/news-events/press-releases/2010/08/ftc-settles-charges-anticompetitive-conduct-against-intel

"disclose to software developers that Intel computer compilers discriminate between Intel chips and non-Intel chips, and that they may not register all the features of non-Intel chips. Intel also will have to reimburse all software vendors who want to recompile their software using a non-Intel compiler. "

Aka, "carry on".

I wouldn't even call it a slap on the wrist; that'd be too much. This is all too well documented and has gone on for dozens of years at this point, so yes, I would say nothing is ever going to change this. Technically this was already settled and no one is going to go back to it.
#13
ZoneDymo
95% of the market in your hands, and you still turn to this kind of stuff. For shame, Intel. Pathetic.
#14
ratirt
Vya Domus
But that's the thing, it was brought to attention.

https://www.ftc.gov/news-events/press-releases/2010/08/ftc-settles-charges-anticompetitive-conduct-against-intel

"disclose to software developers that Intel computer compilers discriminate between Intel chips and non-Intel chips, and that they may not register all the features of non-Intel chips. Intel also will have to reimburse all software vendors who want to recompile their software using a non-Intel compiler. "

Aka, "carry on".

I wouldn't even call it a slap on the wrist; that'd be too much. This is all too well documented and has gone on for dozens of years at this point, so yes, I would say nothing is ever going to change this. Technically this was already settled and no one is going to go back to it.
Well I'm a believer so .... :)
#15
DeathtoGnomes
It's taken 10 years or so, and thanks to a new instruction set, developers can crawl out from under Intel's thumb. What this shows is that this workaround, or whatever you wanna call it, can now be used to tweak other "Intel Only" software.
#16
Mysteoa
Stuff like this is going to be a problem for a long time. AMD doesn't have as many resources to spend on software optimization as Intel has. Intel is using that as a way to slow down AMD optimization, by paying for optimization time for their stuff, so the devs will not have time to work on optimizing for AMD.
#17
ratirt
Mysteoa
Stuff like this is going to be a problem for a long time. AMD doesn't have as many resources to spend on software optimization as Intel has. Intel is using that as a way to slow down AMD optimization, by paying for optimization time for their stuff, so the devs will not have time to work on optimizing for AMD.
You have slightly missed the point. Intel didn't spend money to optimize for Intel's CPUs, but to make competing companies' processors use a different code path that cripples their performance. What it means is: if you are Intel, you go the faster way (which the competition could take as well, but it is kept exclusive); if you are not, you are stuck with code that is slow as molasses.
#18
Breit
I'm using a lot of MATLAB at work and we are in the process of migrating most of the newer code over to Python now using numpy. Reading this, I wonder if there are similar gains to be had for the MKL version of numpy? Hmmm...

Anyhow, this environment variable seems like something Intel has built into the MKL for debugging purposes. If that is making such a difference, I'm sure Intel will "fix" this in a future release of the MKL! :kookoo:
#19
ncrs
Breit
I'm using a lot of MATLAB at work and we are in the process of migrating most of the newer code over to Python now using numpy. Reading this, I wonder if there are similar gains to be had for the MKL version of numpy? Hmmm...
Yes, Anaconda has mkl and numpy/scipy support integrated.
#20
Breit
ncrs
Yes, Anaconda has mkl and numpy/scipy support integrated.
Unfortunately, I don't have an AMD system here to test something on. :ohwell:
#21
ncrs
Breit
Unfortunately, I don't have an AMD system here to test something on. :ohwell:
I meant the difference between standard numpy and the MKL-powered one. Depending on the operation type, the difference can be huge.
#22
HD64G
A small and free history lesson for anyone failing or not willing to understand how we ended up here with the CPU market.

For over a decade now, Intel has tried to bribe any dev or OEM reseller to gain market share against AMD. AMD didn't have the money back then to oppose those tactics and went under. They tried with the Bulldozer arch to fight at least on the server front, and they lost soundly because of the efficiency of Intel's Core arch back then. The Zen arch came along in 2017 and erased that gap in efficiency. Zen 2 on 7 nm with the chiplet design made AMD a clear leader in efficiency, and it will continue to be so for at least 2 more years.

As for the MATLAB fiasco, where a user fixed the devs' code to let Zen CPUs run much better by using the instruction sets built into them: companies work to make money. So AMD should be willing to fight back against those tactics by approaching devs and getting them to treat its products fairly, or by calling them out publicly if they refuse. Negative publicity is a bad thing for software devs, although some say there is no such thing.
#23
Breit
The problem here, as I see it, is a bit different: for compiled code, checking only whether a certain type of CPU is installed, rather than whether the installed CPU has certain features, and choosing the code path based on that is problematic. The end user cannot decide (and should not have to) which instructions the program in front of him uses to get the job done. Also, the user in this case often does not have the choice to use another set of compiled binaries for his preferred CPU. In the end, a developer who is building applications for a broader audience should clearly pass on Intel's compiler and use something more appropriate (or deliver different sets of compiled code for different CPUs, like some devs already do). A user, on the other hand, simply doesn't have a choice. He has to use the tools available to him.
Also, I don't blame AMD for this: developing a highly optimized compiler is really hard work and may cost a ton of money, lots of good developers, and time. Intel has a clear lead here.

I guess we should ask more questions, maybe in the end that's also where Intel's lead in gaming performance comes from after all? :D
#24
DeathtoGnomes
HD64G
A small and free history lesson for anyone failing or not willing to understand how we ended up here with the CPU market.

For over a decade now, Intel has tried to bribe any dev or OEM reseller to gain market share against AMD. AMD didn't have the money back then to oppose those tactics and went under. They tried with the Bulldozer arch to fight at least on the server front, and they lost soundly because of the efficiency of Intel's Core arch back then. The Zen arch came along in 2017 and erased that gap in efficiency. Zen 2 on 7 nm with the chiplet design made AMD a clear leader in efficiency, and it will continue to be so for at least 2 more years.

As for the MATLAB fiasco, where a user fixed the devs' code to let Zen CPUs run much better by using the instruction sets built into them: companies work to make money. So AMD should be willing to fight back against those tactics by approaching devs and getting them to treat its products fairly, or by calling them out publicly if they refuse. Negative publicity is a bad thing for software devs, although some say there is no such thing.
With this, remember the transition from x32 to x64, and how often applications and games had two different executables to use, which, shockingly, depended on the CPU. Since a script is an easy fix, I don't see the need for a separate executable. I have seen executables tagged separately for Intel and AMD in the past, though it's been so long I can't remember when or what exactly, but I think it was during the XP/Vista days.
#25
Breit
DeathtoGnomes
With this, remember the transition from x32 to x64, and how often applications and games had two different executables to use, which, shockingly, depended on the CPU. Since a script is an easy fix, I don't see the need for a separate executable. I have seen executables tagged separately for Intel and AMD in the past, though it's been so long I can't remember when or what exactly, but I think it was during the XP/Vista days.
...as long as Intel doesn't get more sophisticated with what its own debug switches do in the MKL in this case. :D