Wednesday, May 11th 2022

NVIDIA Releases Open-Source GPU Kernel Modules

NVIDIA is now publishing Linux GPU kernel modules as open source under a dual GPL/MIT license, starting with the R515 driver release. You can find the source code for these kernel modules in the NVIDIA Open GPU Kernel Modules repo on GitHub. This release is a significant step toward improving the experience of using NVIDIA GPUs on Linux: it allows tighter integration with the OS and lets developers debug, integrate, and contribute back. For Linux distribution providers, the open-source modules increase ease of use.

They also improve the out-of-the-box user experience to sign and distribute the NVIDIA GPU driver. Canonical and SUSE are able to immediately package the open kernel modules with Ubuntu and SUSE Linux Enterprise Distributions. Developers can trace into code paths and see how kernel event scheduling is interacting with their workload for faster root cause debugging. In addition, enterprise software developers can now integrate the driver seamlessly into the customized Linux kernel configured for their project.
This will further help improve NVIDIA GPU driver quality and security with input and reviews from the Linux end-user community. With each new driver release, NVIDIA publishes a snapshot of the source code on GitHub. Community-submitted patches are reviewed and, if approved, integrated into a future driver release.

Supported functionality

The first release of the open GPU kernel modules is R515. Along with the source code, fully built and packaged versions of the drivers are provided.

For data center GPUs in the NVIDIA Turing and NVIDIA Ampere architecture families, this code is production ready. This was made possible by the phased rollout of the GSP (GPU System Processor) driver architecture over the past year, designed to make the transition easy for NVIDIA customers. We focused on testing across a wide variety of workloads to ensure feature and performance parity with the proprietary kernel-mode driver.

In the future, functionality such as HMM (Heterogeneous Memory Management) will be a foundational component for confidential computing on the NVIDIA Hopper architecture.

In this open-source release, support for GeForce and Workstation GPUs is alpha quality. GeForce and Workstation users can use this driver on Turing and NVIDIA Ampere architecture GPUs to run Linux desktops and use features such as multiple displays, G-SYNC, and NVIDIA RTX ray tracing in Vulkan and NVIDIA OptiX. Users can opt in using the kernel module parameter NVreg_EnableUnsupportedGpus as highlighted in the documentation. More robust and fully featured GeForce and Workstation support will follow in subsequent releases and the NVIDIA Open Kernel Modules will eventually supplant the closed-source driver.
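As a minimal sketch of the opt-in described above, assuming the parameter is passed to the nvidia kernel module through a modprobe.d-style options line: the helper below only renders that line as text and changes nothing on the system. The function name is hypothetical, and the parameter name is copied from this article as written; the README shipped with your driver release is the authority on its exact spelling.

```python
def modprobe_option_line(module: str, parameter: str, value: int) -> str:
    """Render a single line in modprobe.d 'options' syntax (hypothetical helper)."""
    if not module or not parameter:
        raise ValueError("module and parameter must be non-empty")
    return f"options {module} {parameter}={value}"


# Parameter name as spelled in the article; verify it against the driver README.
OPT_IN_PARAMETER = "NVreg_EnableUnsupportedGpus"

if __name__ == "__main__":
    # Print the line instead of writing it, so nothing on the system is modified.
    print(modprobe_option_line("nvidia", OPT_IN_PARAMETER, 1))
```

Where such a line would be stored (for example a file under /etc/modprobe.d/) depends on the distribution and is an assumption here, not something the article specifies.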

Customers with Turing and Ampere GPUs can choose which modules to install. Pre-Turing customers will continue to run the closed-source modules.

The open-source kernel-mode driver works with the same firmware and the same user-mode stacks such as CUDA, OpenGL, and Vulkan. However, all components of the driver stack must match versions within a release. For instance, you cannot take a release of the source code, build, and run it with the user-mode stack from a previous or future release.
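The version-matching rule above can be illustrated with a small check. This is a sketch under stated assumptions: the component names and version strings are made up for illustration, and the function does not query a real system; it simply compares whatever version strings it is given against the kernel module's.

```python
from typing import Dict, List


def mismatched_components(versions: Dict[str, str],
                          reference: str = "kernel_module") -> List[str]:
    """Return the names of components whose version differs from the reference one."""
    expected = versions[reference]
    return sorted(name for name, version in versions.items() if version != expected)


if __name__ == "__main__":
    # Illustrative version strings only; they are not read from a real system.
    installed = {
        "kernel_module": "515.43.04",
        "cuda_user_mode": "515.43.04",
        "opengl_user_mode": "510.60.02",
    }
    print(mismatched_components(installed))
```

An empty result means every listed component reports the same release as the kernel module; any name in the list would need to be reinstalled from the matching release.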

Refer to the driver README document for instructions on installing the right versions and additional troubleshooting steps.

Installation opt in

The R515 release contains precompiled versions of both the closed-source driver and the open-source kernel modules. These versions are mutually exclusive, and the user can make the choice at install time. The default option ensures that silent installs will pick the optimal path for NVIDIA Volta and older GPUs versus Turing+ GPUs.

Users can build kernel modules from the source code and install them with the relevant user-mode drivers.

Partner ecosystem

NVIDIA has been working with Canonical, Red Hat, and SUSE for better packaging, deployment, and support models for our mutual customers.

Canonical

"The new NVIDIA open-source GPU kernel modules will simplify installs and increase security for Ubuntu users, whether they're AI/ML developers, gamers, or cloud users," commented Cindy Goldberg, VP of Silicon alliances at Canonical. "As the makers of Ubuntu, the most popular Linux-based operating system for developers, we can now provide even better support to developers working at the cutting edge of AI and ML by enabling even closer integration with NVIDIA GPUs on Ubuntu."

In the coming months, the NVIDIA Open GPU kernel modules will make their way into the recently launched Canonical Ubuntu 22.04 LTS.

SUSE

"We at SUSE are excited that NVIDIA is releasing their GPU kernel-mode driver as open source. This is a true milestone for the open-source community and accelerated computing. SUSE is proud to be the first major Linux distribution to deliver this breakthrough with SUSE Linux Enterprise 15 SP4 in June. Together, NVIDIA and SUSE power your GPU-accelerated computing needs across cloud, data center, and edge with a secure software supply chain and excellence in support." — Markus Noga, General Manager, Business Critical Linux at SUSE

Red Hat

"Enterprise open source can spur innovation and improve customers' experience, something that Red Hat has always championed. We applaud NVIDIA's decision to open source its GPU kernel driver. Red Hat has collaborated with NVIDIA for many years, and we are excited to see them take this next step. We look forward to bringing these capabilities to our customers and to improve interoperability with NVIDIA hardware." — Mike McGrath, Vice President, Linux Engineering at Red Hat

Upstream approach

NVIDIA GPU drivers have been designed over the years to share code across operating systems, GPUs, and Jetson SoCs so that we can provide a consistent experience across all our supported platforms. The current codebase does not conform to the Linux kernel design conventions and is not a candidate for Linux upstream.

There are plans to work on an upstream approach with the Linux kernel community and partners such as Canonical, Red Hat, and SUSE.

In the meantime, published source code serves as a reference to help improve the Nouveau driver. Nouveau can leverage the same firmware used by the NVIDIA driver, exposing many GPU functionalities, such as clock management and thermal management, bringing new features to the in-tree Nouveau driver.

Frequently asked questions

Where can I download the R515 driver?

You can download the R515 development driver as part of CUDA Toolkit 11.7, or from the driver downloads page under "Beta" drivers. The R515 data center driver will follow in subsequent releases per our usual cadence.

Can open GPU Kernel Modules be distributed?

Yes, the NVIDIA open kernel modules are licensed under a dual GPL/MIT license, and the terms of those licenses govern distribution and repackaging.

Will the source for user-mode drivers such as CUDA be published?

These changes apply to the kernel modules only; the user-mode components are untouched. The user-mode drivers will remain closed source and will be published as pre-built binaries in the driver and the CUDA Toolkit.

Which GPUs are supported by Open GPU Kernel Modules?

Open kernel modules support all Ampere and Turing GPUs. Datacenter GPUs are supported for production, and support for GeForce and Workstation GPUs is alpha quality. Please refer to the Datacenter, NVIDIA RTX, and GeForce product tables for more details (Turing and above have compute capability of 7.5 or greater).
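The support rule in this answer can be written as a short lookup. This is a rough sketch, assuming the compute capability threshold of 7.5 and the two support tiers stated above; the function name, the product-line labels, and the returned strings are hypothetical and chosen only for illustration.

```python
from typing import Tuple

# "Turing and above have compute capability of 7.5 or greater", per the answer above.
MIN_COMPUTE_CAPABILITY = (7, 5)

# Hypothetical labels for the product lines named in the article.
PRODUCT_LINES = ("datacenter", "geforce", "workstation")


def open_module_support(compute_capability: Tuple[int, int], product_line: str) -> str:
    """Classify support for the open kernel modules as described for R515."""
    if product_line not in PRODUCT_LINES:
        raise ValueError(f"unknown product line: {product_line!r}")
    if compute_capability < MIN_COMPUTE_CAPABILITY:
        # Pre-Turing GPUs stay on the closed-source modules.
        return "unsupported"
    # Data center GPUs are production ready; GeForce and Workstation are alpha.
    return "production" if product_line == "datacenter" else "alpha"


if __name__ == "__main__":
    print(open_module_support((8, 0), "datacenter"))
    print(open_module_support((8, 6), "geforce"))
    print(open_module_support((6, 1), "geforce"))
```

The product tables referenced in the answer remain the authoritative source for any individual GPU.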

How to report bugs

Problems can be reported through the GitHub repository issue tracker or through our existing end-user support forum. Please report security issues through the channels listed on the GitHub repository security policy.

What is the process for patch submission and SLA/CLA for patches?

We encourage community submissions through pull requests on the GitHub page. Submitted patches will be reviewed and, if approved, integrated (possibly with modifications) into a future driver release. Please refer to the NVIDIA driver lifecycle document.

The published source code is a snapshot generated from a shared codebase, so contributions may not be reflected as separate Git commits in the GitHub repo. We are working on a process for acknowledging community contributions. For the same reason, we also advise against significant reformatting of the code.

The process for submitting pull requests is described on our GitHub page and such contributions are covered under the Contributor License Agreement.
Source: NVIDIA

35 Comments on NVIDIA Releases Open-Source GPU Kernel Modules

#1
Solaris17
Dainty Moderator
wow no shit, how long has it been since linus got mad at them? at least a decade.
#2
DrCR
Solaris17: wow no shit, how long has it been since linus got mad at them? at least a decade.
Yeah, this fits into the ‘about time’ category. I’m presuming there was an enterprise sector that started demanding this.
#3
windwhirl
Solaris17: wow no shit, how long has it been since linus got mad at them? at least a decade.
Mandatory mind-refresher for those that don't know/remember :) :D :laugh:


Almost. Apparently the famous scene of Torvalds giving them the middle finger seems to have happened at a talk in Finland (?), on June 14, 2012.
#4
prtskg
DrCR: Yeah, this fits into the ‘about time’ category. I’m presuming there was an enterprise sector that started demanding this.
Or was it due to lapsus group demanding it?
#5
Ferrum Master
Only Ampere and Turing... yeah thanks nvidia... F*** you too...
#6
bug
Ferrum Master: Only Ampere and Turing... yeah thanks nvidia... F*** you too...
I know. Because when AMD did it, they supported everything since Rage3D, right? :wtf:

Unlike AMD, Nvidia has some cards working with the new driver since day 1. Unfortunately for us, these are datacenter SKUs. It gives you an idea who gave them the business case to go this route.

There are more steps to come before home users can reap the benefits, but the most important step is now behind us :clap:
#7
R-T-B
Solaris17: wow no shit, how long has it been since linus got mad at them? at least a decade.
I really never thought I'd see the day they open sourced their binary linux drivers... this is awesome news.

Can't wait for something supporting my ampere card... oh, and for linux to finally support HDR...

*goes back to waiting...
#8
Ferrum Master
bug: but the most important step is now behind us :clap:
yeah, I won't be buying nvidia for a long while, since it caused me so many Linux headaches.
#9
R-T-B
Ferrum Master: yeah, I won't be buying nvidia for a long while, since it caused me so many Linux headaches.
This is probably the beginning of the end for that.
#10
DrCR
I’m on old hardware so haven’t paid attention recently. For Linux gamers, ATI (I’m old) AMD is the accepted way to go now? That would be a marked difference from when I bought my Kepler.
#11
Ferrum Master
R-T-B: This is probably the beginning of the end for that.
I am a Pascal user... can't see that. I even gazed at a new nouveau error at boot just yesterday, probably because of a kernel update. Basically, you can't have a secure updated environment, having up to date kernel with all patches and having not so new nvidia card, something always goes haywire at unexpected moments.

Well, I agree, that it is a move into the right direction, but I will hold my breath till some insiders will tell what actually nvidia made open sauce and what are the drawbacks, there are always some.

Well, the reason Torvalds showed the middle finger still won't be solved either way. I will laugh hard if the heterogeneous intel+nvidia system code actually isn't shared also and older solutions are still badly implemented.
#12
bug
DrCR: I’m on old hardware so haven’t paid attention recently. For Linux gamers, ATI (I’m old) AMD is the accepted way to go now? That would be a marked difference from when I bought my Kepler.
For gaming Nvidia is still the better option (the performance gap has narrowed). AMD's drivers adhere to Mesa, making them a better choice for Wayland. It's still a case of "pick your poison", but I imagine it won't be in a couple of years or so.
#13
trparky
Ferrum Master: Basically, you can't have a secure updated environment, having up to date kernel with all patches and having not so new nvidia card, something always goes haywire at unexpected moments.
You can put some of the blame on the Linux kernel itself, namely in the fact that every time you turn around, they change an internal API thus breaking shit. Now if only the kernel had a stable set of external APIs that developers would be able to rely on not changing from this week to the next, things would be great, but the Linux kernel community is allergic to this idea. Their answer to that issue is to just put your code in the mainline kernel tree and if things happen to break, we'll fix it for you.

But what if you don't want your code to be open source? Oops, sorry. We don't care about you.
#14
bug
trparky: You can put some of the blame on the Linux kernel itself, namely in the fact that every time you turn around, they change an internal API thus breaking shit. Now if only the kernel had a stable set of external APIs that developers would be able to rely on not changing from this week to the next, things would be great, but the Linux kernel community is allergic to this idea. Their answer to that issue is to just put your code in the mainline kernel tree and if things happen to break, we'll fix it for you.

But what if you don't want your code to be open source? Oops, sorry. We don't care about you.
They care a great deal about you, otherwise they wouldn't be maintaining the documentation that lets you keep your code up to date.
#15
Aretak
bug: I know. Because when AMD did it, they supported everything since Rage3D, right? :wtf:
AMDGPU supports everything back to and including the HD 7000 series from 2011. Certainly a hell of a lot better than just the last two product ranges. But hey, you keep on spinning, my dear little fanboy. :)
#16
bug
Aretak: AMDGPU supports everything back to and including the HD 7000 series from 2011. Certainly a hell of a lot better than just the last two product ranges. But hey, you keep on spinning, my dear little fanboy. :)
When AMDGPU was released, HD 7000 was 4 years old. Turing will be 4 years old this September.
#17
trparky
bug: They care a great deal about you, otherwise they wouldn't be maintaining the documentation that lets you keep your code up to date.
I know that this might go off on a tangent here but from what I've heard the biggest barrier to keeping older Android devices updated is that when the kernel changes the Qualcomm modem and chipset drivers COMPLETELY break. Why is that? Because somewhere along the line, something changed in the kernel that broke everything. Now if there was, like I said before, a stable set of APIs that won't change no matter what, this scenario would not happen, Qualcomm drivers would continue to work no matter what kernel version you have, and even the likes of an old Galaxy S8 would be able to be updated without any issues at all.

But since we don't live in that happy world we have a massive eWaste problem with hundreds of thousands of old Android phones being tossed out every damn year all because oops, the Qualcomm drivers break. I don't know who to blame in this situation, but something needs to be done about it.
#18
bug
trparky: I know that this might go off on a tangent here but from what I've heard the biggest barrier to keeping older Android devices updated is that when the kernel changes the Qualcomm modem and chipset drivers COMPLETELY break. Why is that? Because somewhere along the line, something changed in the kernel that broke everything. Now if there was, like I said before, a stable set of APIs that won't change no matter what, this scenario would not happen, Qualcomm drivers would continue to work no matter what kernel version you have, and even the likes of an old Galaxy S8 would be able to be updated without any issues at all.

But since we don't live in that happy world we have a massive eWaste problem with hundreds of thousands of old Android phones being tossed out every damn year all because oops, the Qualcomm drivers break. I don't know who to blame in this situation, but something needs to be done about it.
First of all, the drivers don't break "completely" (otherwise Lineage wouldn't be able to bring support back from the dead from time to time), it's just that Qualcomm (and others) disband their support teams as soon as they have another chipset on the market they'd rather sell you.
Second, this hasn't been an issue since Project Treble, over 2 years ago. The OS now has an abstraction layer on top of everything that's driver related.
Third, trying to argue policies governing one of the most successful projects on the planet are wrong is kinda self-defeating, don't you think? I mean, sure, stable APIs have advantages, but it's not like they don't come with drawbacks of their own.
#19
trparky
bug: it's just that Qualcomm (and others) disband their support teams as soon as they have another chipset on the market they'd rather sell you.
OK then, let's blame Qualcomm here. They'd rather rake in the money hand over fist than keep their old stuff up to date. Yes, I know... It's business, it's nothing personal. But seriously, it's contributing to a massive eWaste issue that I don't see being solved any time soon.
bug: Second, this hasn't been an issue since Project Treble, over 2 years ago. The OS now has an abstraction layer on top of everything that's driver related.
That's not solved all the issues, there's still a lot of older devices that aren't being supported properly. Samsung devices being the worst of them all.
bug: Third, trying to argue policies governing one of the most successful projects on the planet are wrong is kinda self-defeating, don't you think? I mean, sure, stable APIs have advantages, but it's not like they don't come with drawbacks of their own.
I don't know man, it's pretty nice that a printer driver written for Windows 7 works perfectly fine for Windows 10/11.
#20
windwhirl
trparky: I know that this might go off on a tangent here but from what I've heard the biggest barrier to keeping older Android devices updated is that when the kernel changes the Qualcomm modem and chipset drivers completely break.
The question is why do they break when everything else on any other platform doesn't seem to break? Either Google is at fault for whatever the hell they do or Qualcomm just flipped the middle finger at everyone every time they released a new SoC. In any case, if there's something breaking in the code, that'd have been Google's fault, since they shipped the actual Android release (Android might be based on Linux, but Google introduces modifications in it):



This was a mess, so they later introduced Treble with devices that shipped from factory with Android 8.0. Devices that shipped with earlier versions and later updated to Android 8 do not count for this:

And finally, Project Mainline is about moving the responsibility of maintaining some core components updated from manufacturers to Google (like the media framework/codecs or the Android Runtime).

So, IMO, you can look at three people for something "breaking":
Google (again, Linux kernel doesn't break stuff save for like once in a blue moon, and those changes are warned months if not years in advance, so if something breaks, it's Google's fault)
SoC designers (because why keep working on old products)
OEMs (lazy cheap bums all of them)

EDIT: Well, I got ninja'ed :laugh:
#21
trparky
windwhirl: OEMs (lazy cheap bums all of them)
We have to toss the carriers onto the firepit as well.
#22
prtskg
DrCR: I’m on old hardware so haven’t paid attention recently. For Linux gamers, ATI (I’m old) AMD is the accepted way to go now? That would be a marked difference from when I bought my Kepler.
A quick read at Phoronix says that AMD is the better choice now whether it's gaming performance or stability. Nvidia has advantage in ray tracing and better compute performance thanks to CUDA.
#23
bug
trparky: I don't know man, it's pretty nice that a printer driver written for Windows 7 works perfectly fine for Windows 10/11.
With the obvious drawback that Windows keeps several implementations around for everything (several WDM revisions) and each and every one is a potential attack vector for hackers, so they have to maintain them all. Not to mention it makes Windows pretty much bloatware.
#24
trparky
bug: With the obvious drawback that Windows keeps several implementations around for everything (several WDM revisions) and each and every one is a potential attack vector for hackers, so they have to maintain them all. Not to mention it makes Windows pretty much bloatware.
I get it, backwards compatibility can be a damned if you do, damned if you don't kind of situation; I understand that.

But let's face the facts here. Microsoft has tried multiple times to remove backwards compatibility support for older hardware in the past only to be faced with the Internet equivalent of an angry mob with pitchforks and torches. Case in point, Windows 11's removal of support for older processors. They received ungodly amounts of blowback for that. Truly a no-win scenario for Microsoft.
#25
bug
trparky: I get it, backwards compatibility can be a damned if you do, damned if you don't kind of situation; I understand that.

But let's face the facts here. Microsoft has tried multiple times to remove backwards compatibility support for older hardware in the past only to be faced with the Internet equivalent of an angry mob with pitchforks and torches. Case in point, Windows 11's removal of support for older processors. They received ungodly amounts of blowback for that. Truly a no-win scenario for Microsoft.
I'm not denying any of that. I'm just saying, they're different approaches and neither is objectively better than the other. Your wish for stable APIs in the Linux kernel just sounds like a "grass is greener on the other side" argument to me.