NVIDIA Releases CUDA Toolkit 4.1

NVIDIA today released a new version of its CUDA parallel computing platform, which will make it easier for computational biologists, chemists, physicists, geophysicists, other researchers, and engineers to advance their simulations and computational work by using GPUs.

The new NVIDIA CUDA parallel computing platform features three key enhancements that make parallel programming with GPUs easier, more accessible and faster. These include:

- Re-designed Visual Profiler with automated performance analysis, providing an easier path to application acceleration
- New compiler, based on the widely used LLVM open-source compiler infrastructure, delivering up to a 10 percent speedup in application performance
- Hundreds of new imaging and signal processing functions, doubling the size of the NVIDIA Performance Primitives (NPP) library



"The new visual profiler is amazing," said Joshua Anderson, lead developer of the HOOMD-blue open source molecular dynamics project. "With just a few clicks, it performs an automated performance analysis of your application, highlights likely problem areas, and then provides links to best-practice suggestions on improving them. It makes it quick and easy for virtually all developers to accelerate a broad range of applications."

"The LLVM compiler gave me an almost immediate 10 percent performance speedup, just by recompiling my existing real-time financial risk analysis code," said Gilles Civario, senior software architect at the Irish Centre for High-End Computing. "I can only imagine the additional performance gains I can achieve with additional tuning using the new CUDA release."

Among the new features of the latest CUDA parallel computing platform release - available free of charge on the NVIDIA developer web site at http://developer.nvidia.com/getcuda - are:

New Visual Profiler - Easiest path to performance optimization

The new Visual Profiler makes it easy for developers at all experience levels to optimize their code for maximum performance. Featuring automated performance analysis and an expert guidance system that delivers step-by-step optimization suggestions, the Visual Profiler identifies application performance bottlenecks and recommends actions, with links to the optimization guides. With the new Visual Profiler, developers can easily identify performance bottlenecks and act on them.

LLVM Compiler - Instant 10 percent increase in application performance

LLVM is a widely-used open-source compiler infrastructure featuring a modular design that makes it easy to add support for new programming languages and processor architectures. Using the new LLVM-based CUDA compiler, developers can achieve up to 10 percent additional performance gains on existing GPU-accelerated applications with a simple recompile. In addition, LLVM's modular design allows third-party software tool developers to provide a custom LLVM solution for non-NVIDIA processor architectures, enabling CUDA applications to run across NVIDIA GPUs, as well as those from other vendors.

New Image, Signal Processing Library Functions - "Drop-in" Acceleration with NPP Library

NVIDIA has doubled the size of its NPP library, with the addition of hundreds of new image and signal processing functions. This enables virtually any developer using image or signal processing algorithms to easily gain the benefit of GPU acceleration, with the simple addition of library calls into their application. The updated NPP library can be used for a wide variety of image and signal processing algorithms, ranging from basic filtering to advanced workflows.

LLVM Compiler w00t!

:rockout:

"In addition, LLVM's modular design allows third-party software tool developers to provide a custom LLVM solution for non-NVIDIA processor architectures, enabling CUDA applications to run across NVIDIA GPUs, as well as those from other vendors."
CUDA on AMD?! :respect:
 
"LLVM Compiler w00t! :rockout: CUDA on AMD?! :respect:"

Call me shocked! It really must be 2012... and free of charge?!

I hope AMD takes advantage of this, but for some reason I don't find it likely... but who knows?
 
"I hope AMD takes advantage of this, but for some reason I don't find it likely... but who knows?"

It's not AMD who is going to take advantage of this; it is the software devs :laugh:
 
NVIDIA offering CUDA free of charge is a move of desperation. The more developers support it the better overall for NVIDIA. And how do you attract new developers and corporations to CUDA? By giving it away free. Good move by NVIDIA finally, but I don’t see AMD and Intel jumping in.
 
If they repeat the same things enough times, people will believe it.

How's that OpenCL working for you, Nvidia?
 
Yeah because OpenCL has really taken off. :laugh:
 
Exactly.


On one hand we have a completely open standard; on the other, CUDA, an extension of X87 run on GPU cores. And now they have finally released an updated product after how long?
 
That's not exactly true, Steevo...

First of all, this open standard belongs to Apple and they license it to Khronos. From the Khronos webpage:

OpenCL is a trademark of Apple Inc., and is used under license by Khronos. The OpenCL logo and guidelines for its usage in association with Conformant products can be found here: http://developer.apple.com/softwarelicensing/agreements/opencl.html

If Khronos loses its license, or Apple sells OpenCL to someone, or Khronos loses funding, or any of the many other things that could happen, we could see OpenCL just die. The "Apple" part is of much concern to me.

It took a full year for Khronos to finally update OpenCL to version 1.2, and still the implementation lacks serious functionality for larger developers (like Adobe and the like) to have any real use for it. And with such a crawlingly slow development cycle, there is little interest from developers, because they can't wait for years to get the functionality they need.

Second, AMD also has its Close to Metal/Stream/APP (who knows what other names they'll give the technology), and OpenCL is built on that tech just as OpenCL is built on CUDA. In this respect AMD and NVIDIA support OpenCL in the same way with the exact same model.

Also, there are other standards, and they are in a way all competing with each other, for example BrookGPU, NPP and many others. You can't expect companies to support just one standard when there are so many more, especially when the development cycle is so slow.

You can look at Linux and how much fragmentation there is in that market. At this point "Linux" is just an umbrella term covering hundreds of operating systems. The versions of Linux that were updated frequently and included the features users actually need survived and grew their userbase.

At this time OpenCL is like an infant Linux distro that has a poor update cycle and does not include the features its userbase would require to start building applications on top of it.

And so developers will just use the next best thing, and most of the time that is CUDA (and its additional supporting libraries, which are growing in number and in "openness"), more than Stream/APP.
 

Cheeseball

"NVIDIA offering CUDA free of charge is a move of desperation. The more developers support it the better overall for NVIDIA. And how do you attract new developers and corporations to CUDA? By giving it away free. Good move by NVIDIA finally, but I don’t see AMD and Intel jumping in."

It's not really a move of desperation, considering that most of OpenCL's (1.2) functions are actually branched off CUDA. That's why it's super simple to convert from CUDA to OpenCL and vice versa. The new Context functions and Directives are exactly the same as in CUDA 4.0.
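
For anyone curious how close the two actually are on the kernel side, here is a minimal Python sketch of the vocabulary lining up for a 1-D kernel. The table only lists well-known CUDA C and OpenCL C equivalents; the helper name `cuda_to_opencl` and the sample `scale` kernel are made up for illustration. It is a plain text substitution, not a real translator: it does not add the `__global` address-space qualifier that OpenCL needs on pointer arguments, and it ignores the host-side API completely.

```python
import re

# Well-known kernel-side equivalents between CUDA C and OpenCL C (x dimension only).
CUDA_TO_OPENCL = {
    "__global__": "__kernel",                          # kernel entry-point qualifier
    "__shared__": "__local",                           # per-block / per-work-group memory
    "__syncthreads()": "barrier(CLK_LOCAL_MEM_FENCE)",  # block-wide barrier
    "threadIdx.x": "get_local_id(0)",
    "blockIdx.x": "get_group_id(0)",
    "blockDim.x": "get_local_size(0)",
    "gridDim.x": "get_num_groups(0)",
}

# Match longer tokens first so no token can be shadowed by a shorter one.
_PATTERN = re.compile(
    "|".join(re.escape(tok) for tok in sorted(CUDA_TO_OPENCL, key=len, reverse=True))
)


def cuda_to_opencl(source):
    """Hypothetical helper: swap CUDA kernel keywords for their OpenCL names.

    This is text substitution only. The result still needs manual fixes
    (for example `__global` on pointer parameters) before it is valid OpenCL C.
    """
    return _PATTERN.sub(lambda m: CUDA_TO_OPENCL[m.group(0)], source)


# Illustrative CUDA kernel (assumed for this sketch, not from the CUDA samples).
cuda_src = (
    "__global__ void scale(float *data, float factor, int n) { "
    "int i = blockIdx.x * blockDim.x + threadIdx.x; "
    "if (i < n) data[i] *= factor; }"
)

print(cuda_to_opencl(cuda_src))
```

The index expression that comes out, `get_group_id(0) * get_local_size(0) + get_local_id(0)`, is what OpenCL exposes directly as `get_global_id(0)` when no global offset is used, which gives a feel for how similar the two execution models are.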

AMD (not really Intel) needs to step up its game, since Stream is not going anywhere at all.