Thursday, November 14th 2013

NVIDIA Dramatically Simplifies Parallel Programming With CUDA 6

NVIDIA today announced NVIDIA CUDA 6, the latest version of the world's most pervasive parallel computing platform and programming model.

The CUDA 6 platform makes parallel programming easier than ever, enabling software developers to dramatically decrease the time and effort required to accelerate their scientific, engineering, enterprise and other applications with GPUs.

It offers new performance enhancements that enable developers to instantly accelerate applications up to 8X by simply replacing existing CPU-based libraries. Key features of CUDA 6 include:
  • Unified Memory -- Simplifies programming by enabling applications to access CPU and GPU memory without the need to manually copy data from one to the other, and makes it easier to add support for GPU acceleration in a wide range of programming languages.
  • Drop-in Libraries -- Automatically accelerates applications' BLAS and FFTW calculations by up to 8X by simply replacing the existing CPU libraries with the GPU-accelerated equivalents.
  • Multi-GPU Scaling -- Re-designed BLAS and FFT GPU libraries automatically scale performance across up to eight GPUs in a single node, delivering over nine teraflops of double precision performance per node, and supporting larger workloads than ever before (up to 512 GB). Multi-GPU scaling can also be used with the new BLAS drop-in library.
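To make the "drop-in" idea concrete, here is a minimal sketch (not NVIDIA sample code): the host program calls the standard Fortran-convention BLAS dgemm, and pointing the link step (or LD_PRELOAD) at the GPU drop-in BLAS instead of the CPU BLAS is the entire port. It assumes a BLAS library and, for the accelerated case, the CUDA 6 toolkit.

```c
/* Unmodified host code: C = alpha*A*B + beta*C via standard BLAS.
 * Swapping the linked BLAS for the GPU drop-in equivalent requires
 * no changes to this source. */
#include <stdio.h>
#include <stdlib.h>

/* Standard Fortran-convention BLAS entry point. */
extern void dgemm_(const char *transa, const char *transb,
                   const int *m, const int *n, const int *k,
                   const double *alpha, const double *a, const int *lda,
                   const double *b, const int *ldb,
                   const double *beta, double *c, const int *ldc);

int main(void) {
    const int n = 1024;
    const double alpha = 1.0, beta = 0.0;
    double *a = calloc((size_t)n * n, sizeof *a);
    double *b = calloc((size_t)n * n, sizeof *b);
    double *c = calloc((size_t)n * n, sizeof *c);
    for (int i = 0; i < n; ++i)
        a[i * n + i] = b[i * n + i] = 1.0;  /* identity matrices */

    dgemm_("N", "N", &n, &n, &n, &alpha, a, &n, b, &n, &beta, c, &n);

    printf("c[0][0] = %.1f\n", c[0]);  /* identity * identity: 1.0 */
    free(a); free(b); free(c);
    return 0;
}
```

Built against a CPU BLAS this runs on the host; linked against the drop-in library the same call runs on the GPU, which is what the claimed up-to-8X speedup refers to.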
"By automatically handling data management, Unified Memory enables us to quickly prototype kernels running on the GPU and reduces code complexity, cutting development time by up to 50 percent," said Rob Hoekstra, manager of Scalable Algorithms Department at Sandia National Laboratories. "Having this capability will be very useful as we determine future programming model choices and port more sophisticated, larger codes to GPUs."

"Our technologies have helped major studios, game developers and animators create visually stunning 3D animations and effects," said Paul Doyle, CEO at Fabric Engine, Inc. "They have been urging us to add support for acceleration on NVIDIA GPUs, but memory management proved too difficult a challenge when dealing with the complex use cases in production. With Unified Memory, this is handled automatically, allowing the Fabric compiler to target NVIDIA GPUs and enabling our customers to run their applications up to 10X faster."

In addition to the new features, the CUDA 6 platform offers a full suite of programming tools, GPU-accelerated math libraries, documentation and programming guides.

Version 6 of the CUDA Toolkit is expected to be available in early 2014. Members of the CUDA-GPU Computing Registered Developer Program will be notified when it is available for download. To join the program, register here.

For more information about the CUDA 6 platform, visit NVIDIA booth 613 at SC13, Nov. 18-21 in Denver, and the NVIDIA CUDA website.

48 Comments on NVIDIA Dramatically Simplifies Parallel Programming With CUDA 6

#2
Recus
by: DaedalusHelios
So they used AMD's solutions in a movie... that doesn't bring anything to this thread.
I bet he hasn't seen the movie and "effects".
#3
theoneandonlymrk
by: radrok
Many people don't realize how entrenched CUDA is becoming in professional software. I suggest you take a look at the CUDA developer zone before crapping into threads, with the consequent humiliation of showing how clueless you are.

CUDA can be used and shown; AMD's implementations are just on paper, so I don't get how people can draw conclusions lol.
CUDA works fine, I agree, but your assertion that AMD's are only on paper is just ridiculous. Have you not heard of OpenCL? I've folded on AMD GPUs for years, and they are also fully OpenCL and DirectCompute compatible.
#4
Patriot
by: Recus
More layoffs coming to AMD

Just quick reply for games:
Mantle is coming to Star Citizen as well...
No one platform is being targeted; he wants the game to use everything available in PC performance...
#5
radrok
by: theoneandonlymrk
CUDA works fine, I agree, but your assertion that AMD's are only on paper is just ridiculous. Have you not heard of OpenCL? I've folded on AMD GPUs for years, and they are also fully OpenCL and DirectCompute compatible.
You are talking like OpenCL is something AMD brought to the table.
#6
theoneandonlymrk
by: radrok
You are talking like OpenCL is something AMD brought to the table.
How so? I said AMD is compatible, exactly,

and implied it was usable to the same ends as CUDA, but went nowhere near what you're saying. OpenCL, hello.
#8
nem
It wouldn't surprise me if tomorrow Mantle turned out to be Microsoft's inspiration for DirectX 12, just as tessellation was for DirectX 11 :D

I know my English is bad :pimp:
#10
Serpent of Darkness
by: Recus
Just quick reply for games:
"EverQuest Next will use multiple PhysX SDK and APEX features, as well as support GPU physics acceleration. "

http://s15.postimg.org/w3yli1v6j/citizen.jpg
Still waiting for you to provide the NVidia PowerPoint slide on EQN. Last I heard, EQN is using Havok. There's no point in having PhysX and APEX: it's proprietary, in some games it creates more problems than it's worth (i.e. Planetside 2), and it drops performance.

CynicalCyanide, a founding VIP member of RSI, said the following in the RSI Forums thread titled "GUIDE TO BUYING PC GAMING HARDWARE," under GPU NOTES.

"#3 GPU Compute: The AMD cards slaughter Nvidia Kepler cards for most GPU computing. This probably doesn’t matter to you, but if you use your GPU for OpenCL, Bitcoin mining etc, AMD is the clear winner here." Cyanide.

Source:
CynicalCyanide, GUIDE TO BUYING PC GAMING HARDWARE, Roberts Space Industries, Jan 27 2013, Nov 11 2013, https://forums.robertsspaceindustries.com/discussion/15249/guide-to-buying-pc-gaming-hardware.

Point is this: he doesn't "directly" state that Star Citizen is optimized for AMD graphics card users, but he indicates that AMD, paraphrasing his words, performs better than NVidia Kepler in the computing department. This is a discussion of the best, ideal hardware for the upcoming game. Assuming this is the universal consensus among RSI members, I think it's safe to say they will probably lean more towards AMD, but will take the needs of NVidia users into consideration when the MMO goes live. PhysX will do absolutely nothing for the game, so NVidia users will only benefit from NVidia products, with respect to the game, in a GPU computing scenario. It won't be anything more than that, and G-Sync may help with performance, but I'm not putting a lot of faith in that. On the other hand, Cyanide does state that NVidia would be more ideal for Surround, and I won't disagree with that. I can see Star Citizen being a big-name MMO if they push massive flight battles with massive battleships and carriers, though multiple players would need to control them. It's taking MMO extremes to the next level.
#11
Recus
by: Serpent of Darkness
Still waiting for you to provide the NVidia PowerPoint Slide on EQN. [...]
#12
BiggieShady
by: Death Star
OpenCL > CUDA.
by: Cheeseball
This would make sense if CUDA wasn't portable to OpenCL, but then again, it is. If anything, CUDA = OpenCL + OpenCL "extensions", much like OpenGL extensions.

The only problem is that AMD cards work better with float4, which is not the best case for many GPGPU applications.
This. People don't realize how similar CUDA and OpenCL are, and how portable they are. The biggest mindfuck is the difference in terminology:

C for CUDA terminology    OpenCL terminology
Thread                    Work-item
Thread block              Work-group
Global memory             Global memory
Constant memory           Constant memory
Shared memory             Local memory
Local memory              Private memory
Porting your CUDA applications to OpenCL™ is often simply a matter of finding the equivalent syntax for various keywords and built-in functions in your kernel. You also need to convert your runtime API calls to the equivalent calls in OpenCL™.
That's it; an almost-automatic converter could be written.
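As a sketch of how mechanical that mapping is, here is the same trivial kernel in both dialects (the OpenCL version shown as a comment, since it is normally handed to the driver as a source string):

```cuda
// CUDA C: one thread per element; the index is computed from the
// thread's position in its thread block and the block's position in the grid.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

/* OpenCL C equivalent: a work-item inside a work-group, with the same
 * index arithmetic collapsed into get_global_id(0):
 *
 *   __kernel void saxpy(int n, float a,
 *                       __global const float *x, __global float *y) {
 *       int i = get_global_id(0);
 *       if (i < n) y[i] = a * x[i] + y[i];
 *   }
 */
```

The runtime side differs more: a CUDA `<<<grid, block>>>` launch becomes clSetKernelArg plus clEnqueueNDRangeKernel, but the translation is still essentially one-to-one.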
#13
Fluffmeister
Another reason not to hate on CUDA frankly. Presumably nVidia actually offer ongoing support into the bargain too.
#14
radrok
The only reason to hate CUDA is that it's proprietary; other than that, it's brilliant.

Way easier than OpenCL, imho.
#15
Fluffmeister
The reason it's brilliant is because it is proprietary; companies are nothing without their IP, although on this forum it seems to be accepted that they should share everything.

Proprietary is used like some dirty buzzword people like to sling around as it suits them; people need to get real and understand how big business operates.

Of course, if any genius here has some great money-making ideas, feel free to share them with me first. I'm all for piggybacking and taking advantage of others' hard work.
#16
radrok
Dude, proprietary hurts the collective because in this particular situation it helps a company establish a monopoly.

I could NOT care less about Nvidia's business; what I care about is a healthy market with good competition.

So effin' yes, proprietary is a con in this case.
#17
Fluffmeister
by: radrok
Dude, proprietary hurts the collective because in this particular situation it helps a company establish a monopoly.

I could NOT care less about Nvidia's business; what I care about is a healthy market with good competition.

So effin' yes, proprietary is a con in this case.
Luckily for you, AMD have put their weight behind OpenCL, which probably explains why it lags behind.

I bet you can't sleep at night with all these evil monopolies out there.

Again, any great money making ideas, let me know.
#18
radrok
by: Fluffmeister
Luckily for you AMD have put their weight behind OpenCL, which probably explains why it lags behind.

I bet you can't sleep at night with all these evil monopolies out there.

Again, any great money making ideas, let me know.
Chances are that if you didn't come up with the idea to begin with, you wouldn't be able to make a profit from it even if spoon-fed, and the way you're grasping the core of this conversation makes my point.

Don't worry about my sleeping schedule, stay classy instead.
#19
HumanSmoke
by: radrok
The only reason to hate CUDA is that it's proprietary; other than that, it's brilliant.
Catch-22 situation. People hate proprietary, but proprietary tech moves faster from development to implementation. It has better funding, a more cohesive ecosystem (applications, hardware, utilities, marketing) and an organized cadence between those facets.
Open source, by its very definition, has a protracted review and ratification timeline, as with anything "designed by committee".

OpenCL would be a prime example. How long between revisions... a year and a half between 1.0 and 1.1, and another year and a half between 1.1 and 1.2? How long to evolve from a concept to widespread uptake... five years plus?
Without CUDA, where would we be with GPGPU? GPU-based parallelized computing may be chic in 2013, but that wouldn't have helped Nvidia stay solvent via the prosumer/workstation/HPC markets back when the G80 was introduced... and without the revenue from the pro markets being ploughed back into R&D, it isn't much of a stretch to think that without Nvidia creating a market, AMD might be the only show in town for all intents and purposes.
We would likely be closer to a monopoly situation (with AMD's discrete graphics) had Nvidia not poured cash into CUDA in 2004.
#20
ViperXTR
Hmm, from what I've read, it still copies from system to GPU memory or vice versa, yes? It just removes the manual copying and automates it?
#21
BiggieShady
by: ViperXTR
Hmm, from what I've read, it still copies from system to GPU memory or vice versa, yes? It just removes the manual copying and automates it?
Yes, the memory buffer is still automatically copied over PCI-E; this only simplifies coding.
#22
Serpent of Darkness
by: Recus
http://s21.postimg.org/8ytp6nlw7/Capture.png
You're putting your faith and trust in Wikipedia. LOL Epic!

I wouldn't put a lot of faith in that. Rumors I've heard are that EQN will be on the DX11.0 API, and it won't be client-based like PS2. PS2 is currently DX9.0 and client-based. Ultra mode pushes almost 10 GB of system RAM. In addition, it's debatable to say that NVidia hasn't put a lot of attention on it, since it is "one" of their optimized games, but it looks a lot purty-er and better on AMD cards. The use of any SweetFX injectors could get you banned from PS2...

Let's say it's true, and it probably is true (ForgeLite engine: D3D9.0, PhysX, NVidia optimized, "probably" client-based)... they do use NVidia for EverQuest Next. That still doesn't speak highly of ForgeLite or EQN. Why? Well, for instance, take a look at PS2's OMFG patch. OMFG ≠ Oh My F***en Gawd Patch; OMFG = Oh Make Faster Game Patch. A lot of NVidia users are still having issues, and the PS2 devs have disabled PhysX particle effects so they can work out the issues "further." Here's my point: if EQN follows the same or a similar trend, are you going to argue to me, in the first year of EQN, that it won't be proportionate in some way to PS2's derps and fails? Answer no and you're trolling, but if you answer yes, some points might be valid and others could be up for debate. That's the likely outcome. I doubt AMD users will have as many headaches as NVidia users if that's the case. It's a plausible scenario.

I'll admit error on my end about EQN; there's no shame in that. I think it's unnecessary to debate any further whether you were wrong about Star Citizen. Still, like a dumb Republican who can't answer a question directly, I'm still waiting for that NVidia PowerPoint slide from you that says EQN will be optimized for their products.
#23
CoD511
Nvidia is far from desperate, I'd say, just from a bare glance at them; they're seemingly not caring even in the slightest for the most part. Saying CUDA is useless due to not being open source I find ridiculous, considering how many companies and compute setups feature it, while AMD have barely entered the compute scene as of yet, honestly.

Also, I don't get why people are saying they're desperately cheapening themselves after the Titan; it remains the same price and always will, due to its unique position as an entry-level compute card from Nvidia. The 780 near enough matched the 290X, and the Ti was hardly desperation; I'd think of it more as getting rid of silicon that didn't make the cut for a K40 and making a small profit, perhaps, in their proportionally minor gaming branch... the near-instantaneous launch shows it was in the works well before the 290X release too... but sigh, not sure why I bother :P