
NVIDIA Releases CUDA ToolKit Version 2.2

malware

NVIDIA announced today it has released version 2.2 of the CUDA Toolkit and SDK for GPU Computing. This latest release supports several significant new features that deliver a major leap forward in getting the most performance out of NVIDIA's massively parallel CUDA-enabled GPUs. In addition, version 2.2 of the CUDA Toolkit includes support for Windows 7, the upcoming OS from Microsoft that embraces GPU Computing.
Developers can download the latest CUDA Toolkit, SDK, and drivers now from here.


Additional new features in CUDA Toolkit 2.2 include:
  • Visual Profiler for the GPU
    The most common step in tuning application performance is profiling the application and then modifying the code. The CUDA Visual Profiler is a graphical tool that enables the profiling of C applications running on the GPU. This latest release of the CUDA Visual Profiler includes metrics for memory transactions, giving developers visibility into one of the most important areas they can tune to get better performance.
  • Improved OpenGL Interop
    Delivers improved performance for Medical Imaging and other OpenGL applications running on Quadro GPUs when computing with CUDA and rendering OpenGL graphics functions are performed on different GPUs.
  • Texture from Pitch Linear Memory
    Delivers up to 2x bandwidth savings for video processing applications. A code sketch follows this list.
  • Zero-copy
    Enables streaming media, video transcoding, image processing and signal processing applications to realize significant performance improvements by allowing CUDA functions to read and write directly from pinned system memory. This reduces the frequency and amount of data copied back and forth between GPU and CPU memory. Supported on MCP7x and GT200 and later GPUs. A zero-copy sketch follows this list.
  • Pinned Shared Sysmem
    Enables applications that use multiple GPUs to achieve better performance and use less total system memory by allowing multiple GPUs to access the same data in system memory. Typical multi-GPU systems include Tesla servers, Tesla Personal Supercomputers, workstations using QuadroPlex deskside units and consumer systems with multiple GPUs. A portable pinned memory sketch follows this list.
  • Asynchronous memcopy on Vista
    Allows applications to realize significant performance improvements by copying memory asynchronously. This feature was already available on other supported platforms but is now available on Vista. An asynchronous copy sketch follows this list.
  • Hardware Debugger for the GPU
    Developers can now use a hardware-level debugger on CUDA-enabled GPUs that offers the simplicity of the popular open-source GDB debugger yet enables a developer to easily debug a program that is running thousands of threads on the GPU. This CUDA-GDB debugger for Linux has all the features required to debug directly on the GPU, including the ability to set breakpoints, watch variables, inspect state, etc. A small example follows this list.
  • Exclusive Device Mode
    This system configuration option allows an application to get exclusive use of a GPU, guaranteeing that 100% of the processing power and memory of the GPU will be dedicated to that application. Multiple applications can still be run concurrently on the system, but only one application can make use of each GPU at a time. This configuration is particularly useful on Tesla cluster systems where large applications may require dedicated use of one or more GPUs on each node of a Linux cluster. A compute-mode query sketch follows this list.
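
For illustration, here is a minimal sketch of the pitch-linear texturing path using the texture-reference API of that era (the kernel name, frame size and 3-tap filter are made up for the example): cudaBindTexture2D binds a texture straight to a cudaMallocPitch buffer, so a video frame never has to be staged through a cudaArray.

```cuda
#include <cuda_runtime.h>

// Legacy texture reference bound straight to pitch-linear device memory.
texture<float, 2, cudaReadModeElementType> texFrame;

// Illustrative 3-tap horizontal filter; reads go through the texture cache
// even though the source data lives in an ordinary cudaMallocPitch buffer.
__global__ void filterRow(float *out, int width, int height, size_t outPitch)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    float v = (tex2D(texFrame, x - 1, y) +
               tex2D(texFrame, x,     y) +
               tex2D(texFrame, x + 1, y)) / 3.0f;
    *((float *)((char *)out + y * outPitch) + x) = v;
}

int main(void)
{
    const int width = 1920, height = 1080;   // made-up frame size

    float *d_frame = NULL, *d_out = NULL;
    size_t pitch = 0, outPitch = 0;
    cudaMallocPitch((void **)&d_frame, &pitch, width * sizeof(float), height);
    cudaMallocPitch((void **)&d_out, &outPitch, width * sizeof(float), height);
    // ... fill d_frame with a decoded video frame here ...

    // Bind the texture directly to the pitch-linear buffer: no staging copy
    // into a cudaArray, which is where the bandwidth saving comes from.
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
    cudaBindTexture2D(NULL, texFrame, d_frame, desc, width, height, pitch);

    dim3 block(16, 16), grid((width + 15) / 16, (height + 15) / 16);
    filterRow<<<grid, block>>>(d_out, width, height, outPitch);
    cudaThreadSynchronize();

    cudaUnbindTexture(texFrame);
    cudaFree(d_frame);
    cudaFree(d_out);
    return 0;
}
```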
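
Zero-copy in practice looks roughly like the sketch below (names and sizes are purely illustrative): the host buffer is allocated pinned and mapped, and the kernel works on it through a device-side alias, so no explicit copies are issued at all.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] *= factor;   // reads and writes go straight to pinned system memory
}

int main(void)
{
    const int n = 1 << 20;   // made-up buffer size

    // Must be set before the CUDA context is created; real code should also
    // check cudaDeviceProp::canMapHostMemory (MCP7x / GT200 and later).
    cudaSetDeviceFlags(cudaDeviceMapHost);

    // Pinned host memory that is also mapped into the GPU's address space.
    float *h_data = NULL;
    cudaHostAlloc((void **)&h_data, n * sizeof(float), cudaHostAllocMapped);
    for (int i = 0; i < n; ++i)
        h_data[i] = (float)i;

    // Device-side alias for the same physical memory: no cudaMemcpy anywhere.
    float *d_data = NULL;
    cudaHostGetDevicePointer((void **)&d_data, h_data, 0);

    scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);
    cudaThreadSynchronize();

    printf("h_data[10] = %f\n", h_data[10]);   // expect 20.0

    cudaFreeHost(h_data);
    return 0;
}
```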
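
The portable pinned ("shared sysmem") allocation is just an extra flag on cudaHostAlloc: one pinned buffer that every GPU in the system can use for fast transfers. A rough multi-GPU sketch, with the buffer size and loop purely illustrative:

```cuda
#include <cuda_runtime.h>
#include <string.h>

int main(void)
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    // One pinned buffer, flagged portable so every GPU (and its context) can
    // use it for fast DMA transfers, instead of each GPU pinning its own copy.
    const size_t bytes = 64 * 1024 * 1024;   // made-up size
    float *h_shared = NULL;
    cudaHostAlloc((void **)&h_shared, bytes, cudaHostAllocPortable);
    memset(h_shared, 0, bytes);              // ... real input data would go here ...

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaSetDevice(dev);

        float *d_buf = NULL;
        cudaMalloc((void **)&d_buf, bytes);

        // Because the host buffer is portable, this is a true pinned transfer
        // on every device, not just the one active when the buffer was allocated.
        cudaMemcpy(d_buf, h_shared, bytes, cudaMemcpyHostToDevice);

        // ... launch per-GPU work on d_buf here ...

        cudaFree(d_buf);
    }

    cudaFreeHost(h_shared);
    return 0;
}
```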
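
Asynchronous memcopy uses the existing stream API; 2.2 simply brings it to Vista. A generic sketch, with the kernel and buffer size invented for the example:

```cuda
#include <cuda_runtime.h>

__global__ void increment(int *d, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        d[i] += 1;
}

int main(void)
{
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(int);

    // Asynchronous copies require page-locked (pinned) host memory.
    int *h_buf = NULL;
    cudaMallocHost((void **)&h_buf, bytes);
    for (int i = 0; i < n; ++i)
        h_buf[i] = i;

    int *d_buf = NULL;
    cudaMalloc((void **)&d_buf, bytes);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Copy in, kernel, copy out are all queued on the stream; the CPU is free
    // to do other work and only blocks at the final synchronize.
    cudaMemcpyAsync(d_buf, h_buf, bytes, cudaMemcpyHostToDevice, stream);
    increment<<<(n + 255) / 256, 256, 0, stream>>>(d_buf, n);
    cudaMemcpyAsync(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost, stream);

    // ... overlap CPU work here ...

    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    cudaFree(d_buf);
    cudaFreeHost(h_buf);
    return 0;
}
```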
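
Nothing special is needed in the source for the hardware debugger; the program is built with device-side debug information and run under cuda-gdb. A trivial, made-up example, with the relevant build and debugger commands noted in the comments:

```cuda
// Build with device-side debug info so cuda-gdb can step into the kernel:
//   nvcc -g -G -o vecadd vecadd.cu
// Then run under the debugger and break on the kernel symbol:
//   cuda-gdb ./vecadd
//   (cuda-gdb) break vecAdd
//   (cuda-gdb) run
#include <cuda_runtime.h>

__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];   // a convenient spot for a breakpoint or a watch on c[i]
}

int main(void)
{
    const int n = 1024;
    float *a, *b, *c;
    cudaMalloc((void **)&a, n * sizeof(float));
    cudaMalloc((void **)&b, n * sizeof(float));
    cudaMalloc((void **)&c, n * sizeof(float));
    cudaMemset(a, 0, n * sizeof(float));
    cudaMemset(b, 0, n * sizeof(float));

    vecAdd<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaThreadSynchronize();

    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```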
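
Exclusive mode itself is configured by an administrator with the nvidia-smi tool, but an application can query which mode each GPU is in through cudaDeviceProp::computeMode, as in this small sketch:

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    // The compute mode is set system-wide by the administrator; the
    // application can only check what it has been given.
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);

        const char *mode =
            prop.computeMode == cudaComputeModeExclusive  ? "exclusive (one context only)" :
            prop.computeMode == cudaComputeModeProhibited ? "prohibited (no contexts)"     :
                                                            "default (shared)";
        printf("GPU %d (%s): compute mode = %s\n", dev, prop.name, mode);
    }
    return 0;
}
```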

View at TechPowerUp Main Site
 
Mmmm, NVIDIA cares a lot about CUDA these days
 
And it's already in use and close to being widely adopted.

ATI, get on that train fast! Your Stream is just a bad joke: its hardware access sits way too high up. Get it lower and more direct to speed things up!

Sorry, I get angry because of all my ATI laptops and their HD capabilities compared to NVIDIA with CUDA.
 
That's because AMD/ATI is waiting for OpenCL to hit final specs, and they will use that instead of CAL/Brook.
 
Yeah, I also read that NVIDIA and Microsoft have been butt-buddies through the making of Windows 7, so I'm betting there's going to be great compatibility with CUDA and applications that utilize it.
 
Then it's a good thing OpenCL is going to be cross-platform and isn't going to be OS-biased.
 
CUDA already has support for OpenCL and DirectX Compute, so there shouldn't be a problem in the general-purpose stream processing department for NVIDIA anymore.
 