News Posts matching "CUDA"

NVIDIA Introduces RAPIDS Open-Source GPU-Acceleration Platform

NVIDIA today announced a GPU-acceleration platform for data science and machine learning, with broad adoption from industry leaders, that enables even the largest companies to analyze massive amounts of data and make accurate business predictions at unprecedented speed.

RAPIDS open-source software gives data scientists a giant performance boost as they address highly complex business challenges, such as predicting credit card fraud, forecasting retail inventory and understanding customer buying behavior. Reflecting the growing consensus about the GPU's importance in data analytics, an array of companies is supporting RAPIDS - from pioneers in the open-source community, such as Databricks and Anaconda, to tech leaders like Hewlett Packard Enterprise, IBM and Oracle.

VUDA is a CUDA-Like Programming Interface for GPU Compute on Vulkan (Open-Source)

GitHub developer jgbit has started an open-source project called VUDA, which takes inspiration from NVIDIA's CUDA API to bring an easily accessible GPU compute interface to the open-source world. VUDA is implemented as a wrapper on top of the highly popular next-gen graphics API Vulkan, which provides low-level access to hardware. VUDA comes as a header-only C++ library, which means it's compatible with all platforms that have a C++ compiler and that support Vulkan.

While the project is still young, its potential is enormous, especially due to its open-source nature (it uses the MIT license). The page on GitHub comes with a (very basic) sample that could be a good start for using the library.

Intel is Adding Vulkan Support to Their OpenCV Library, First Signs of Discrete GPU?

Intel has submitted the first patches with Vulkan support to their open-source OpenCV library, which is designed to accelerate Computer Vision. The library is widely used for real-time applications as it comes with first-class optimizations for Intel processors and multi-core x86 in general. With Vulkan support, existing users can immediately move their neural network workloads to the GPU compute space without having to rewrite their code base.

At this point in time, the Vulkan backend supports Convolution, Concat, ReLU, LRN, PriorBox, Softmax, MaxPooling, AvePooling, and Permute. According to the source code changes, this is just "a beginning work for Vulkan in OpenCV DNN, more layer types will be supported and performance tuning is on the way."

It seems that now, with their own GPU development underway, Intel has found new love for the GPU-accelerated compute space. The choice of Vulkan is also interesting as the API is available on a wide range of platforms, which could mean that Intel is trying to turn Vulkan into a CUDA killer. Of course there's still a lot of work needed to achieve that goal, since NVIDIA has had almost a decade of head start.

NVIDIA "TU102" RT Core and Tensor Core Counts Revealed

The GeForce RTX 2080 Ti is indeed based on an ASIC codenamed "TU102." NVIDIA was referring to this 775 mm² chip when talking about the 18.5 billion-transistor count in its keynote. The company also provided a breakdown of its various "cores," and a block-diagram. The GPU is still laid out like its predecessors, but each of the 72 streaming multiprocessors (SMs) packs RT cores and Tensor cores in addition to CUDA cores.

The TU102 features six GPCs (graphics processing clusters), which each pack 12 SMs. Each SM packs 64 CUDA cores, 8 Tensor cores, and 1 RT core. Each GPC packs six geometry units. The GPU also packs 288 TMUs and 96 ROPs. The TU102 supports a 384-bit wide GDDR6 memory bus, supporting 14 Gbps memory. There are also two NVLink channels, which NVIDIA plans to later launch as its next-generation multi-GPU technology.
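The chip-level totals implied by this hierarchy can be tallied with a short script. This is a minimal sketch that uses only the per-GPC and per-SM figures quoted above; the function name and dictionary keys are illustrative, and the totals are derived arithmetic rather than numbers taken from NVIDIA's breakdown.

```python
def tu102_totals(gpcs=6, sms_per_gpc=12, cuda_per_sm=64,
                 tensor_per_sm=8, rt_per_sm=1):
    """Multiply the per-SM counts quoted in the article up to chip level."""
    sms = gpcs * sms_per_gpc
    return {
        "sms": sms,
        "cuda_cores": sms * cuda_per_sm,
        "tensor_cores": sms * tensor_per_sm,
        "rt_cores": sms * rt_per_sm,
    }


totals = tu102_totals()
for unit, count in totals.items():
    print(f"{unit}: {count}")
```

The 72 SMs this produces match the SM count given for the full TU102 above.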

NVIDIA GeForce RTX 2000 Series Specifications Pieced Together

Later today (20th August), NVIDIA will formally unveil its GeForce RTX 2000 series consumer graphics cards. This marks a major change in the brand name, triggered by the introduction of the new RT Cores, specialized components that accelerate real-time ray-tracing, a task too taxing on conventional CUDA cores. Ray-tracing and DNN acceleration require SIMD components to crunch 4x4x4 matrix multiplication, which is what RT cores (and tensor cores) specialize in. The chips still have CUDA cores for everything else. This generation also debuts the new GDDR6 memory standard, although unlike GeForce "Pascal," the new GeForce "Turing" won't see a doubling in memory sizes.

NVIDIA is expected to debut the generation with the new GeForce RTX 2080 later today, with market availability by the end of the month. Going by older rumors, the company could launch the lower RTX 2070 and higher RTX 2080+ by late September, and the mid-range RTX 2060 series in October. Apparently the high-end RTX 2080 Ti could come out sooner than expected, given that VideoCardz already has some of its specifications in hand. Not a lot is known about how "Turing" compares with "Volta" in performance, but given that the TITAN V comes with tensor cores that can [in theory] be re-purposed as RT cores, it could continue on as NVIDIA's halo SKU for the client-segment.

NVIDIA Releases GeForce 388.71 WHQL Drivers

NVIDIA today released the latest version of their GeForce software suite. Version 388.71 is a game-ready one, which brings the best performance profile for the phenomenon that is PlayerUnknown's Battlegrounds. For professionals, there's added support for CUDA 9.1, and Warframe SLI profiles have been updated. There are also many 3D Vision profiles that have been updated for this release, so make sure to check them out after the break, alongside other bug fixes and known issues.

As always, users can download these drivers right here on TechPowerUp. Just follow the link below.
DOWNLOAD: NVIDIA GeForce 388.71 WHQL

NVIDIA Announces TITAN V "Volta" Graphics Card

NVIDIA, in a shock move, announced its new flagship graphics card, the TITAN V. This card implements the "Volta" GV100 graphics processor, the same one which drives the company's Tesla V100 HPC accelerator. The GV100 is a multi-chip module, with the GPU die and three HBM2 memory stacks sharing a package. The card features 12 GB of HBM2 memory across a 3072-bit wide memory interface. The GPU die has been built on the 12 nm FinFET+ process by TSMC. NVIDIA TITAN V maxes out the GV100 silicon, if not its memory interface, featuring a whopping 5,120 CUDA cores and 640 Tensor cores (specialized units that accelerate neural-net building/training). The CUDA cores are spread across 80 streaming multiprocessors (64 CUDA cores per SM), which are in turn spread across 6 graphics processing clusters (GPCs). The TMU count is 320.

The GPU core is clocked at 1200 MHz, with a GPU Boost frequency of 1455 MHz, and an HBM2 memory clock of 850 MHz, translating into 652.8 GB/s memory bandwidth (1.70 Gbps stacks). The card draws power from a combination of 6-pin and 8-pin PCIe power connectors. Display outputs include three DP connectors and one HDMI connector. With a wallet-scorching price of USD $2,999, and available exclusively through the NVIDIA store, the TITAN V is evidence that with Intel deciding to sell client-segment processors for $2,000, it was a matter of time before GPU makers sought out that price-band. At $3k, the GV100's margins are probably more than made up for.
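The quoted bandwidth follows from the bus width and the per-pin data rate. Here is a minimal sketch of that arithmetic, assuming the 850 MHz memory clock is double-pumped (which is what yields the 1.70 Gbps figure); the function and variable names are illustrative.

```python
def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: bus pins x per-pin rate in Gbit/s, over 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# TITAN V figures from the article: 3072-bit bus, 850 MHz HBM2 clock.
memory_clock_ghz = 0.850
data_rate_gbps = 2 * memory_clock_ghz  # assumed double data rate -> 1.70 Gbps per pin
titan_v_bandwidth = memory_bandwidth_gbs(3072, data_rate_gbps)
print(f"TITAN V: {titan_v_bandwidth:.1f} GB/s")  # prints "TITAN V: 652.8 GB/s"
```

The same function reproduces other figures on this page, for example a 384-bit bus at 10 Gbps giving 480 GB/s.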

NVIDIA Announces SaturnV AI Supercomputer Powered by "Volta"

NVIDIA at the Supercomputing 2017 conference announced a major upgrade of its new SaturnV AI supercomputer, which, when complete, the company claims will be not just one of the world's top-10 AI supercomputers in terms of raw compute power, but also the world's most energy-efficient. The SaturnV will be a cluster supercomputer with 660 NVIDIA DGX-1 nodes. Each such node packs eight NVIDIA GV100 GPUs, which takes the machine's total GPU count to a staggering 5,280 (that's GPUs, not CUDA cores). They add up to an FP16 performance that's scraping the ExaFLOP (1,000-petaFLOP or 10^18 FLOP/s) barrier, while its FP64 (double-precision) compute performance nears 40 petaFLOP/s (40,000 TFLOP/s).
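For a sense of scale, the cluster totals can be divided back down to a per-GPU share. This is a rough sketch that treats the article's "nears" and "scraping" figures as if they were reached exactly, so the per-GPU values are upper bounds implied by the text, not official specifications; the variable names are illustrative.

```python
NODES = 660
GPUS_PER_NODE = 8
total_gpus = NODES * GPUS_PER_NODE

# Cluster-level figures as quoted, taken at face value (assumption).
fp64_total_flops = 40e15  # "nears 40 petaFLOP/s"
fp16_total_flops = 1e18   # "scraping the ExaFLOP barrier"

fp64_per_gpu_tflops = fp64_total_flops / total_gpus / 1e12
fp16_per_gpu_tflops = fp16_total_flops / total_gpus / 1e12

print(f"GPUs: {total_gpus}")
print(f"Implied FP64 per GPU: {fp64_per_gpu_tflops:.2f} TFLOP/s")
print(f"Implied FP16 per GPU: {fp16_per_gpu_tflops:.1f} TFLOP/s")
```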

SaturnV should beat Summit, a supercomputer being co-developed by NVIDIA and IBM, which in turn should unseat Sunway TaihuLight, currently the world's fastest supercomputer. This feat gains prominence as NVIDIA SaturnV and NVIDIA+IBM Summit are both machines built by the American private-sector, which are trying to beat a supercomputing leader backed by the mighty Chinese exchequer. The other claim to fame of SaturnV is its energy-efficiency. Before its upgrade, SaturnV achieved an energy-efficiency of a staggering 15.1 GFLOP/s per Watt, which was already the fourth "greenest." NVIDIA expects the upgraded SaturnV to take the number-one spot.

25+ Companies Developing Level 5 Robotaxis on NVIDIA CUDA GPUs

NVIDIA today unveiled the world's first artificial intelligence computer designed to drive fully autonomous robotaxis. The new system, codenamed Pegasus, extends the NVIDIA DRIVE PX AI computing platform to handle Level 5 driverless vehicles. NVIDIA DRIVE PX Pegasus delivers over 320 trillion operations per second -- more than 10x the performance of its predecessor, NVIDIA DRIVE PX 2.

NVIDIA DRIVE PX Pegasus will help make possible a new class of vehicles that can operate without a driver -- fully autonomous vehicles without steering wheels, pedals or mirrors, and interiors that feel like a living room or office. They will arrive on demand to safely whisk passengers to their destinations, bringing mobility to everyone, including the elderly and disabled.

NVIDIA Announces the Tesla V100 PCI-Express HPC Accelerator

NVIDIA formally announced the PCI-Express add-on card version of its flagship Tesla V100 HPC accelerator, based on its next-generation "Volta" GPU architecture. Based on the advanced 12 nm "GV100" silicon, the GPU is a multi-chip module with a silicon substrate and four HBM2 memory stacks. It features a total of 5,120 CUDA cores, 640 Tensor cores (specialized CUDA cores which accelerate neural-net building), GPU clock speeds of around 1370 MHz, and a 4096-bit wide HBM2 memory interface, with 900 GB/s memory bandwidth. The 815 mm² GPU has a gargantuan transistor-count of 21 billion. NVIDIA is taking institutional orders for the V100 PCIe, and the card will be available a little later this year. HPE will develop three HPC rigs with the cards pre-installed.

NVIDIA Announces Its Volta-based Tesla V100

Today at its GTC keynote, NVIDIA CEO Jensen Huang took the wraps off some of the features of the company's upcoming V100 accelerator, the Volta-based accelerator for the professional market that will likely pave the way to the company's next-generation 2000 series GeForce graphics cards. If NVIDIA goes on with its product carvings and naming scheme for the next-generation Volta architecture, we can expect to see this processor on the company's next-generation GTX 2080 Ti. Running the nitty-gritty details (like the new Tensor processing approach) on this piece would be impossible, but there are some things we know already from this presentation.

This chip is a beast of a processor: it packs 21 billion transistors (up from 15.3 billion found on the P100); it's built on TSMC's 12 nm FF process (evolving from Pascal's 16 nm FF); and measures a staggering 815 mm² (up from the P100's 610 mm²). This is such a considerable leap in die-area that we can only speculate on how yields will be for this monstrous chip, especially considering the novelty of the 12 nm process that it's going to leverage. But now, the most interesting details from a gaming perspective are the 5,120 CUDA cores powering the V100 out of a total possible 5,376 in the whole chip design, which NVIDIA will likely leave for their Titan Xv. These are divided into 84 Volta Streaming Multiprocessor Units, each carrying 64 CUDA cores (84 x 64 = 5,376, from which NVIDIA is cutting 4 Volta Streaming Multiprocessor Units, most likely for yields, which accounts for the announced 5,120). Even in this cut-down configuration, we're looking at a staggering 42% higher pure CUDA core-count than the P100's. The new V100 will offer up to 15 FP32 TFLOPS, and will still leverage a 16 GB HBM2 implementation delivering up to 900 GB/s bandwidth (up from the P100's 721 GB/s). No details on clock speed or TDP as of yet, but we already have enough details to enable a lengthy discussion... Wouldn't you agree?
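The core-count arithmetic in this post is easy to check. A minimal sketch follows, assuming the P100's 3,584 CUDA cores (the figure quoted for the Tesla P100 elsewhere on this page); the variable names are illustrative.

```python
CORES_PER_SM = 64
FULL_SMS = 84      # SMs in the full GV100 design
DISABLED_SMS = 4   # SMs cut from the V100, per the article

full_cores = FULL_SMS * CORES_PER_SM
enabled_cores = (FULL_SMS - DISABLED_SMS) * CORES_PER_SM

P100_CORES = 3584  # assumption: P100 core count as quoted elsewhere on this page
increase_pct = (enabled_cores / P100_CORES - 1) * 100

print(f"Full design: {full_cores} cores")
print(f"V100 as announced: {enabled_cores} cores")
print(f"Increase over P100: {increase_pct:.1f}%")
```

The increase works out to just under 43 percent, which the post states as 42%.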

NVIDIA Announces the TITAN Xp - Faster Than GTX 1080 Ti

NVIDIA GeForce GTX 1080 Ti cannibalized the TITAN X Pascal, and the company needed something faster to sell at USD $1,200. Without making much noise about it, the company launched the new TITAN Xp, and with it, discontinued the TITAN X Pascal. The new TITAN Xp features all 3,840 CUDA cores physically present on the "GP102" silicon, all 240 TMUs, all 96 ROPs, and 12 GB of faster 11.4 Gbps GDDR5X memory over the chip's full 384-bit wide memory interface.

Compare these to the 3,584 CUDA cores, 224 TMUs, 96 ROPs, and 10 Gbps GDDR5X memory of the TITAN X Pascal, and 3,584 CUDA cores, 224 TMUs, 88 ROPs, and 11 GB of 11 Gbps GDDR5X memory across a 352-bit memory bus, of the GTX 1080 Ti. The GPU Boost frequency is 1582 MHz. Here's the catch - the new TITAN Xp will be sold exclusively through GeForce.com, which means it will be available in very select markets where NVIDIA's online store has a presence.

NVIDIA Unveils New Line of Quadro Pascal GPUs

NVIDIA today introduced a range of Quadro products, all based on its Pascal architecture, that transform desktop workstations into supercomputers with breakthrough capabilities for professional workflows across many industries. Workflows in design, engineering and other areas are evolving rapidly to meet the exponential growth in data size and complexity that comes with photorealism, virtual reality and deep learning technologies. To tap into these opportunities, the new NVIDIA Quadro Pascal-based lineup provides an enterprise-grade visual computing platform that streamlines design and simulation workflows with up to twice the performance of the previous generation, and ultra-fast memory.

"Professional workflows are now infused with artificial intelligence, virtual reality and photorealism, creating new challenges for our most demanding users," said Bob Pette, vice president of Professional Visualization at NVIDIA. "Our new Quadro lineup provides the graphics and compute performance required to address these challenges. And, by unifying compute and design, the Quadro GP100 transforms the average desktop workstation with the power of a supercomputer."

AMD Radeon Technology Will Be Available on Google Cloud Platform in 2017

At SC16, AMD announced that Radeon GPU technology will be available to Google Cloud Platform users worldwide. Starting in 2017, Google will use AMD's fastest available single-precision dual GPU compute accelerators, Radeon-based AMD FirePro S9300 x2 Server GPUs, to help accelerate Google Compute Engine and Google Cloud Machine Learning services. AMD FirePro S9300 x2 GPUs can handle highly parallel calculations, including complex medical and financial simulations, seismic and subsurface exploration, machine learning, video rendering and transcoding, and scientific analysis. Google Cloud Platform will make the AMD GPU resources available for all their users around the world.

"Graphics processors represent the best combination of performance and programmability for existing and emerging big data applications," said Raja Koduri, senior vice president and chief architect, Radeon Technologies Group, AMD. "The adoption of AMD GPU technology in Google Cloud Platform is a validation of the progress AMD has made in GPU hardware and our Radeon Open Compute Platform, which is the only fully open source hyperscale GPU compute platform in the world today. We expect that our momentum in GPU computing will continue to accelerate with future hardware and software releases and advances in the ecosystem of middleware and libraries."

NVIDIA Announces Xavier, Volta-based Autonomous Transportation SoC

At its inaugural European edition of the GPU Technology Conference (GTC), NVIDIA announced Xavier, an "AI supercomputer for the future of autonomous transportation." Xavier is an evolution of the Drive PX2 board, which leverages a pair of "Maxwell" GPUs with some custom logic and an ARM CPU to provide cars with the compute power necessary to deep-learn the surroundings and self-drive, or assist-drive. Xavier is a refinement over Drive PX2 in that it merges three chips - two GPUs and one control logic chip - into an SoC.

You'd think that NVIDIA refined its deep-learning tech enough to not need a pair of "Maxwell" SoCs, but Xavier is more than that. The 7 billion-transistor chip, built on a 16 nm FinFET process, offers more raw compute performance thanks to leveraging NVIDIA's next-generation "Volta" architecture, one more advanced than even its current "Pascal" architecture. The chip features a "Volta" GPU with 512 CUDA cores. The CVA makes up the vehicle I/O, while an image processor that's capable of 8K HDR video streams feeds the chip with visual inputs from various cameras around the vehicle. An 8-core ARM CPU performs general-purpose compute. NVIDIA hopes to get the first engineering samples of Xavier out to interested car-makers by Q4-2017.

NVIDIA Launches Maxed-out GP102 Based Quadro P6000

Late last week, NVIDIA announced the TITAN X Pascal, its fastest consumer graphics offering targeted at gamers and PC enthusiasts. The reign of TITAN X Pascal being the fastest single-GPU graphics card could be short-lived, as NVIDIA announced a Quadro product based on the same "GP102" silicon, which maxes out its on-die resources. The new Quadro P6000, announced at SIGGRAPH alongside the GP104-based Quadro P5000, features all 3,840 CUDA cores physically present on the chip.

Besides 3,840 CUDA cores, the P6000 features a maximum FP32 (single-precision floating point) performance of up to 12 TFLOP/s. The card also features 24 GB of GDDR5X memory, across the chip's 384-bit wide memory interface. The Quadro P5000, on the other hand, features 2,560 CUDA cores, up to 8.9 TFLOP/s FP32 performance, and 16 GB of GDDR5X memory across a 256-bit wide memory interface. It's interesting to note that neither card features full FP64 (double-precision) machinery, and that is cleverly relegated to NVIDIA's HPC product line, the Tesla P-series.

NVIDIA Announces the GeForce GTX TITAN X Pascal

In a show of shock and awe, NVIDIA today announced its flagship graphics card based on the "Pascal" architecture, the GeForce GTX TITAN X Pascal. Market availability of the card is scheduled for August 2, 2016, priced at US $1,199. Based on the 16 nm "GP102" silicon, this graphics card is endowed with 3,584 CUDA cores spread across 56 streaming multiprocessors, 224 TMUs, 96 ROPs, and a 384-bit GDDR5X memory interface, holding 12 GB of memory.

The core is clocked at 1417 MHz, with 1531 MHz GPU Boost, and 10 Gbps memory, churning out 480 GB/s of memory bandwidth. The card draws power from a combination of 6-pin and 8-pin PCIe power connectors; the GPU's TDP is rated at 250W. NVIDIA claims that the GTX TITAN X Pascal is up to 60 percent faster than the GTX TITAN X (Maxwell), and up to 3 times faster than the original GeForce GTX TITAN.

NVIDIA Announces the GeForce GTX 1060, 6 GB GDDR5, $249

NVIDIA today announced its third desktop consumer graphics card based on the "Pascal" architecture, the GeForce GTX 1060. NVIDIA aims to strike a price-performance sweet spot by pricing this card aggressively at US $249 (MSRP), with its reference "Founders Edition" variant priced at $299. To make sure two of these cards at $500 don't cannibalize the $599-699 GTX 1080, NVIDIA didn't even give this card 2-way SLI support. Retail availability of the cards will commence from 19th July, 2016. NVIDIA claims that the GTX 1060 performs on-par with the GeForce GTX 980 from the previous generation.

The GeForce GTX 1060 is based on the new 16 nm "GP106" silicon, the company's third ASIC based on this architecture after GP100 and GP104. It features 1,280 CUDA cores spread across ten streaming multiprocessors, 80 TMUs, 48 ROPs, and a 192-bit wide GDDR5 memory interface, holding 6 GB of memory. The card draws power from a single 6-pin PCIe power connector, as the GPU's TDP is rated at just 120W. The core is clocked up to 1.70 GHz, and the memory at 8 Gbps, at which it belts out 192 GB/s of memory bandwidth. Display outputs include three DisplayPort 1.4 connectors, one HDMI 2.0b, and a DVI.

NVIDIA to Unveil GeForce GTX TITAN P at Gamescom

NVIDIA is preparing to launch its flagship graphics card based on the "Pascal" architecture, the so-called GeForce GTX TITAN P, at the 2016 Gamescom, held in Cologne, Germany, from 17 to 21 August. The card is expected to be based on the GP100 silicon, and could likely come in two variants - 16 GB and 12 GB. The two differ by memory bus width besides memory size. The 16 GB variant could feature four HBM2 stacks over a 4096-bit memory bus, while the 12 GB variant could feature three HBM2 stacks, and a 3072-bit bus. This approach by NVIDIA is identical to the way it carved out Tesla P100-based PCIe accelerators, based on this ASIC. The cards' TDP could be rated between 300W and 375W, drawing power from two 8-pin PCIe power connectors.

The GP100-based GTX TITAN P isn't the only high-end graphics card lineup targeted at gamers and PC enthusiasts. NVIDIA is also working on the GP102 silicon, positioned between the GP104 and the GP100. This chip could lack the FP64 CUDA cores found on the GP100 silicon, and feature up to 3,840 CUDA cores of the same kind found on the GP104. The GP102 is also expected to feature simpler 384-bit GDDR5X memory. NVIDIA could base the GTX 1080 Ti on this chip.

NVIDIA Announces a PCI-Express Variant of its Tesla P100 HPC Accelerator

NVIDIA announced a PCI-Express add-on card variant of its Tesla P100 HPC accelerator, at the 2016 International Supercomputing Conference, held in Frankfurt, Germany. The card is about 30 cm long, 2-slot thick, and of standard height, and is designed for PCIe multi-slot servers. The company had introduced the Tesla P100 earlier this year in April, with a dense mezzanine form-factor variant for servers with NVLink.

The PCIe variant of the P100 offers slightly lower performance than the NVLink variant, because of lower clock speeds, although the core-configuration of the GP100 silicon remains unchanged. It offers FP64 (double-precision floating-point) performance of 4.70 TFLOP/s, FP32 (single-precision) performance of 9.30 TFLOP/s, and FP16 performance of 18.7 TFLOP/s, compared to the NVLink variant's 5.3 TFLOP/s, 10.6 TFLOP/s, and 21 TFLOP/s, respectively. The card comes in two sub-variants based on memory: a 16 GB variant with 720 GB/s memory bandwidth and 4 MB L3 cache, and a 12 GB variant with 548 GB/s and 3 MB L3 cache. Both sub-variants feature 3,584 CUDA cores based on the "Pascal" architecture, and a core clock speed of 1300 MHz.
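The quoted throughput figures line up with the usual peak-FLOPS estimate of cores times clock times operations per clock. The sketch below assumes one fused multiply-add (two floating-point operations) per CUDA core per clock for FP32, and assumes FP64 runs at half and FP16 at double that rate, which is the 1:2:4 pattern the numbers above suggest; the function name is illustrative.

```python
def peak_tflops(cuda_cores, clock_mhz, flops_per_core_per_clock=2.0):
    """Theoretical peak in TFLOP/s: cores x clock x FLOPs per core per clock."""
    return cuda_cores * clock_mhz * 1e6 * flops_per_core_per_clock / 1e12


CORES, CLOCK_MHZ = 3584, 1300  # Tesla P100 PCIe figures from the article

fp32 = peak_tflops(CORES, CLOCK_MHZ)                                # one FMA per clock
fp64 = peak_tflops(CORES, CLOCK_MHZ, flops_per_core_per_clock=1.0)  # assumed half rate
fp16 = peak_tflops(CORES, CLOCK_MHZ, flops_per_core_per_clock=4.0)  # assumed double rate

print(f"FP64 {fp64:.2f}  FP32 {fp32:.2f}  FP16 {fp16:.2f} TFLOP/s")
```

The estimates land within rounding distance of the 4.70, 9.30, and 18.7 TFLOP/s quoted for the PCIe card.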

NVIDIA's Next Flagship Graphics Cards will be the GeForce X80 Series

With the GeForce GTX 900 series, NVIDIA has exhausted its GeForce GTX nomenclature, according to a sensational scoop from the rumor mill. Instead of going with the GTX 1000 series that has one digit too many, the company is turning the page on the GeForce GTX brand altogether. The company's next-generation high-end graphics card series will be the GeForce X80 series. Based on the performance-segment "GP104" and high-end "GP100" chips, the GeForce X80 series will consist of the performance-segment GeForce X80, the high-end GeForce X80 Ti, and the enthusiast-segment GeForce X80 TITAN.

Based on the "Pascal" architecture, the GP104 silicon is expected to feature as many as 4,096 CUDA cores. It will also feature 256 TMUs, 128 ROPs, and a GDDR5X memory interface, with 384 GB/s memory bandwidth. 6 GB could be the standard memory amount. Its texture- and pixel-fillrates are rated to be 33% higher than those of the GM200-based GeForce GTX TITAN X. The GP104 chip will be built on the 16 nm FinFET process. The TDP of this chip is rated at 175W.

NVIDIA Coming Around to Vulkan Support

NVIDIA is preparing to add support for Vulkan, the upcoming 3D graphics API by Khronos, and successor to OpenGL, to its feature-set. The company's upcoming GeForce 358.66 series driver will introduce support for Vulkan. What makes matters particularly interesting is the API itself. Vulkan is heavily based on AMD's Mantle API, which the company gracefully retired in favor of DirectX 12, and committed its code to Khronos. The 358 series drivers also reportedly feature function declarations in their CUDA code for upcoming NVIDIA GPU architectures, such as Pascal and Volta.

BIOSTAR Announces Gaming H170T Mainboard and GeForce GTX 980 Ti Graphics Card

BIOSTAR's latest gaming products cover both the mainboard and VGA card spaces with two products that offer a sweet spot for performance versus investment in hardware. The motherboard is the Gaming H170T, which is the best-valued motherboard of the Gaming Skylake platform (other brands are at least $169 and up), and the GPU is the Gaming GeForce GTX980Ti, a 6GB GDDR5, 384-bit, full-size PCB, high-end 3D graphics solution, supporting NVIDIA's PhysX and CUDA technology at 4K resolutions.

BIOSTAR's latest mainboard, the Gaming H170T, is based on Intel's H170 chipset, which is a single-chip design that supports Intel 6th generation socket 1151 Intel Core processors. The Gaming H170T comes with Hi-Fi Technology built inside, delivering Blu-Ray audio. Puro Hi-Fi features an integrated independent audio power design with a built-in amplifier. The technology utilizes audio components with an independent power delivery design for a significant reduction in electronic noise, producing superb sound quality. Moreover, the Hi-Fi H170T supports USB 3.0, PCIe M.2 (32Gb/s), SATA Express (16Gb/s) and has DisplayPort for monitor output.

NVIDIA Preparing a dual-GM200 Graphics Card

If it could make a dual-GK110 graphics card, the forgettable $2,999 GTX TITAN-Z, it's only conceivable that NVIDIA could launch one based on its newer and slightly more energy-efficient GM200 chips. According to a WCCFTech report, the company is doing just that. The dual-GPU GM200 graphics card could bear the company's coveted "GTX TITAN" branding, and could be a doubling of the GTX TITAN X, with twice as many CUDA cores (6,144 in all), TMUs (384 in all), ROPs (192 in all), and memory (24 GB in all), spread across two GPU systems, in an SLI-on-a-stick solution. It remains to be seen if NVIDIA gets the pricing wrong the second time.

NVIDIA Doubles Performance for Deep Learning Training

NVIDIA today announced updates to its GPU-accelerated deep learning software that will double deep learning training performance. The new software will empower data scientists and researchers to supercharge their deep learning projects and product development work by creating more accurate neural networks through faster model training and more sophisticated model design.

The NVIDIA DIGITS Deep Learning GPU Training System version 2 (DIGITS 2) and NVIDIA CUDA Deep Neural Network library version 3 (cuDNN 3) provide significant performance enhancements and new capabilities. For data scientists, DIGITS 2 now delivers automatic scaling of neural network training across multiple high-performance GPUs. This can double the speed of deep neural network training for image classification compared to a single GPU.