News Posts matching #CUDA


NVIDIA and Atos Team Up to Build World's Fastest AI Supercomputer

NVIDIA today announced that the Italian inter-university consortium CINECA—one of the world's most important supercomputing centers—will use the company's accelerated computing platform to build the world's fastest AI supercomputer.

The new "Leonardo" system, built with Atos, is expected to deliver 10 exaflops of FP16 AI performance to enable advanced converged AI and HPC application use cases. Featuring nearly 14,000 NVIDIA Ampere architecture-based GPUs and NVIDIA Mellanox HDR 200 Gb/s InfiniBand networking, Leonardo will position Italy as a global leader in AI and high-performance computing research and innovation.

NVIDIA Unveils RTX A6000 "Ampere" Professional Graphics Card and A40 vGPU

NVIDIA today unveiled its RTX A6000 professional graphics card, the first professional visualization-segment product based on its "Ampere" graphics architecture. With this, the company appears to be deviating from the Quadro brand for the graphics card, while several software-side features retain the brand. The card is based on the same 8 nm "GA102" silicon as the GeForce RTX 3080, but configured differently. For starters, it gets a mammoth 48 GB of GDDR6 memory across the chip's 384-bit wide memory interface, along with ECC support.

The company did not reveal the GPU's CUDA core count, but mentioned that the card's typical board power is 300 W. The card also gets NVLink support, letting you pair up to two A6000 cards for explicit multi-GPU. It also supports GPU virtualization, including NVIDIA GRID, NVIDIA Quadro Virtual Data Center Workstation, and NVIDIA Virtual Compute Server. The card features a conventional lateral blower-type cooling solution, and its most fascinating aspect is its power input configuration, with just the one 8-pin EPS power input. We will update this story with more information as it trickles out.
Update 13:37 UTC: The company also unveiled the A40, a headless professional-visualization graphics card dedicated to virtual-GPU/cloud-GPU applications (deployments at scale in data centers). The card has similar specs to the RTX A6000.

Update 13:42 UTC: The NVIDIA website says that both the A40 and RTX A6000 use a 4+4 pin EPS connector (and not an 8-pin PCIe) for power input. An 8-pin EPS connector is capable of delivering up to 336 W (4x 7 A @ 12 V).
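The 336 W figure quoted above falls out of simple connector arithmetic. A minimal sketch, assuming the common rating of 7 A per pin across the four +12 V pins of an 8-pin EPS connector (the PCIe figures are similarly back-of-the-envelope, not from the article):

```python
# Back-of-the-envelope check of connector power limits.
# Assumption: an 8-pin EPS connector carries +12 V on 4 of its 8 pins,
# each pin rated at 7 A, as stated in the article.
def connector_power_w(power_pins: int, amps_per_pin: float, volts: float = 12.0) -> float:
    """Maximum power deliverable through a connector's power-carrying pins."""
    return power_pins * amps_per_pin * volts

eps_8pin = connector_power_w(power_pins=4, amps_per_pin=7.0)   # 8-pin EPS
pcie_8pin = connector_power_w(power_pins=3, amps_per_pin=4.2)  # ~150 W 8-pin PCIe (illustrative pin rating)

print(eps_8pin)  # 336.0 W, matching the article's figure
```

This is why a single EPS input can replace the usual pair of 8-pin PCIe connectors on a 300 W board.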

Folding @ Home Bakes in NVIDIA CUDA Support for Increased Performance

GPU folders make up a huge fraction of the number-crunching power of Folding@home, enabling us to help projects like the COVID Moonshot (@covid_moonshot), an open-science drug discovery effort, evaluate thousands of molecules per week and synthesize hundreds of them in the quest to develop a low-cost, patent-free drug for COVID-19 that could be taken as a simple 2x/day pill.

As of today, your folding GPUs just got a big powerup! Thanks to NVIDIA engineers, our Folding@home GPU cores—based on the open source OpenMM toolkit—are now CUDA-enabled, allowing you to run GPU projects significantly faster. Typical GPUs will see 15-30% speedups on most Folding@home projects, drastically increasing both science throughput and points per day (PPD) these GPUs will generate.

Editor's Note: TechPowerUp features a strong community surrounding the Folding@home project. Remember to fold for the TPU team, if you so wish: we're currently #44 in the world, but have plans for complete world domination. You just have to input 50711 as your team ID. This is a way to donate effort towards curing various diseases affecting humanity, at the cost of a few clicks and the power these computations consume.

NVIDIA RTX IO Detailed: GPU-assisted Storage Stack Here to Stay Until CPU Core-counts Rise

NVIDIA at its GeForce "Ampere" launch event announced the RTX IO technology. Storage is the weakest link in a modern computer from a performance standpoint, and SSDs have had a transformational impact. With modern SSDs leveraging PCIe, consumer storage speeds are now bound to grow with each new PCIe generation doubling per-lane IO bandwidth. PCI-Express Gen 4 enables 64 Gbps of bandwidth per direction on M.2 NVMe SSDs; AMD has already implemented it across its Ryzen desktop platform, and Intel has it on its latest mobile platforms and is expected to bring it to its desktop platform with "Rocket Lake." While more storage bandwidth is always welcome, the storage processing stack (the task of processing ones and zeroes down to the physical layer) is still handled by the CPU. With the rise in storage bandwidth, the IO load on the CPU rises proportionally, to a point where it can begin to impact performance. Microsoft sought to address this emerging challenge with the DirectStorage API, but NVIDIA wants to build on this.
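The 64 Gbps figure above is just the raw per-lane signaling rate times the four lanes of an M.2 slot (ignoring 128b/130b encoding overhead). A quick sketch of the doubling per generation:

```python
# Raw per-direction PCIe link bandwidth for an M.2 slot (x4 lanes).
# Encoding overhead (128b/130b on Gen 3+) is deliberately ignored here.
def raw_bandwidth_gbps(generation: int, lanes: int = 4) -> float:
    """Raw per-direction bandwidth in Gbps; rate per lane doubles each generation."""
    per_lane_gtps = {3: 8.0, 4: 16.0, 5: 32.0}[generation]
    return per_lane_gtps * lanes

print(raw_bandwidth_gbps(4))  # 64.0 Gbps, the Gen 4 x4 figure in the article
print(raw_bandwidth_gbps(3))  # 32.0 Gbps for Gen 3 x4
```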

According to tests by NVIDIA, reading uncompressed data from an SSD at 7 GB/s (the typical maximum sequential read speed of client-segment PCIe Gen 4 M.2 NVMe SSDs) requires the full utilization of two CPU cores. The OS typically spreads this workload across all available CPU cores/threads on a modern multi-core CPU. Things change dramatically when compressed data (such as game resources) is read in a gaming scenario with a high number of IO requests. Modern AAA games have hundreds of thousands of individual resources crammed into compressed resource-pack files.
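A simplified linear model makes NVIDIA's argument concrete: if uncompressed reads at 7 GB/s occupy two cores, the CPU cost scales with throughput, and decompression multiplies it further. The decompression multiplier below is purely illustrative, not a figure from NVIDIA:

```python
# Toy model of the CPU cost of the storage stack described above.
# Anchor point from NVIDIA's test: 7 GB/s of uncompressed reads = 2 full cores.
def cores_needed(throughput_gbs: float, decompression_factor: float = 1.0) -> float:
    """Estimated CPU cores consumed by storage IO at a given throughput.

    decompression_factor > 1 models the extra work of inflating compressed
    game assets (illustrative assumption, not an NVIDIA number)."""
    CORES_PER_7_GBS = 2.0
    return throughput_gbs / 7.0 * CORES_PER_7_GBS * decompression_factor

print(cores_needed(7.0))   # 2.0 cores, matching NVIDIA's uncompressed figure
print(cores_needed(14.0))  # 4.0 - doubling bandwidth doubles the IO load
```

The point of RTX IO is to move this work onto the GPU so the core count stays flat as SSDs get faster.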

NVIDIA GeForce RTX 3090 and 3080 Specifications Leaked

Just ahead of the September launch, specifications of NVIDIA's upcoming RTX Ampere lineup have been leaked by industry sources over at VideoCardz. According to the website, three alleged GeForce SKUs are being launched in September - RTX 3090, RTX 3080, and RTX 3070. The new lineup features major improvements: 2nd generation ray-tracing cores and 3rd generation tensor cores made for AI and ML. When it comes to connectivity and I/O, the new cards use the PCIe 4.0 interface and have support for the latest display outputs like HDMI 2.1 and DisplayPort 1.4a.

The GeForce RTX 3090 comes with 24 GB of GDDR6X memory running on a 384-bit bus at 19.5 Gbps. This gives a memory bandwidth of 936 GB/s. The card features the GA102-300 GPU with 5,248 CUDA cores running at 1695 MHz, and is rated for 350 W TGP (board power). While the Founders Edition cards will use NVIDIA's new 12-pin power connector, non-Founders Edition cards, from board partners like ASUS, MSI and Gigabyte, will be powered by two 8-pin connectors. Next up are the specs for the GeForce RTX 3080, a GA102-200 based card that has 4,352 CUDA cores running at 1710 MHz, paired with 10 GB of GDDR6X memory running at 19 Gbps. The memory is connected via a 320-bit bus that achieves 760 GB/s of bandwidth. The board is rated at 320 W and the card is designed to be powered by dual 8-pin connectors. And finally, there is the GeForce RTX 3070, which is built around the GA104-300 GPU with a yet-unknown number of CUDA cores. We only know that it has the older non-X GDDR6 memory that runs at 16 Gbps on a 256-bit bus. The GPUs are supposedly manufactured on TSMC's 7 nm process, possibly the EUV variant.
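The bandwidth figures above follow directly from bus width and per-pin data rate: bandwidth in GB/s equals (bus width in bits / 8) times the data rate in Gbps. A quick check against the leaked numbers:

```python
# Memory bandwidth from bus width and effective per-pin data rate.
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """GB/s = (bus width / 8 bits per byte) * per-pin rate in Gbps."""
    return bus_width_bits / 8 * data_rate_gbps

rtx_3090 = memory_bandwidth_gbs(384, 19.5)  # leaked RTX 3090 configuration
rtx_3080 = memory_bandwidth_gbs(320, 19.0)  # leaked RTX 3080 configuration

print(rtx_3090, rtx_3080)  # 936.0 760.0 - both match the figures in the leak
```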

NVIDIA Announces GTC 2020 Keynote to be Held on October 5-9

NVIDIA today announced that it will be hosting another GTC keynote in the coming month of October. To be held between October 5th and October 9th, the newly announced event will bring updates to NVIDIA's products and technologies, as well as provide an opportunity for numerous computer science companies and individuals to take center stage to discuss new and upcoming technologies. More than 500 sessions will form the backbone of GTC, with seven separate programming streams running across North America, Europe, Israel, India, Taiwan, Japan and Korea - each with access to live demos, specialized content, local startups and sponsors.

This GTC keynote follows the May 2020 keynote where the world was introduced to NVIDIA's Ampere-based GA100 accelerator. A gaming and consumer-oriented event is also taking place on September 1st, with expectations set high for NVIDIA's next generation of consumer graphics products. Although if recent rumors of a $2,000 RTX 3090 graphics card are anything to go by, it won't just be expectations that are soaring by then.

Dynics Announces AI-enabled Vision System Powered by NVIDIA T4 Tensor Core GPU

Dynics, Inc., a U.S.-based manufacturer of industrial-grade computer hardware, visualization software, network security, network monitoring and software-defined networking solutions, today announced the XiT4 Inference Server, which helps industrial manufacturing companies increase their yield and provide more consistent manufacturing quality.

Artificial intelligence (AI) is increasingly being integrated into modern manufacturing to improve and automate processes, including 3D vision applications. The XiT4 Inference Server, powered by NVIDIA T4 Tensor Core GPUs, is a fanless hardware platform for AI, machine learning and 3D vision applications. AI technology is allowing manufacturers to increase the efficiency and throughput of their production, while also providing more consistent quality due to higher accuracy and repeatability. Additional benefits are fewer false negatives (test escapes) and fewer false positives, which reduce downstream re-inspection needs, all leading to lower manufacturing costs.

GALAX Designs a GeForce GTX 1650 "Ultra" with TU106 Silicon

It is becoming more common for NVIDIA board partners to carve GeForce RTX 20-series and GTX 16-series SKUs out of ASICs they weren't originally based on, but GALAX has taken things a step further. The company just launched a GeForce GTX 1650 (GDDR6) graphics card based on the "TU106" silicon (ASIC code: TU106-125-A1). The company carved a GTX 1650 out of this chip by disabling all of its RT cores, all of its tensor cores, and a whopping 61% of its CUDA cores, along with proportionate reductions in TMU and ROP counts. The memory bus width has been halved from 256-bit down to 128-bit.
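The 61% figure checks out, assuming the full TU106 die's 2,304 CUDA cores (as on the RTX 2070) against the GTX 1650's 896:

```python
# Verifying the "61% of CUDA cores disabled" claim.
# Assumption: a full TU106 die carries 2,304 CUDA cores (RTX 2070 configuration).
FULL_TU106_CORES = 2304
GTX_1650_CORES = 896

disabled_fraction = 1 - GTX_1650_CORES / FULL_TU106_CORES
print(f"{disabled_fraction:.0%}")  # 61%
```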

The card, however, is only listed by the Chinese regional arm of GALAX. The card's marketing name is "GALAX GeForce GTX 1650 Ultra," with "Ultra" being a GALAX brand extension, and not an NVIDIA SKU (i.e. the GPU isn't called "GTX 1650 Ultra"). The GPU clock speeds for this card are identical to those of the original GTX 1650 that's based on TU117 - 1410 MHz base, 1590 MHz GPU Boost, and 12 Gbps (GDDR6-effective) memory.

Aetina Launches New Edge AI Computer Powered by the NVIDIA Jetson

Aetina Corp., a provider of high-performance GPGPU solutions, announced the new AN110-XNX edge AI computer leveraging the powerful capabilities of the NVIDIA Jetson Xavier NX, expanding its range of edge AI systems built on the Jetson platform for applications in smart transportation, factories, retail, healthcare, AIoT, robotics, and more.

The AN110-XNX combines the NVIDIA Jetson Xavier NX and Aetina AN110 carrier board in a compact form factor of 87.4 x 68.2 x 52 mm (with fan). The AN110-XNX supports the MIPI CSI-2 interface for 1x 4K or 2x FHD cameras to handle intensive AI workloads from ultra-high-resolution cameras for more accurate image analysis. It is as small as Aetina's AN110-NAO based on the NVIDIA Jetson Nano platform, but delivers more powerful AI computing via the new Jetson Xavier NX. With 384 CUDA cores, 48 Tensor Cores, and cloud-native capability, the Jetson Xavier NX delivers up to 21 TOPS and is the ideal platform to accelerate AI applications. Bundled with the latest NVIDIA JetPack 4.4 SDK, the energy-efficient module significantly expands the choices now available for developers and customers looking for embedded edge-computing options that demand increased performance to support AI workloads but are constrained by size, weight, power budget, or cost.

DirectX Coming to Linux...Sort of

Microsoft is preparing to add the DirectX API support to WSL (Windows Subsystem for Linux). The latest Windows Subsystem for Linux 2 will virtualize DirectX to Linux applications running on top of it. WSL is a translation layer for Linux apps to run on top of Windows. Unlike Wine, which attempts to translate Direct3D commands to OpenGL, what Microsoft is proposing is a real DirectX interface for apps in WSL, which can essentially talk to hardware (the host's kernel-mode GPU driver) directly.

To this effect, Microsoft introduced the Linux edition of DXGkrnl, a new kernel-mode driver for Linux that talks to the DXGkrnl driver of the Windows host. With this, Microsoft is promising to expose the full Direct3D 12, DxCore, and DirectML. It will also serve as a conduit for third-party APIs, such as OpenGL, OpenCL, Vulkan, and CUDA. Microsoft expects to ship this feature-packed WSL with WDDM 2.9 (so a future version of Windows 10).

AAEON Unveils AI and Edge Computing Solutions Powered by NVIDIA

AAEON, a leading developer of embedded AI and edge-computing solutions, today announced it is unveiling several new rugged embedded platforms—augmenting an already extensive lineup of AAEON AI edge-computing solutions powered by the NVIDIA Jetson platform. The new AAEON products provide key interfaces needed for edge computing in a small form factor, making it easier to build applications for all levels of users, from makers to more advanced developers for deployments in the field.

AAEON also introduced a new version of the popular BOXER-8120AI, now featuring the Jetson TX2 4 GB module, providing an efficient and cost-effective solution for AI edge computing with 256 CUDA cores delivering processing speeds up to 1.3 TFLOPS.

"Partnering with an AI and edge computing leader like NVIDIA supports our mission to deliver more diversified embedded products and solutions at higher quality standards," said Alex Hsueh, Senior Director of AAEON's System Platform Division. "These new offerings powered by the Jetson platform complement our existing lineup of rugged embedded products, providing an optimal combination of performance and price in a smaller form factor for customers to easily deploy across a full range of applications."

NVIDIA RTX Voice Modded to Work on Non-RTX GeForce GPUs

NVIDIA made headlines with the release of its free RTX Voice software, which gives your communication apps computational noise-cancellation by leveraging AI. The software is very effective at what it does, but requires a GeForce RTX 20-series GPU. PC enthusiast David Lake, over at the Guru3D Forums, disagrees. With fairly easy modifications to its installer payload, Lake was able to remove its system-requirements gate, install it on his machine with a TITAN V graphics card, and find that the software works as intended.

Our first instinct was to point out that the "Volta" based TITAN V features tensor cores, and has hardware AI capabilities, until we found dozens of users across Guru3D forums, Reddit, and Twitter claiming that the mod gets RTX Voice to work on their GTX 16-series, "Pascal," "Maxwell," and even older "Fermi" hardware. So in all likelihood, RTX Voice uses a CUDA-based GPGPU codepath, rather than something fancy leveraging tensor cores. Find instructions on how to mod the RTX Voice installer in the Guru3D Forums thread here.

Three Unknown NVIDIA GPUs GeekBench Compute Score Leaked, Possibly Ampere?

(Update, March 4th: Another NVIDIA graphics card has been discovered in the Geekbench database, this one featuring a total of 124 CUs. This could amount to some 7,936 CUDA cores, should NVIDIA keep the same 64 CUDA cores per CU - though this has changed in the past, as when NVIDIA halved the number of CUDA cores per CU from Pascal to Turing. The 124 CU graphics card is clocked at 1.1 GHz and features 32 GB of HBM2e, delivering a score of 222,377 points in the Geekbench benchmark. We again stress that these can be just engineering samples, with conservative clocks, and that final performance could be even higher).
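The core-count estimates above are simple multiplication; the answer hinges entirely on which cores-per-CU ratio NVIDIA adopts for the new architecture, which was unknown at the time:

```python
# Estimating CUDA cores from the CU count Geekbench reports.
# The ratio is an assumption: 64 per CU is the Turing-era figure the article
# uses; 128 per CU would be a return to the Pascal (GP102/GP104) ratio.
def estimated_cuda_cores(compute_units: int, cores_per_cu: int) -> int:
    return compute_units * cores_per_cu

print(estimated_cuda_cores(124, 64))   # 7936 - the estimate quoted above
print(estimated_cuda_cores(124, 128))  # 15872 - if the Pascal-style ratio returned
```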

NVIDIA is expected to launch its next-generation Ampere lineup of GPUs during the GPU Technology Conference (GTC) event happening from March 22nd to March 26th. Just a few weeks before the release of these new GPUs, a Geekbench 5 compute score measuring the OpenCL performance of unknown GPUs, which we assume are part of the Ampere lineup, has appeared. Thanks to the Twitter user "_rogame" (@_rogame), who obtained a Geekbench database entry, we have some information about the CUDA core configuration, memory, and performance of the upcoming cards.

NVIDIA to Reuse Pascal for Mobility-geared MX300 Series

NVIDIA will apparently still be using Pascal when they launch their next generation of low-power discrete graphics solutions for mobile systems. The MX300 series will replace the current crop of MX200-series parts (segregated into three products in the form of the MX230, the 10 W MX250 and the 25 W MX250). The new MX300 keeps the dual-tiered system, but ups the ante on the top-of-the-line MX350. Even though it's still Pascal, on a 14 nm process, the MX350 should see an increase in CUDA cores to 640 (by using NVIDIA's Pascal GP107 chip) from the MX250's 384. Performance, then, should be comparable to the NVIDIA GTX 1050.

The MX330, on the other hand, will keep the specifications of the MX250, which signals a tier increase from the MX230's 256 execution units to 384. This should translate to appreciable performance increases for the new MX300 series, despite staying on NVIDIA's Pascal architecture. The new lineup is expected to be announced in February.

Rumor: NVIDIA's Next Generation GeForce RTX 3080 and RTX 3070 "Ampere" Graphics Cards Detailed

NVIDIA's next generation of graphics cards, codenamed Ampere, is set to arrive sometime this year, presumably around GTC 2020, which takes place on March 22nd. Before NVIDIA CEO Jensen Huang officially reveals the specifications of these new GPUs, we have the latest round of rumors coming our way. According to VideoCardz, which cites multiple sources, the die configurations of the upcoming GeForce RTX 3070 and RTX 3080 have been detailed. Using the latest 7 nm manufacturing process from Samsung, this generation of NVIDIA GPUs offers a big improvement over the previous one.

For starters, the two dies that have appeared carry the codenames GA103 and GA104, standing for the RTX 3080 and RTX 3070 respectively. Perhaps the biggest surprise is the Streaming Multiprocessor (SM) count. The smaller GA104 die has as many as 48 SMs, resulting in 3072 CUDA cores, while the bigger, oddly named GA103 die has as many as 60 SMs that result in 3840 CUDA cores in total. These improvements in SM count should result in a notable performance increase across the board. Alongside the increase in SM count, there is also a new memory bus width. The smaller GA104 die that should end up in the RTX 3070 uses a 256-bit memory bus allowing for 8/16 GB of GDDR6 memory, while its bigger brother, the GA103, has a 320-bit wide bus that allows the card to be configured with either 10 or 20 GB of GDDR6 memory. In the images below you can check out the alleged diagrams for yourself; as always, it is recommended to take this rumor with a grain of salt.
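The rumored CUDA-core totals follow from Turing's 64 CUDA cores per SM, which this leak implicitly assumes carries over to "Ampere":

```python
# CUDA-core totals implied by the rumored SM counts.
# Assumption: Ampere keeps Turing's 64 CUDA cores per SM.
CORES_PER_SM = 64

ga104_cores = 48 * CORES_PER_SM  # rumored RTX 3070 die
ga103_cores = 60 * CORES_PER_SM  # rumored RTX 3080 die

print(ga104_cores, ga103_cores)  # 3072 3840, matching the leak
```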

NVIDIA Introduces DRIVE AGX Orin Platform

NVIDIA today introduced NVIDIA DRIVE AGX Orin, a highly advanced software-defined platform for autonomous vehicles and robots. The platform is powered by a new system-on-a-chip (SoC) called Orin, which consists of 17 billion transistors and is the result of four years of R&D investment. The Orin SoC integrates NVIDIA's next-generation GPU architecture and Arm Hercules CPU cores, as well as new deep learning and computer vision accelerators that, in aggregate, deliver 200 trillion operations per second—nearly 7x the performance of NVIDIA's previous generation Xavier SoC.

Orin is designed to handle the large number of applications and deep neural networks that run simultaneously in autonomous vehicles and robots, while achieving systematic safety standards such as ISO 26262 ASIL-D. Built as a software-defined platform, DRIVE AGX Orin is developed to enable architecturally compatible platforms that scale from a Level 2 to full self-driving Level 5 vehicle, enabling OEMs to develop large-scale and complex families of software products. Since both Orin and Xavier are programmable through open CUDA and TensorRT APIs and libraries, developers can leverage their investments across multiple product generations.

NVIDIA and Tech Leaders Team to Build GPU-Accelerated Arm Servers

NVIDIA today introduced a reference design platform that enables companies to quickly build GPU-accelerated Arm-based servers, driving a new era of high performance computing for a growing range of applications in science and industry.

Announced by NVIDIA founder and CEO Jensen Huang at the SC19 supercomputing conference, the reference design platform — consisting of hardware and software building blocks — responds to growing demand in the HPC community to harness a broader range of CPU architectures. It allows supercomputing centers, hyperscale-cloud operators and enterprises to combine the advantage of NVIDIA's accelerated computing platform with the latest Arm-based server platforms.

New NVIDIA EGX Edge Supercomputing Platform Accelerates AI, IoT, 5G at the Edge

NVIDIA today announced the NVIDIA EGX Edge Supercomputing Platform - a high-performance, cloud-native platform that lets organizations harness rapidly streaming data from factory floors, manufacturing inspection lines and city streets to securely deliver next-generation AI, IoT and 5G-based services at scale, with low latency.

Early adopters of the platform - which combines NVIDIA CUDA-X software with NVIDIA-certified GPU servers and devices - include Walmart, BMW, Procter & Gamble, Samsung Electronics and NTT East, as well as the cities of San Francisco and Las Vegas.

Primate Labs Introduces GeekBench 5, Drops 32-bit Support

Primate Labs, developers of the ubiquitous benchmarking application GeekBench, have announced the release of version 5 of the software. The new version brings numerous changes, and one of the most important (since it affects compatibility) is that it will only be distributed in a 64-bit version. Some under-the-hood changes include additions to the CPU benchmark tests (including machine learning, augmented reality, and computational photography) as well as increases in the memory footprint of tests so as to better gauge the impact of your memory subsystem on your system's performance. Also introduced are different threading models for CPU benchmarking, allowing for changes in workload attribution and the corresponding impact on CPU performance.

On the Compute side of things, GeekBench 5 now supports the Vulkan API, which joins CUDA, Metal, and OpenCL. GPU-accelerated compute for computer vision tasks such as Stereo Matching, and augmented reality tasks such as Feature Matching are also available. For iOS users, there is now a Dark Mode for the results interface. GeekBench 5 is available now, 50% off, on Primate Labs' store.

NVIDIA Brings CUDA to ARM, Enabling New Path to Exascale Supercomputing

NVIDIA today announced its support for Arm CPUs, providing the high performance computing industry a new path to build extremely energy-efficient, AI-enabled exascale supercomputers. NVIDIA is making available to the Arm ecosystem its full stack of AI and HPC software - which accelerates more than 600 HPC applications and all AI frameworks - by year's end. The stack includes all NVIDIA CUDA-X AI and HPC libraries, GPU-accelerated AI frameworks and software development tools such as PGI compilers with OpenACC support and profilers. Once stack optimization is complete, NVIDIA will accelerate all major CPU architectures, including x86, POWER and Arm.

"Supercomputers are the essential instruments of scientific discovery, and achieving exascale supercomputing will dramatically expand the frontier of human knowledge," said Jensen Huang, founder and CEO of NVIDIA. "As traditional compute scaling ends, power will limit all supercomputers. The combination of NVIDIA's CUDA-accelerated computing and Arm's energy-efficient CPU architecture will give the HPC community a boost to exascale."

NVIDIA's SUPER Tease Rumored to Translate Into an Entire Lineup Shift Upwards for Turing

NVIDIA's SUPER teaser hasn't crystallized into something physical as of now, but we know it's coming - NVIDIA themselves saw to it that our (singularly) collective minds would be buzzing about what that teaser meant, looking to steal some thunder from AMD's E3 showing. Now, that teaser seems to be coalescing within the industry into an entire lineup upgrade for Turing products, with NVIDIA pulling their chips up one rung of the performance ladder across their entire lineup.

Apparently, NVIDIA will be looking to increase performance across the board, by shuffling their chips in a downward manner whilst keeping the current pricing structure. This means that NVIDIA's TU106 chip, which powered their RTX 2070 graphics card, will now be powering the RTX 2060 SUPER (with a reported core count of 2176 CUDA cores). The TU104 chip, which powers the current RTX 2080, will in the meantime be powering the SUPER version of the RTX 2070 (a reported 2560 CUDA cores are expected to be onboard), and the TU102 chip which powered their top-of-the-line RTX 2080 Ti will be brought down to the RTX 2080 SUPER (specs place this at 8 GB GDDR6 VRAM and 3072 CUDA cores). This carves the way for an even more powerful SKU in the RTX 2080 Ti SUPER, which should be launched at a later date. Salty waters say the RTX 2080 Ti SUPER will feature an unlocked chip which could be allowed to convert up to 300 W into graphics horsepower, so that's something to keep an eye - and a power meter - on, for sure. Less defined talks suggest that NVIDIA will be introducing an RTX 2070 Ti SUPER equivalent with a new chip as well.

Manli Introduces its GeForce GTX 1650 Graphics Card Lineup

Manli Technology Group Limited, a major manufacturer of graphics cards and other components, today announced an affordable new member of the 16-series family - the Manli GeForce GTX 1650. The Manli GeForce GTX 1650 is powered by the award-winning NVIDIA Turing architecture. It is equipped with 4 GB of GDDR5 memory on a 128-bit memory controller, and features 896 CUDA cores with a core frequency of 1485 MHz that can dynamically boost up to 1665 MHz. Moreover, the Manli GeForce GTX 1650 consumes only 75 W, with no external power connector required.

NVIDIA Extends DirectX Raytracing (DXR) Support to Many GeForce GTX GPUs

NVIDIA today announced that it is extending DXR (DirectX Raytracing) support to several GeForce GTX graphics models beyond its GeForce RTX series. These include the GTX 1660 Ti, GTX 1660, GTX 1080 Ti, GTX 1080, GTX 1070 Ti, GTX 1070, and GTX 1060 6 GB. The GTX 1060 3 GB and lower "Pascal" models don't support DXR, nor do older generations of NVIDIA GPUs. NVIDIA has implemented real-time raytracing on GPUs without specialized components such as RT cores or tensor cores, by essentially implementing the rendering path through shaders, in this case, CUDA cores. DXR support will be added through a new GeForce graphics driver later today.

The GPU's CUDA cores now have to calculate BVH traversal, intersection, reflection, and refraction. The GTX 16-series chips have an edge over "Pascal" despite lacking RT cores, as the "Turing" CUDA cores support concurrent INT and FP execution, allowing more work to be done per clock. NVIDIA in a detailed presentation listed the kinds of real-time ray-tracing effects enabled by the DXR API, namely reflections, shadows, advanced reflections and shadows, ambient occlusion, global illumination (unbaked), and combinations of these. The company put out detailed performance numbers for a selection of GTX 10-series and GTX 16-series GPUs, and compared them to RTX 20-series SKUs that have specialized hardware for DXR.
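To give a flavor of the arithmetic a shader-based DXR fallback pushes onto the CUDA cores, here is a minimal sketch of a single ray-primitive intersection test. Real DXR traverses a BVH of triangles per ray; a ray-sphere test is used here only to keep the example short:

```python
# Minimal ray-sphere intersection: the kind of per-ray math that, without RT
# cores, must run on ordinary shader (CUDA) cores thousands of times per pixel.
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Distance along the ray to the nearest hit, or None on a miss.

    `direction` is assumed to be normalized, so the quadratic's a == 1."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4 * c
    if disc < 0:
        return None  # ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / 2
    return t if t >= 0 else None

print(ray_sphere_hit((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # 4.0
```

Turing's concurrent INT+FP pipelines help precisely because this workload mixes integer BVH bookkeeping with floating-point intersection math.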
Update: Article updated with additional test data from NVIDIA.

Details on GeForce GTX 1660 Revealed Courtesy of MSI - 1408 CUDA Cores, GDDR5 Memory

Details on NVIDIA's upcoming mainstream GTX 1660 graphics card have been revealed, which will help put its graphics-crunching prowess up to scrutiny. The new graphics card from NVIDIA slots in below the recently released GTX 1660 Ti (which provides roughly 5% better performance than NVIDIA's previous GTX 1070 graphics card) and above the yet-to-be-released GTX 1650.

The 1408 CUDA cores in the design amount to a 9% reduction in computing cores compared to the GTX 1660 Ti, but most of the savings (and performance impact) likely comes at the expense of the 6 GB of (8 Gbps) GDDR5 memory this card is outfitted with, compared to the GDDR6 implementation of the 1660 Ti. The amount of GPU resources cut by NVIDIA is so low that we imagine these chips won't come from harvesting defective dies so much as from actually fusing off CUDA cores present in the TU116 chip. Using GDDR5 is still cheaper than the GDDR6 alternative (for now), and this also avoids straining the GDDR6 supply (if that was ever a concern for NVIDIA).

NVIDIA Adds New Options to Its MX200 Mobile Graphics Solutions - MX250 and MX230

NVIDIA has added new SKUs to its low-power mobility graphics lineup. The MX230 and MX250 come in to replace the GeForce MX130 and MX150, but... there's really not that much of a performance improvement to justify the increase in the series' tier. Both solutions are based on Pascal, so there are no Turing performance uplifts at the execution level.

NVIDIA hasn't disclosed any CUDA core counts or other specifics for these chips; we only know that they are paired with GDDR5 memory and feature Boost functionality for increased performance in particular scenarios. The strange thing is that NVIDIA's own performance scores compare their MX130, MX150, and now MX230 and MX250 to Intel's UHD 620 IGP... and while the old MX150 was reported by NVIDIA as offering an up to 4x performance uplift compared to that Intel part, the new MX250 now claims an improvement of 3.5x the performance. Whether this is because of new testing methodology, or some other reason, only NVIDIA knows.