News Posts matching #CUDA

Dynics Announces AI-enabled Vision System Powered by NVIDIA T4 Tensor Core GPU

Dynics, Inc., a U.S.-based manufacturer of industrial-grade computer hardware, visualization software, network security, network monitoring and software-defined networking solutions, today announced the XiT4 Inference Server, which helps industrial manufacturing companies increase their yield and provide more consistent manufacturing quality.

Artificial intelligence (AI) is increasingly being integrated into modern manufacturing to improve and automate processes, including 3D vision applications. The XiT4 Inference Server, powered by the NVIDIA T4 Tensor Core GPUs, is a fan-less hardware platform for AI, machine learning and 3D vision applications. AI technology is allowing manufacturers to increase efficiency and throughput of their production, while also providing more consistent quality due to higher accuracy and repeatability. Additional benefits are fewer false negatives (test escapes) and fewer false positives, which reduce downstream re-inspection needs, all leading to lower costs of manufacturing.

GALAX Designs a GeForce GTX 1650 "Ultra" with TU106 Silicon

It is becoming increasingly common for NVIDIA board partners to carve out GeForce RTX 20-series and GTX 16-series SKUs from ASICs they weren't originally based on, but GALAX has taken things a step further. The company just launched a GeForce GTX 1650 (GDDR6) graphics card based on the "TU106" silicon (ASIC code: TU106-125-A1). The company carved a GTX 1650 out of this chip by disabling all of its RT cores, all of its tensor cores, and a whopping 61% of its CUDA cores, along with proportionate reductions in TMU and ROP counts. The memory bus width has been halved from 256-bit down to 128-bit.

The card, however, is only listed by GALAX's Chinese regional arm. The card's marketing name is "GALAX GeForce GTX 1650 Ultra," with "Ultra" being a GALAX brand extension and not an NVIDIA SKU (i.e. the GPU isn't called "GTX 1650 Ultra"). The GPU clock speeds for this card are identical to those of the original, TU117-based GTX 1650: 1410 MHz base, 1590 MHz GPU Boost, and 12 Gbps (GDDR6-effective) memory.

Aetina Launches New Edge AI Computer Powered by the NVIDIA Jetson

Aetina Corp., a provider of high-performance GPGPU solutions, announced the new AN110-XNX edge AI computer leveraging the powerful capabilities of the NVIDIA Jetson Xavier NX, expanding its range of edge AI systems built on the Jetson platform for applications in smart transportation, factories, retail, healthcare, AIoT, robotics, and more.

The AN110-XNX combines the NVIDIA Jetson Xavier NX and the Aetina AN110 carrier board in a compact form factor of 87.4 x 68.2 x 52 mm (with fan). The AN110-XNX supports the MIPI CSI-2 interface for 1x 4K or 2x FHD cameras, allowing it to handle intensive AI workloads from ultra-high-resolution cameras for more accurate image analysis. It is as small as Aetina's AN110-NAO based on the NVIDIA Jetson Nano platform, but delivers more powerful AI computing via the new Jetson Xavier NX. With 384 CUDA cores, 48 Tensor Cores, and cloud-native capability, the Jetson Xavier NX delivers up to 21 TOPS and is an ideal platform to accelerate AI applications. Bundled with the latest NVIDIA JetPack 4.4 SDK, the energy-efficient module significantly expands the choices now available for developers and customers looking for embedded edge-computing options that demand increased performance to support AI workloads but are constrained by size, weight, power budget, or cost.

DirectX Coming to Linux...Sort of

Microsoft is preparing to add the DirectX API support to WSL (Windows Subsystem for Linux). The latest Windows Subsystem for Linux 2 will virtualize DirectX to Linux applications running on top of it. WSL is a translation layer for Linux apps to run on top of Windows. Unlike Wine, which attempts to translate Direct3D commands to OpenGL, what Microsoft is proposing is a real DirectX interface for apps in WSL, which can essentially talk to hardware (the host's kernel-mode GPU driver) directly.

To this effect, Microsoft introduced a Linux edition of DXGkrnl, a new kernel-mode driver for Linux that talks to the DXGkrnl driver of the Windows host. With this, Microsoft is promising to expose the full Direct3D 12, DxCore, and DirectML. It will also serve as a conduit for third-party APIs, such as OpenGL, OpenCL, Vulkan, and CUDA. Microsoft expects to ship this feature-packed WSL with WDDM 2.9 (i.e. a future version of Windows 10).
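To make the CUDA conduit concrete, below is a minimal sketch of how an application inside a WSL 2 distribution could probe the passed-through GPU via the standard CUDA runtime, assuming a CUDA-enabled driver is exposed to the guest; nothing in the code is WSL-specific, which is precisely the point of the passthrough.

// Minimal CUDA runtime probe, as one might run inside a WSL 2 distribution
// once the DXGkrnl passthrough and a CUDA-enabled driver are in place.
// (Illustrative sketch; the WSL-specific driver packaging is assumed.)
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("No CUDA-capable device visible: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        std::printf("GPU %d: %s, %d SMs, %.1f GB\n",
                    i, prop.name, prop.multiProcessorCount,
                    prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return 0;
}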

AAEON Unveils AI and Edge Computing Solutions Powered by NVIDIA

AAEON, a leading developer of embedded AI and edge-computing solutions, today announced it is unveiling several new rugged embedded platforms—augmenting an already extensive lineup of AAEON AI edge-computing solutions powered by the NVIDIA Jetson platform. The new AAEON products provide key interfaces needed for edge computing in a small form factor, making it easier to build applications for all levels of users, from makers to more advanced developers for deployments in the field.

AAEON also introduced a new version of the popular BOXER-8120AI, now featuring the Jetson TX2 4 GB module, providing an efficient and cost-effective solution for AI edge computing with 256 CUDA cores delivering processing speeds up to 1.3 TFLOPS. "Partnering with an AI and edge computing leader like NVIDIA supports our mission to deliver more diversified embedded products and solutions at higher quality standards," said Alex Hsueh, Senior Director of AAEON's System Platform Division. "These new offerings powered by the Jetson platform complement our existing lineup of rugged embedded products, providing an optimal combination of performance and price in a smaller form factor for customers to easily deploy across a full range of applications."

NVIDIA RTX Voice Modded to Work on Non-RTX GeForce GPUs

NVIDIA made headlines with the release of its free RTX Voice software, which gives your communication apps AI-powered computational noise cancellation. The software is very effective at what it does, but officially requires a GeForce RTX 20-series GPU. PC enthusiast David Lake, over at the Guru3D Forums, disagrees. With fairly easy modifications to the installer payload, Lake was able to remove its system-requirements gate, install it on his machine with a TITAN V graphics card, and find that the software works as intended.

Our first instinct was to point out that the "Volta"-based TITAN V features tensor cores and hence has hardware AI capabilities, until we found dozens of users across the Guru3D forums, Reddit, and Twitter claiming that the mod gets RTX Voice to work on their GTX 16-series, "Pascal," "Maxwell," and even older "Fermi" hardware. So in all likelihood, RTX Voice uses a CUDA-based GPGPU codepath, rather than something fancy leveraging tensor cores. Find instructions on how to mod the RTX Voice installer in the Guru3D Forums thread here.

Three Unknown NVIDIA GPUs GeekBench Compute Score Leaked, Possibly Ampere?

(Update, March 4th: Another NVIDIA graphics card has been discovered in the Geekbench database, this one featuring a total of 124 CUs. This could amount to some 7,936 CUDA cores, should NVIDIA keep the same 64 CUDA cores per CU - though this ratio has changed in the past, as when NVIDIA halved the number of CUDA cores per CU from Pascal to Turing. The 124 CU graphics card is clocked at 1.1 GHz and features 32 GB of HBM2e, delivering a score of 222,377 points in the Geekbench benchmark. We again stress that these could be just engineering samples with conservative clocks, and that final performance could be even higher.)
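The arithmetic behind that estimate is simple enough to show directly; here is a host-only sketch of the conversion, with the 64-cores-per-CU multiplier treated as an assumption since, as noted above, NVIDIA has changed that ratio between generations.

// Host-only sketch: converting the Geekbench "Compute Unit" (CU) count into a
// CUDA-core estimate. The 64-cores-per-CU multiplier is an assumption carried
// over from Volta/Turing; NVIDIA has changed this ratio between generations.
#include <cstdio>

int main() {
    const int compute_units = 124; // reported by the Geekbench database entry
    const int cores_per_cu  = 64;  // assumed Volta/Turing-style SM layout
    std::printf("Estimated CUDA cores: %d\n", compute_units * cores_per_cu); // 7936
    return 0;
}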

NVIDIA is expected to launch its next-generation Ampere lineup of GPUs during the GPU Technology Conference (GTC) event happening from March 22nd to March 26th. Just a few weeks before the release of these new GPUs, Geekbench 5 compute scores measuring the OpenCL performance of unknown GPUs, which we assume are part of the Ampere lineup, have appeared. Thanks to Twitter user "_rogame" (@_rogame), who obtained the Geekbench database entries, we have some information about the CUDA core configuration, memory, and performance of the upcoming cards.

NVIDIA to Reuse Pascal for Mobility-geared MX300 Series

NVIDIA will apparently still be using Pascal when it launches its next generation of low-power discrete graphics solutions for mobile systems. The MX300 series will replace the current crop of MX200 series parts (currently split into three products: the MX230, the 10 W MX250, and the 25 W MX250). The new MX300 series keeps the dual-tiered system, but ups the ante on the top-of-the-line MX350. Even though it's still Pascal on a 14 nm process, the MX350 should see an increase in CUDA cores to 640 (by using NVIDIA's Pascal GP107 chip) from the MX250's 384. Performance, then, should be comparable to the NVIDIA GTX 1050.

The MX330, on the other hand, will keep the specifications of the MX250, which signals a tier increase from the 256 execution units of the MX230 to 384. This should translate to appreciable performance increases for the new MX300 series, despite staying on NVIDIA's Pascal architecture. The new lineup is expected to be announced in February.

Rumor: NVIDIA's Next Generation GeForce RTX 3080 and RTX 3070 "Ampere" Graphics Cards Detailed

NVIDIA's next generation of graphics cards, codenamed Ampere, is set to arrive sometime this year, presumably around GTC 2020, which kicks off on March 22nd. Before NVIDIA CEO Jensen Huang officially reveals the specifications of these new GPUs, we have the latest round of rumors coming our way. According to VideoCardz, which cites multiple sources, the die configurations of the upcoming GeForce RTX 3070 and RTX 3080 have been detailed. Built on Samsung's latest 7 nm manufacturing process, this generation of NVIDIA GPUs is expected to offer a big improvement over the previous generation.

For starters, the two dies that have appeared carry the codenames GA103 and GA104, corresponding to the RTX 3080 and RTX 3070 respectively. Perhaps the biggest surprise is the Streaming Multiprocessor (SM) count. The smaller GA104 die has as many as 48 SMs, resulting in 3072 CUDA cores, while the bigger, oddly named GA103 die has as many as 60 SMs, resulting in 3840 CUDA cores in total. These improvements in SM count should result in a notable performance increase across the board. Alongside the increase in SM count, there is also a new memory bus width. The smaller GA104 die that should end up in the RTX 3070 uses a 256-bit memory bus allowing for 8/16 GB of GDDR6 memory, while its bigger brother, the GA103, has a 320-bit wide bus that allows the card to be configured with either 10 or 20 GB of GDDR6 memory. In the images below you can check out the alleged diagrams for yourself and judge their authenticity; as always, take this rumor with a grain of salt.
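For what it's worth, the rumored memory capacities follow directly from the bus widths: GDDR6 devices have 32-bit interfaces and commonly ship in 8 Gb (1 GB) or 16 Gb (2 GB) densities, an assumption this small sketch makes explicit.

// Host-only sketch: deriving the rumored memory capacities from bus width.
// Assumes 32-bit-wide GDDR6 devices in 8 Gb (1 GB) or 16 Gb (2 GB) densities.
#include <cstdio>

int main() {
    const int bus_widths[] = {256, 320};   // GA104 and GA103, per the rumor
    for (int bus : bus_widths) {
        int chips = bus / 32;              // one 32-bit channel per GDDR6 device
        std::printf("%d-bit bus -> %d chips -> %d GB or %d GB\n",
                    bus, chips, chips * 1, chips * 2);
    }
    return 0;
}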

NVIDIA Introduces DRIVE AGX Orin Platform

NVIDIA today introduced NVIDIA DRIVE AGX Orin, a highly advanced software-defined platform for autonomous vehicles and robots. The platform is powered by a new system-on-a-chip (SoC) called Orin, which consists of 17 billion transistors and is the result of four years of R&D investment. The Orin SoC integrates NVIDIA's next-generation GPU architecture and Arm Hercules CPU cores, as well as new deep learning and computer vision accelerators that, in aggregate, deliver 200 trillion operations per second—nearly 7x the performance of NVIDIA's previous generation Xavier SoC.

Orin is designed to handle the large number of applications and deep neural networks that run simultaneously in autonomous vehicles and robots, while achieving systematic safety standards such as ISO 26262 ASIL-D. Built as a software-defined platform, DRIVE AGX Orin is developed to enable architecturally compatible platforms that scale from a Level 2 to full self-driving Level 5 vehicle, enabling OEMs to develop large-scale and complex families of software products. Since both Orin and Xavier are programmable through open CUDA and TensorRT APIs and libraries, developers can leverage their investments across multiple product generations.

NVIDIA and Tech Leaders Team to Build GPU-Accelerated Arm Servers

NVIDIA today introduced a reference design platform that enables companies to quickly build GPU-accelerated Arm-based servers, driving a new era of high performance computing for a growing range of applications in science and industry.

Announced by NVIDIA founder and CEO Jensen Huang at the SC19 supercomputing conference, the reference design platform — consisting of hardware and software building blocks — responds to growing demand in the HPC community to harness a broader range of CPU architectures. It allows supercomputing centers, hyperscale-cloud operators and enterprises to combine the advantage of NVIDIA's accelerated computing platform with the latest Arm-based server platforms.

New NVIDIA EGX Edge Supercomputing Platform Accelerates AI, IoT, 5G at the Edge

NVIDIA today announced the NVIDIA EGX Edge Supercomputing Platform - a high-performance, cloud-native platform that lets organizations harness rapidly streaming data from factory floors, manufacturing inspection lines and city streets to securely deliver next-generation AI, IoT and 5G-based services at scale, with low latency.

Early adopters of the platform - which combines NVIDIA CUDA-X software with NVIDIA-certified GPU servers and devices - include Walmart, BMW, Procter & Gamble, Samsung Electronics and NTT East, as well as the cities of San Francisco and Las Vegas.

Primate Labs Introduces GeekBench 5, Drops 32-bit Support

Primate Labs, developers of the ubiquitous benchmarking application GeekBench, have announced the release of version 5 of the software. The new version brings numerous changes, and one of the most important (since it affects compatibility) is that it will only be distributed in a 64-bit version. Some under-the-hood changes include additions to the CPU benchmark tests (including machine learning, augmented reality, and computational photography) as well as increases in the memory footprint of tests, so as to better gauge the impact of your memory subsystem on your system's performance. Also introduced are different threading models for CPU benchmarking, allowing for changes in workload attribution and the corresponding impact on CPU performance.

On the Compute side of things, GeekBench 5 now supports the Vulkan API, which joins CUDA, Metal, and OpenCL. GPU-accelerated compute for computer vision tasks such as Stereo Matching, and for augmented reality tasks such as Feature Matching, is also available. For iOS users, there is now a Dark Mode for the results interface. GeekBench 5 is available now, 50% off, on Primate Labs' store.

NVIDIA Brings CUDA to ARM, Enabling New Path to Exascale Supercomputing

NVIDIA today announced its support for Arm CPUs, providing the high performance computing industry a new path to build extremely energy-efficient, AI-enabled exascale supercomputers. NVIDIA is making available to the Arm ecosystem its full stack of AI and HPC software - which accelerates more than 600 HPC applications and all AI frameworks - by year's end. The stack includes all NVIDIA CUDA-X AI and HPC libraries, GPU-accelerated AI frameworks and software development tools such as PGI compilers with OpenACC support and profilers. Once stack optimization is complete, NVIDIA will accelerate all major CPU architectures, including x86, POWER and Arm.

"Supercomputers are the essential instruments of scientific discovery, and achieving exascale supercomputing will dramatically expand the frontier of human knowledge," said Jensen Huang, founder and CEO of NVIDIA. "As traditional compute scaling ends, power will limit all supercomputers. The combination of NVIDIA's CUDA-accelerated computing and Arm's energy-efficient CPU architecture will give the HPC community a boost to exascale."

NVIDIA's SUPER Tease Rumored to Translate Into an Entire Lineup Shift Upwards for Turing

NVIDIA's SUPER teaser hasn't crystallized into anything physical as of now, but we know it's coming - NVIDIA themselves saw to it that our (singularly) collective minds would be buzzing about what that teaser meant, looking to steal some thunder from AMD's E3 showing. Now, that teaser seems to be coalescing into something concrete among industry sources: an entire lineup upgrade for Turing products, with NVIDIA pulling its chips up one rung of the performance ladder across the entire lineup.

Apparently, NVIDIA will be looking to increase performance across the board by shuffling its chips in a downward manner whilst keeping the current pricing structure. This means that NVIDIA's TU106 chip, which powered the RTX 2070 graphics card, will now be powering the RTX 2060 SUPER (with a reported core count of 2176 CUDA cores). The TU104 chip, which powers the current RTX 2080, will in the meantime be powering the SUPER version of the RTX 2070 (a reported 2560 CUDA cores are expected to be onboard), and the TU102 chip, which powered the top-of-the-line RTX 2080 Ti, will be brought down to the RTX 2080 SUPER (specs place this at 8 GB of GDDR6 VRAM and 3072 CUDA cores). This paves the way for an even more powerful SKU in the RTX 2080 Ti SUPER, which should be launched at a later date. Salty waters say the RTX 2080 Ti SUPER will feature an unlocked chip that could be allowed to convert up to 300 W into graphics horsepower, so that's something to keep an eye (and a power meter) on, for sure. Less defined talks suggest that NVIDIA will be introducing an RTX 2070 Ti SUPER equivalent with a new chip as well.

Manli Introduces its GeForce GTX 1650 Graphics Card Lineup

Manli Technology Group Limited, a major manufacturer of graphics cards and other components, today announced an affordable new member of the GTX 16-series family: the Manli GeForce GTX 1650. The Manli GeForce GTX 1650 is powered by the award-winning NVIDIA Turing architecture. It is equipped with 4 GB of GDDR5 memory on a 128-bit memory bus, and 896 CUDA cores with a core frequency of 1485 MHz that can dynamically boost up to 1665 MHz. Moreover, the Manli GeForce GTX 1650 draws only 75 W, with no external power connector required.

NVIDIA Extends DirectX Raytracing (DXR) Support to Many GeForce GTX GPUs

NVIDIA today announced that it is extending DXR (DirectX Raytracing) support to several GeForce GTX graphics models beyond its GeForce RTX series. These include the GTX 1660 Ti, GTX 1660, GTX 1080 Ti, GTX 1080, GTX 1070 Ti, GTX 1070, and GTX 1060 6 GB. The GTX 1060 3 GB and lower "Pascal" models don't support DXR, nor do older generations of NVIDIA GPUs. NVIDIA has implemented real-time raytracing on GPUs without specialized components such as RT cores or tensor cores by essentially running the entire rendering path on shaders, in this case CUDA cores. DXR support will be added through a new GeForce graphics driver later today.

The GPU's CUDA cores now have to calculate BVH traversal, intersection, reflection, and refraction. The GTX 16-series chips have an edge over "Pascal" despite lacking RT cores, as the "Turing" CUDA cores support concurrent INT and FP execution, allowing more work to be done per clock. NVIDIA, in a detailed presentation, listed out the kinds of real-time ray-tracing effects available through the DXR API, namely reflections, shadows, advanced reflections and shadows, ambient occlusion, global illumination (unbaked), and combinations of these. The company put out detailed performance numbers for a selection of GTX 10-series and GTX 16-series GPUs, and compared them to RTX 20-series SKUs that have specialized hardware for DXR.
Update: Article updated with additional test data from NVIDIA.
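To give a feel for the per-ray work described above that falls to the CUDA cores when no RT cores are present, here is a toy ray/sphere intersection kernel. It is purely illustrative, not NVIDIA's actual fallback path, but it shows the mix of integer loop bookkeeping and floating-point math that Turing's concurrent INT/FP execution helps with.

// Toy illustration of the per-ray work that lands on CUDA cores when DXR runs
// without RT cores: here, a brute-force ray/sphere intersection test.
// This is not NVIDIA's fallback path, merely a sketch of the arithmetic involved.
// (Host-side setup and launch are omitted for brevity.)
#include <cuda_runtime.h>

struct Ray    { float3 o, d; };                 // origin, normalized direction
struct Sphere { float3 c; float r; };           // center, radius

__device__ bool hit(const Ray& ray, const Sphere& s, float& t)
{
    float3 oc = make_float3(ray.o.x - s.c.x, ray.o.y - s.c.y, ray.o.z - s.c.z);
    float b = oc.x * ray.d.x + oc.y * ray.d.y + oc.z * ray.d.z;
    float c = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - s.r * s.r;
    float disc = b * b - c;                     // quadratic discriminant
    if (disc < 0.0f) return false;
    t = -b - sqrtf(disc);                       // nearest intersection distance
    return t > 0.0f;
}

__global__ void trace(const Ray* rays, const Sphere* spheres,
                      int n_spheres, float* depth, int n_rays)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_rays) return;
    float nearest = 1e30f, t;
    for (int s = 0; s < n_spheres; ++s)         // INT (indexing) and FP (math)
        if (hit(rays[i], spheres[s], t) && t < nearest)
            nearest = t;                        // keep the closest hit
    depth[i] = nearest;
}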

Details on GeForce GTX 1660 Revealed Courtesy of MSI - 1408 CUDA Cores, GDDR5 Memory

Details on NVIDIA's upcoming mainstream GTX 1660 graphics card have been revealed, which will help put its graphics-crunching prowess under scrutiny. The new graphics card from NVIDIA slots in below the recently released GTX 1660 Ti (which provides roughly 5% better performance than NVIDIA's previous GTX 1070 graphics card) and above the yet-to-be-released GTX 1650.

The 1408 CUDA cores in the design amount to a 9% reduction in computing cores compared to the GTX 1660 Ti, but most of the savings (and performance impact) likely come from the 6 GB of (8 Gbps) GDDR5 memory this card is outfitted with, compared to the 1660 Ti's GDDR6 implementation. The amount of cut GPU resources from NVIDIA is so low that we imagine these chips won't be coming from harvesting defective dies as much as from actually fusing off CUDA cores present in the TU116 chip. Using GDDR5 is still cheaper than the GDDR6 alternative (for now), and this also avoids straining the GDDR6 supply (if that was ever a concern for NVIDIA).

NVIDIA Adds New Options to Its MX200 Mobile Graphics Solutions - MX250 and MX230

NVIDIA has added new SKUs to its low-power mobility graphics lineup. The MX230 and MX250 come in to replace the GeForce MX130 and MX150, but... there's really not that much of a performance improvement to justify the increase in the series' tier. Both solutions are based on Pascal, so there are no Turing performance uplifts at the execution level.

NVIDIA hasn't disclosed any CUDA core counts or other specifics on these chips; we only know that they are paired with GDDR5 memory and feature Boost functionality for increased performance in particular scenarios. The strange thing is that NVIDIA's own performance scores compare the MX130, MX150, and now MX230 and MX250 to Intel's UHD 620 IGP... and while the old MX150 was reported by NVIDIA as offering up to a 4x performance uplift over that Intel part, the new MX250 now claims an improvement of 3.5x the performance. Whether this is because of new testing methodology, or some other reason, only NVIDIA knows.

NVIDIA Readies GeForce GTX 1660 Ti Based on TU116, Sans RTX

It looks like RTX technology won't make it to sub-$250 market segments, as the GPUs there aren't fast enough to handle real-time raytracing, and it makes little economic sense for NVIDIA to add billions of additional transistors for RT cores. The company is hence carving out a sub-class of "Turing" GPUs under the TU11x ASIC series, which will power new GeForce GTX family SKUs, such as the GeForce GTX 1660 Ti and other GTX 1000-series SKUs. These chips offer "Turing Shaders," which are basically CUDA cores with IPC and clock speeds rivaling existing "Turing" GPUs, but no RTX capabilities. To sweeten the deal, NVIDIA will equip these cards with GDDR6 memory. These GPUs could still have tensor cores, which are needed to accelerate DLSS, a feature highly relevant to this market segment.

The GeForce GTX 1660 Ti will no doubt be slower than the RTX 2060, and will be based on a new ASIC codenamed TU116. According to a VideoCardz report, this 12 nm chip packs 1,536 CUDA cores based on the "Turing" architecture, and the exact same memory setup as the RTX 2060: 6 GB of GDDR6 memory across a 192-bit wide memory interface. The lack of RT cores and a lower CUDA core count could make the TU116 a significantly smaller chip than the TU106, and something NVIDIA can afford to sell at sub-$300 price points such as $250. The GTX 1060 6 GB is holding the fort for NVIDIA in this segment, with other GTX 10-series SKUs such as the GTX 1070 occasionally dropping below the $300 mark at retailers' discretion. AMD recently improved its sub-$300 portfolio with the introduction of the Radeon RX 590, which convincingly outperforms the GTX 1060 6 GB.

NVIDIA Introduces RAPIDS Open-Source GPU-Acceleration Platform

NVIDIA today announced a GPU-acceleration platform for data science and machine learning, with broad adoption from industry leaders, that enables even the largest companies to analyze massive amounts of data and make accurate business predictions at unprecedented speed.

RAPIDS open-source software gives data scientists a giant performance boost as they address highly complex business challenges, such as predicting credit card fraud, forecasting retail inventory and understanding customer buying behavior. Reflecting the growing consensus about the GPU's importance in data analytics, an array of companies is supporting RAPIDS - from pioneers in the open-source community, such as Databricks and Anaconda, to tech leaders like Hewlett Packard Enterprise, IBM and Oracle.

VUDA is a CUDA-Like Programming Interface for GPU Compute on Vulkan (Open-Source)

GitHub developer jgbit has started an open-source project called VUDA, which takes inspiration from NVIDIA's CUDA API to bring an easily accessible GPU compute interface to the open-source world. VUDA is implemented as a wrapper on top of the highly popular next-gen graphics API Vulkan, which provides low-level access to hardware. VUDA comes as a header-only C++ library, which means it's compatible with all platforms that have a C++ compiler and support Vulkan.

While the project is still young, its potential is enormous, especially due to its open-source nature (MIT license). The GitHub page comes with a (very basic) sample that could be a good starting point for using the library.
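For readers unfamiliar with what a "CUDA-like" interface implies, the snippet below shows the plain CUDA runtime idiom (set a device, allocate device memory, copy, free) that VUDA models itself on; VUDA's own entry points are deliberately not reproduced here, to avoid misquoting a young project's API.

// The CUDA runtime idiom that VUDA models itself on: set a device, allocate
// device memory, copy data across, and free it. VUDA exposes a similar set of
// calls on top of Vulkan; its exact entry points are not reproduced here.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

int main() {
    const int N = 1024;
    std::vector<float> host(N, 1.0f);

    cudaSetDevice(0);
    float* dev = nullptr;
    cudaMalloc((void**)&dev, N * sizeof(float));
    cudaMemcpy(dev, host.data(), N * sizeof(float), cudaMemcpyHostToDevice);
    // ... kernel launches would go here ...
    cudaMemcpy(host.data(), dev, N * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);

    std::printf("round-tripped %d floats through device memory\n", N);
    return 0;
}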

Intel is Adding Vulkan Support to Their OpenCV Library, First Signs of Discrete GPU?

Intel has submitted the first patches with Vulkan support to its open-source OpenCV library, which is designed to accelerate computer vision. The library is widely used for real-time applications, as it comes with first-class optimizations for Intel processors and multi-core x86 in general. With Vulkan support, existing users can immediately move their neural network workloads to the GPU compute space without having to rewrite their code base.

At this point in time, the Vulkan backend supports Convolution, Concat, ReLU, LRN, PriorBox, Softmax, MaxPooling, AvePooling, and Permute. According to the source code changes, this is just "a beginning work for Vulkan in OpenCV DNN, more layer types will be supported and performance tuning is on the way."
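In practice, switching OpenCV's DNN module over to the new backend should amount to selecting it on an existing network, along the lines of the sketch below. This is a hedged example: the model file names are placeholders, and layer coverage is limited to the types listed above.

// Sketch of routing an OpenCV DNN network to the Vulkan backend. The model
// files are placeholders; the VKCOM backend and Vulkan target enums are the
// ones OpenCV exposes for this code path.
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>

int main() {
    cv::dnn::Net net = cv::dnn::readNet("model.weights", "model.cfg"); // placeholder files
    net.setPreferableBackend(cv::dnn::DNN_BACKEND_VKCOM);   // Vulkan compute backend
    net.setPreferableTarget(cv::dnn::DNN_TARGET_VULKAN);    // run layers on the GPU

    cv::Mat img  = cv::imread("input.jpg");                 // placeholder input
    cv::Mat blob = cv::dnn::blobFromImage(img, 1.0 / 255.0, cv::Size(224, 224));
    net.setInput(blob);
    cv::Mat out = net.forward();                            // inference call is unchanged
    return 0;
}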

It seems that now, with their own GPU development underway, Intel has found new love for the GPU-accelerated compute space. The choice of Vulkan is also interesting as the API is available on a wide range of platforms, which could mean that Intel is trying to turn Vulkan into a CUDA killer. Of course there's still a lot of work needed to achieve that goal, since NVIDIA has had almost a decade of head start.

NVIDIA "TU102" RT Core and Tensor Core Counts Revealed

The GeForce RTX 2080 Ti is indeed based on an ASIC codenamed "TU102." NVIDIA was referring to this 775 mm² chip when talking about the 18.5 billion-transistor count in its keynote. The company also provided a breakdown of its various "cores," and a block-diagram. The GPU is still laid out like its predecessors, but each of the 72 streaming multiprocessors (SMs) packs RT cores and Tensor cores in addition to CUDA cores.

The TU102 features six GPCs (graphics processing clusters), each packing 12 SMs. Each SM packs 64 CUDA cores, 8 Tensor cores, and 1 RT core. Each GPC packs six geometry units. The GPU also packs 288 TMUs and 96 ROPs. The TU102 supports a 384-bit wide GDDR6 memory bus, supporting 14 Gbps memory. There are also two NVLink channels, which NVIDIA plans to leverage for its next-generation multi-GPU technology.
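Tallying those per-SM figures gives the totals for the full TU102 silicon (shipping SKUs may have some SMs disabled); a trivial host-only sketch of the arithmetic:

// Host-only tally of the full TU102 configuration from the per-SM breakdown
// given above (full silicon; shipping SKUs may disable some SMs).
#include <cstdio>

int main() {
    const int gpcs        = 6;
    const int sms_per_gpc = 12;
    const int sms         = gpcs * sms_per_gpc;    // 72
    std::printf("SMs:          %d\n", sms);
    std::printf("CUDA cores:   %d\n", sms * 64);   // 4608
    std::printf("Tensor cores: %d\n", sms * 8);    // 576
    std::printf("RT cores:     %d\n", sms * 1);    // 72
    return 0;
}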

NVIDIA GeForce RTX 2000 Series Specifications Pieced Together

Later today (20th August), NVIDIA will formally unveil its GeForce RTX 2000 series consumer graphics cards. This marks a major change in the brand name, triggered by the introduction of the new RT Cores, specialized components that accelerate real-time ray-tracing, a task too taxing for conventional CUDA cores. DNN acceleration, in turn, requires SIMD components that crunch 4x4x4 matrix multiplication, which is what tensor cores specialize in. The chips still have CUDA cores for everything else. This generation also debuts the new GDDR6 memory standard, although unlike GeForce "Pascal," the new GeForce "Turing" won't see a doubling in memory sizes.
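The "4x4x4 matrix multiplication" in question is the fused D = A x B + C operation on 4x4 matrices, i.e. 64 multiply-accumulates per operation. A scalar reference version, written out plainly rather than as tensor-core code, makes that count concrete.

// Scalar reference for the 4x4x4 fused multiply-add referred to above:
// D = A * B + C on 4x4 matrices, i.e. 64 multiply-accumulates per operation.
// This is the math a tensor core performs in hardware, written out plainly.
void mma4x4(const float A[4][4], const float B[4][4],
            const float C[4][4], float D[4][4])
{
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j) {
            float acc = C[i][j];              // accumulator input
            for (int k = 0; k < 4; ++k)
                acc += A[i][k] * B[k][j];     // 4 multiply-adds per output element
            D[i][j] = acc;
        }
}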

NVIDIA is expected to debut the generation with the new GeForce RTX 2080 later today, with market availability by the end of the month. Going by older rumors, the company could launch the lower RTX 2070 and higher RTX 2080+ by late September, and the mid-range RTX 2060 series in October. Apparently the high-end RTX 2080 Ti could come out sooner than expected, given that VideoCardz already has some of its specifications in hand. Not a lot is known about how "Turing" compares with "Volta" in performance, but given that the TITAN V comes with tensor cores that can [in theory] be re-purposed as RT cores, it could continue on as NVIDIA's halo SKU for the client segment.