News Posts matching #GPU

Google: CPUs are Leading AI Inference Workloads, Not GPUs

Today's AI infrastructure build-out relies largely on GPU-accelerated servers. Google, one of the world's largest hyperscalers, has noted that CPUs are still a leading compute platform for AI/ML workloads, according to an internal analysis of its Google Cloud services. During the TechFieldDay event, Brandon Royal, product manager at Google Cloud, explained the position of CPUs in today's AI game. The AI lifecycle is divided into two parts: training and inference. Training requires massive compute capacity, along with enormous memory capacity, to fit ever-expanding AI models into memory. The latest models, like GPT-4 and Gemini, contain billions of parameters and require thousands of GPUs or other accelerators working in parallel to train efficiently.

On the other hand, inference requires less compute intensity but still benefits from acceleration. During inference, the pre-trained model is optimized and deployed to make predictions on new data. While less compute is needed than for training, latency and throughput are essential for real-time inference. Google found that, while GPUs are ideal for the training phase, models are often optimized for, and run inference on, CPUs. This means that there are customers who choose CPUs as their medium of AI inference for a wide variety of reasons.

AMD to Address "Bugged" Limited Overclocking on Radeon RX 7900 GRE GPU

TechPowerUp's resident GPU reviewer extraordinaire—W1zzard—has grappled with a handful of custom design AMD Radeon RX 7900 GRE 16 GB models. Team Red and its board partners are pushing a proper/widespread Western release of the formerly China market-exclusive "Golden Rabbit Edition" GPU. TPU's initial review selection of three Sapphire cards and a lone ASRock Steel Legend OC variant garnered two Editor's Choice Awards, and two Highly Recommended badges. Sapphire's Radeon RX 7900 GRE Nitro+ was also honored with a "...But Expensive" tag, due to its MSRP of $600—the premium tier design was one of last year's launch day models in China. Western reviewers have latched onto a notable GRE overclocking limitation—all of TPU's review samples were found to have "overclocking artificially limited by AMD." Steve Walton of Hardware Unboxed has investigated whether the GRE's inherent heavily limited power specification was less of an issue on Sapphire's Nitro+ variant—check out his "re-re-review" video below.

The higher board power design (305 W OC TGP limit and 351 W total board power) is expected to exhibit "up to 10% higher performance than Radeon RX 7800 XT" according to VideoCardz, but falls short. TPU's W1zzard found that the GRE Nitro+ card's maximum configurable clock is 2803 MHz: "Overclocking worked quite well on our card, we gained over 8% in real-life performance, which is well above what we usually see, but less than other GRE cards tested today. Sapphire's factory OC eats into OC potential, and maximizes performance out of the box instead. Unfortunately AMD restricted overclocking on their card quite a lot, probably to protect sales of the RX 7900 XT. While NVIDIA doesn't have any artificial limitations for overclockers, AMD keeps limiting the slider lengths for many models, this is not a gamer-friendly approach. For the GRE, both GPU and memory overclocking could definitely go higher based on the results that we've seen in our reviews today." An AMD representative has contacted Hardware Unboxed in reaction to yesterday's Update review: the GRE's overclocking limitation is a "bug," and a fix is in the works. This situation is a bit odd, given that the Golden Rabbit Edition is not a brand-new product.

MiTAC Unleashes Revolutionary Server Solutions, Powering Ahead with 5th Gen Intel Xeon Scalable Processors Accelerated by Intel Data Center GPUs

MiTAC Computing Technology, a subsidiary of MiTAC Holdings Corp., proudly reveals its groundbreaking suite of server solutions that deliver unsurpassed capabilities with the 5th Gen Intel Xeon Scalable Processors. MiTAC introduces its cutting-edge signature platforms that seamlessly integrate Intel Data Center GPUs, both Intel Max Series and Intel Flex Series, unleashing an unparalleled leap in computing performance for HPC and AI applications.

MiTAC Announces Its Full Array of Platforms Supporting the Latest 5th Gen Intel Xeon Scalable Processors
Last year, Intel transitioned the right to manufacture and sell products based on Intel Data Center Solution Group designs to MiTAC. MiTAC confidently announces a transformative upgrade to its product offerings, unveiling advanced platforms that epitomize the future of computing. Featuring up to 64 cores, expanded shared cache, increased UPI and DDR5 support, the latest 5th Gen Intel Xeon Scalable Processors deliver remarkable performance per watt gains across various workloads. MiTAC's Intel Server M50FCP Family and Intel Server D50DNP Family fully support the latest 5th Gen Intel Xeon Scalable Processors, made possible through a quick BIOS update and easy technical resource revisions which provide unsurpassed performance to diverse computing environments.

AMD Readying Feature-enriched ROCm 6.1

The latest version of AMD's open-source GPU compute stack, ROCm, is due for launch soon according to a Phoronix article—chief author, Michael Larabel, has been poring over Team Red's public GitHub repositories over the past couple of days. AMD ROCm version 6.0 was released last December—bringing official support for the AMD Instinct MI300A/MI300X, alongside PyTorch improvements, expanded AI libraries, and many other upgrades and optimizations. The v6.0 milestone placed Team Red in a more competitive position next to NVIDIA's very mature CUDA software layer. A mid-February 2024 update added support for Radeon PRO W7800 and RX 7900 GRE GPUs, as well as ONNX Runtime.

Larabel believes that "ROCm 6.1" is in for an imminent release, given his tracking of increased activity on publicly visible developer platforms: "For MIOpen 3.1 with ROCm 6.1 there's been many additions including new solvers, an AI-based parameter prediction model for the conv_hip_igemm_group_fwd_xdlops solver, numerous fixes, and other updates. AMD MIGraphX will see an important update with ROCm 6.1. For the next ROCm release, MIGraphX 2.9 brings FP8 support, support for more operators, documentation examples for Whisper / Llama-2 / Stable Diffusion 2.1, new ONNX examples, BLAS auto-tuning for GEMMs, and initial code for MIGraphX running on Microsoft Windows." The change-logs/documentation updates also point to several HIPIFY for ROCm 6.1 improvements, including the addition of CUDA 12.3.2 support.

NVIDIA Grace Hopper Systems Gather at GTC

The spirit of software pioneer Grace Hopper will live on at NVIDIA GTC. Accelerated systems using powerful processors - named in honor of the pioneer of software programming - will be on display at the global AI conference running March 18-21, ready to take computing to the next level. System makers will show more than 500 servers in multiple configurations across 18 racks, all packing NVIDIA GH200 Grace Hopper Superchips. They'll form the largest display at NVIDIA's booth in the San Jose Convention Center, filling the MGX Pavilion.

MGX Speeds Time to Market
NVIDIA MGX is a blueprint for building accelerated servers with any combination of GPUs, CPUs and data processing units (DPUs) for a wide range of AI, high performance computing and NVIDIA Omniverse applications. It's a modular reference architecture for use across multiple product generations and workloads. GTC attendees can get an up-close look at MGX models tailored for enterprise, cloud and telco-edge uses, such as generative AI inference, recommenders and data analytics. The pavilion will showcase accelerated systems packing single and dual GH200 Superchips in 1U and 2U chassis, linked via NVIDIA BlueField-3 DPUs and NVIDIA Quantum-2 400 Gb/s InfiniBand networks over LinkX cables and transceivers. The systems support industry standards for 19- and 21-inch rack enclosures, and many provide E1.S bays for nonvolatile storage.

JPR: Total PC GPU Shipments Increased by 6% From Last Quarter and 20% Year-to-Year

Jon Peddie Research reports that the global PC-based graphics processor unit (GPU) market reached 76.2 million units in Q4'23, and that PC CPU shipments increased an astonishing 24% year over year, the biggest year-to-year increase in two and a half decades. Overall, GPUs will have a compound annual growth rate of 3.6% during 2024-2026 and reach an installed base of almost 5 billion units at the end of the forecast period. Over the next five years, the penetration of discrete GPUs (dGPUs) in the PC will be 30%.

AMD's overall market share decreased by 1.4% from last quarter, Intel's market share increased by 2.8%, and NVIDIA's market share decreased by 1.36%.
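The headline growth rates can be turned back into approximate unit counts for the earlier periods. The following is a rough sketch in Python, assuming the 6% quarter-over-quarter and 20% year-over-year figures are rounded and apply to the 76.2 million total; the function name is illustrative, and the outputs are estimates rather than figures published by JPR.

```python
def prior_shipments(current_millions: float, growth_pct: float) -> float:
    """Back out an earlier period's shipments from the current total and a growth rate."""
    return current_millions / (1 + growth_pct / 100)

Q4_2023 = 76.2  # million units, as reported in the article

prev_quarter = prior_shipments(Q4_2023, 6)   # implied shipments one quarter earlier
year_ago = prior_shipments(Q4_2023, 20)      # implied shipments one year earlier

print(f"Implied previous quarter: {prev_quarter:.1f} million units")
print(f"Implied year-ago quarter: {year_ago:.1f} million units")
```

Because the published percentages are rounded, the implied totals are only accurate to within a few hundred thousand units.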

NVIDIA Accused of Acting as "GPU Cartel" and Controlling Supply

NVIDIA, arguably the most important supplier behind the AI frenzy, is facing accusations of acting as a "GPU cartel" and controlling supply in the data center market, according to statements made by executives at rival chipmaker Groq and former AMD executive Scott Herkelman. In an interview with the Wall Street Journal, Groq CEO Jonathan Ross alleged that some of NVIDIA's data center customers are afraid to even meet with rival AI chipmakers out of fear that NVIDIA will retaliate by delaying shipments of already ordered GPUs. This is despite NVIDIA's claims that it is trying to allocate supply fairly during global shortages. "This happens more than you expect, NVIDIA does this with DC customers, OEMs, AIBs, press, and resellers. They learned from GPP to not put it into writing. They just don't ship after a customer has ordered. They are the GPU cartel, and they control all supply," said former Senior Vice President and General Manager at AMD Radeon, Scott Herkelman, in response to the accusations on X/Twitter.

NVIDIA AI GPU Customers Reportedly Selling Off Excess Hardware

The NVIDIA H100 Tensor Core GPU was last year's hot item for HPC and AI industry segments—the largest purchasers were reported to have acquired up to 150,000 units each. Demand grew so much that lead times of 36 to 52 weeks became the norm for H100-based server equipment. The latest rumblings indicate that things have stabilized—so much so that some organizations are "offloading chips" as the supply crunch cools off. Apparently it is more cost-effective to rent AI processing sessions through cloud service providers (CSPs)—the big three being Amazon Web Services, Google Cloud, and Microsoft Azure.

According to a mid-February Seeking Alpha report, wait times for the NVIDIA H100 80 GB GPU model have been reduced to around three to four months. The Information believes that some companies have already reduced their order counts, while others have hardware sitting around, completely unused. Maintenance complexity and costs are reportedly cited as the main factors in "offloading" unneeded equipment and turning to renting server time from CSPs. Despite improved supply conditions, AI GPU demand is still growing, driven mainly by organizations working with LLMs. OpenAI is a prime example: as pointed out by The Information, insider murmurings have Sam Altman & Co. seeking out alternative solutions and production avenues.

Quantum Machines Launches OPX1000, a High-density Processor-based Control Platform

In Sept. 2023, Quantum Machines (QM) unveiled OPX1000, our most advanced quantum control system to date - and the industry's leading controller in terms of performance and channel density. OPX1000 is the third generation of QM's processor-based quantum controllers. It enhances its predecessor, OPX+, by expanding analog performance and multiplying channel density to support the control of over 1,000 qubits. However, QM's vision for quantum controllers extends far beyond.

OPX1000 is designed as a platform for orchestrating the control of large-scale QPUs (quantum processing units). It's equipped with 8 front-end module (FEM) slots, representing the cutting-edge modular architecture for quantum control. The first low-frequency (LF) module was introduced in September 2023, and today, we're happy to introduce the Microwave (MW) FEM, which delivers additional value to our rapidly expanding customer base.

FurMark 2.1 Gets Public Release

FurMark's development team, Geeks3D, seems to be relieved after work was completed on version 2.1.0's public release. According to the release notes: "It took me more time than expected but it's there!" The Beta version was made available back in December 2022 (through Geeks3D's Discord), a milestone achievement for the FurMark dev team, since no major updates had been implemented since 2007. The GPU stress test and benchmarking tool was improved once again last August, when the Beta was upgraded to v2.0.10.

Main author, JEGX, provided a little bit of background information: "FurMark 2 is built with GeeXLab. The GUI is a pure GeeXLab application while the furmark command line tool is built with the GeeXLab SDK. GeeXLab being cross-platform, this first version of FurMark 2 is available for Windows and Linux (the Linux 32-bit version is also available, I will re-compile it for the next update). I plan to release FurMark 2 for Raspberry Pi (I just received my Raspberry Pi 5 board!) and maybe for macOS too." He states that feedback is welcome, and requests for OpenGL 2.1 and 3.0/3.1 support will be considered. The full timeline of changelog updates can be found here.

Supermicro Accelerates Performance of 5G and Telco Cloud Workloads with New and Expanded Portfolio of Infrastructure Solutions

Supermicro, Inc. (NASDAQ: SMCI), a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, delivers an expanded portfolio of purpose-built infrastructure solutions to accelerate performance and increase efficiency in 5G and telecom workloads. With one of the industry's most diverse offerings, Supermicro enables customers to expand public and private 5G infrastructures with improved performance per watt and support for new and innovative AI applications. As a long-term advocate of open networking platforms and a member of the O-RAN Alliance, Supermicro's portfolio incorporates systems featuring 5th Gen Intel Xeon processors, AMD EPYC 8004 Series processors, and the NVIDIA Grace Hopper Superchip.

"Supermicro is expanding our broad portfolio of sustainable and state-of-the-art servers to address the demanding requirements of 5G and telco markets and Edge AI," said Charles Liang, president and CEO of Supermicro. "Our products are not just about technology, they are about delivering tangible customer benefits. We quickly bring data center AI capabilities to the network's edge using our Building Block architecture. Our products enable operators to offer new capabilities to their customers with improved performance and lower energy consumption. Our edge servers contain up to 2 TB of high-speed DDR5 memory, 6 PCIe slots, and a range of networking options. These systems are designed for increased power efficiency and performance-per-watt, enabling operators to create high-performance, customized solutions for their unique requirements. This reassures our customers that they are investing in reliable and efficient solutions."

NVIDIA Prepared to Offer Custom Chip Designs to AI Clients

NVIDIA is reported to be setting up an AI-focused semi-custom chip design business unit, according to inside sources known to Reuters—it is believed that Team Green leadership is adapting to demands leveraged by key data-center customers. Many companies are seeking cheaper alternatives, or have devised their own designs (budget/war chest permitting)—NVIDIA's current range of AI GPUs are simply off-the-shelf solutions. OpenAI has generated the most industry noise—their alleged early 2024 fund-raising pursuits have attracted plenty of speculative/kind-of-serious interest from notable semiconductor personalities.

Team Green is seemingly reacting to emerging market trends: Jensen Huang (CEO, president and co-founder) has hinted that NVIDIA custom chip designing services are on the cusp. Stephen Nellis, a Reuters reporter specializing in tech industry developments, has highlighted select NVIDIA boss quotes from an upcoming interview piece: "We're always open to do that. Usually, the customization, after some discussion, could fall into system reconfigurations or recompositions of systems." The Team Green chief teased that his engineering team is prepared to take on the challenge of meeting exact requests: "But if it's not possible to do that, we're more than happy to do a custom chip. And the benefit to the customer, as you can imagine, is really quite terrific. It allows them to extend our architecture with their know-how and their proprietary information." The rumored NVIDIA semi-custom chip design business unit could be introduced in an official capacity at next month's GTC 2024 Conference.

Groq LPU AI Inference Chip is Rivaling Major Players like NVIDIA, AMD, and Intel

AI workloads are split into two different categories: training and inference. While training requires large computing and memory capacity, access speeds are not a significant contributor; inference is another story. With inference, the AI model must run extremely fast to serve the end-user with as many tokens (words) as possible, hence giving the user answers to their prompts faster. An AI chip startup, Groq, which was in stealth mode for a long time, has been making major moves in providing ultra-fast inference speeds using its Language Processing Unit (LPU), designed for large language models (LLMs) like GPT, Llama, and Mistral. The Groq LPU is a single-core unit based on the Tensor-Streaming Processor (TSP) architecture, which achieves 750 TOPS at INT8 and 188 TeraFLOPS at FP16, with 320x320 fused dot product matrix multiplication, in addition to 5,120 Vector ALUs.

The Groq LPU offers massive concurrency with 80 TB/s of bandwidth and 230 MB of local SRAM. All of this works together to give Groq fantastic performance, which has been making waves on the internet over the past few days. Serving the Mixtral 8x7B model at 480 tokens per second, the Groq LPU is providing one of the leading inference numbers in the industry. In models like Llama 2 70B with 4096 token context length, Groq can serve 300 tokens/s, while in the smaller Llama 2 7B with 2048 tokens of context, the Groq LPU can output 750 tokens/s. According to the LLMPerf Leaderboard, the Groq LPU is beating the GPU-based cloud providers at inferencing Llama LLMs in configurations ranging from 7 to 70 billion parameters. In token throughput (output) and time to first token (latency), Groq is leading the pack, achieving the highest throughput and second lowest latency.
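To put the throughput figures in perspective, they can be converted into the time needed to stream a complete response. The sketch below uses the tokens-per-second numbers quoted above; the 500-token response length is a hypothetical value chosen for illustration, and the calculation ignores time to first token and any network overhead.

```python
# Throughput figures quoted in the article (output tokens per second).
THROUGHPUT = {
    "Mixtral 8x7B": 480,
    "Llama 2 70B": 300,
    "Llama 2 7B": 750,
}

RESPONSE_TOKENS = 500  # hypothetical response length, not from the article


def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to stream `tokens` output tokens, ignoring time to first token."""
    return tokens / tokens_per_second


for model, tps in THROUGHPUT.items():
    seconds = generation_seconds(RESPONSE_TOKENS, tps)
    print(f"{model}: {seconds:.2f} s for {RESPONSE_TOKENS} tokens")
```

Under these assumptions, even the largest model listed would stream a 500-token answer in under two seconds.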

12V-2X6 "H++" Standard Touted to Safely Deliver 675 W

Online hardware communities continue to discuss the 12VHPWR connection standard's troubled existence, while revised technology gets worked on—quietly—in the background. PCI-SIG's 12V-2x6 connector was first revealed last summer, signalling an alternative power delivery method for high-wattage graphics cards. Past TPU reports show that the 12V-2x6 16-pin design has already popped up on select NVIDIA Founders Edition cards, GeForce RTX 40 SUPER custom graphics card designs, and various new generation power supplies. Earlier today, Алексей (AKA wxnod) took to social media and posted an image of the freshly deshrouded "H++" 12V-2x6 (total design limit: 675 W) socket, as well as a shot of the familiar "H+" 12VHPWR (max. 600 W).

This fifth generation socket design largely rolled out with Team Green's GeForce RTX 40 SUPER card series, although wxnod notes that exceptions do exist: "Some AIC GeForce RTX 4070 SUPER, 4070 Ti SUPER and 4080 SUPER cards are still using the H+12VHPWR interface." The H++ identified 12V-2x6 design's power limit peaks at 675 W; a technical breakdown from last July revealed that 75 W comes from the expansion slot, while the big 600 W portion flows through the 16-pin connector. As mentioned before, 12V-2x6 debuted on a few non-SUPER cards back in 2023, but last month's SUPER series product launch marked a more comprehensive rollout. AMD has indicated that it is considering an adoption of Gen 5 H++ in the future, but we have not heard much on that subject since last August. A new generation 16-pin PCIe 6.0 power connector design was linked to the upcoming NVIDIA RTX 50-series of "Blackwell" GPUs, but Hardware Busters has refuted rumors generated by Moore's Law is Dead. Team Green is expected to remain faithful to "H++" 12V-2x6 with the launch of next generation graphics cards.
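The 675 W figure is simply the sum of the two delivery paths described above. A minimal sketch of that power budget follows; the per-pin current estimate assumes a nominal 12 V rail and six 12 V pins sharing the load evenly, which is an illustrative simplification rather than a figure from the article.

```python
SLOT_W = 75        # delivered by the expansion slot, per the article
CONNECTOR_W = 600  # delivered through the 16-pin 12V-2x6 connector, per the article
RAIL_VOLTS = 12.0  # assumption: nominal rail voltage
POWER_PINS = 6     # assumption: six 12 V pins sharing the load evenly


def total_design_limit_w() -> int:
    """Total design limit: slot power plus connector power."""
    return SLOT_W + CONNECTOR_W


def amps_per_pin(connector_w: float = CONNECTOR_W) -> float:
    """Average current per 12 V pin at the given connector load."""
    return connector_w / RAIL_VOLTS / POWER_PINS


print(f"Total design limit: {total_design_limit_w()} W")
print(f"Average current per 12 V pin at full load: {amps_per_pin():.2f} A")
```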

NVIDIA Accelerates Quantum Computing Exploration at Australia's Pawsey Supercomputing Centre

NVIDIA today announced that Australia's Pawsey Supercomputing Research Centre will add the NVIDIA CUDA Quantum platform accelerated by NVIDIA Grace Hopper Superchips to its National Supercomputing and Quantum Computing Innovation Hub, furthering its work driving breakthroughs in quantum computing.

Researchers at the Perth-based center will leverage CUDA Quantum - an open-source hybrid quantum computing platform that features powerful simulation tools, and capabilities to program hybrid CPU, GPU and QPU systems - as well as the NVIDIA cuQuantum software development kit of optimized libraries and tools for accelerating quantum computing workflows. The NVIDIA Grace Hopper Superchip - which combines the NVIDIA Grace CPU and Hopper GPU architectures - provides extreme performance to run high-fidelity and scalable quantum simulations on accelerators and seamlessly interface with future quantum hardware infrastructure.

EdgeCortix to Showcase Flagship SAKURA-I Chip at Singapore Airshow 2024

EdgeCortix, the Japan-based fabless semiconductor company focused on energy-efficient AI processing, announced today that the Acquisitions, Technology and Logistics Agency (ATLA), Japan Ministry of Defense, will include the groundbreaking edge AI startup alongside an elite group of leading Japanese companies to represent Japan's air and defense innovation landscape at ATLA's booth at the Singapore Airshow to be held February 20 - 25. The Singapore Airshow is one of the largest and most influential shows of its kind in the world, and the largest in Asia, seeing as many as 50,000 attendees per biennial show. Over 1,000 companies from 50 countries are expected to participate in the 2024 show.

EdgeCortix's flagship product, the SAKURA-I chip, will be featured among a small handful of influential Japanese innovations at the booth. SAKURA-I is a dedicated co-processor that delivers high compute efficiency and low latency for artificial intelligence (AI) workloads that are carried out "at the edge", where the data is collected and mission critical decisions need to be made - far away from a datacenter. SAKURA-I delivers orders of magnitude better energy efficiency and processing speed than conventional semiconductors (ex: GPUs & CPUs), while drastically reducing operating costs for end users.

NVIDIA Unveils "Eos" to Public - a Top Ten Supercomputer

Providing a peek at the architecture powering advanced AI factories, NVIDIA released a video that offers the first public look at Eos, its latest data-center-scale supercomputer. An extremely large-scale NVIDIA DGX SuperPOD, Eos is where NVIDIA developers create their AI breakthroughs using accelerated computing infrastructure and fully optimized software. Eos is built with 576 NVIDIA DGX H100 systems, NVIDIA Quantum-2 InfiniBand networking and software, providing a total of 18.4 exaflops of FP8 AI performance. Revealed in November at the Supercomputing 2023 trade show, Eos—named for the Greek goddess said to open the gates of dawn each day—reflects NVIDIA's commitment to advancing AI technology.

Eos Supercomputer Fuels Innovation
Each DGX H100 system is equipped with eight NVIDIA H100 Tensor Core GPUs. Eos features a total of 4,608 H100 GPUs. As a result, Eos can handle the largest AI workloads to train large language models, recommender systems, quantum simulations and more. It's a showcase of what NVIDIA's technologies can do when working at scale. Eos is arriving at the perfect time. People are changing the world with generative AI, from drug discovery to chatbots to autonomous machines and beyond. To achieve these breakthroughs, they need more than AI expertise and development skills. They need an AI factory, a purpose-built AI engine that's always available and can help ramp their capacity to build AI models at scale. Eos delivers. Ranked No. 9 in the TOP500 list of the world's fastest supercomputers, Eos pushes the boundaries of AI technology and infrastructure.
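The system-level numbers quoted for Eos are internally consistent, which a short calculation makes visible. The sketch below uses only the figures stated above (576 systems, eight GPUs each, 18.4 exaflops of FP8); the per-GPU result is a derived average, not an official specification.

```python
DGX_SYSTEMS = 576          # DGX H100 systems in Eos, per the article
GPUS_PER_DGX = 8           # H100 GPUs per DGX H100 system, per the article
TOTAL_FP8_EXAFLOPS = 18.4  # total FP8 AI performance, per the article

# Total GPU count across the whole machine.
total_gpus = DGX_SYSTEMS * GPUS_PER_DGX

# Average FP8 throughput per GPU (1 exaflop = 1,000 petaflops).
per_gpu_pflops = TOTAL_FP8_EXAFLOPS * 1000 / total_gpus

print(f"Total H100 GPUs: {total_gpus:,}")
print(f"Implied FP8 performance per GPU: {per_gpu_pflops:.2f} petaflops")
```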

22 GB Modded GeForce RTX 2080 Ti Cards Listed on eBay - $499 per unit

An eBay store, customgpu_official, is selling memory-modified GeForce RTX 2080 Ti graphics cards. The outfit (located in Palo Alto, California) has a large inventory of MSI GeForce RTX 2080 Ti AERO cards, judging from their listing's photo gallery. Workers in China are reportedly upgrading these (possibly refurbished) units with extra lashings of GDDR6 VRAM, going from the original 11 GB specification up to 22 GB. We have observed smaller scale GeForce RTX 2080 Ti modification projects and a very ambitious user-modified example in the past, but customgpu's latest endeavor targets a growth industry. The item description states: "Why do you need a 22 GB 2080 Ti? Large VRAM is essential to cool AIGC apps such as stable diffusion fine tuning, LLAMA, LLM." At the time of writing, three cards are available to purchase, and interested customers have already acquired four memory-modded units.

They advertise their upgraded "Turbo Edition" card as a great "budget alternative" to more modern GeForce RTX 3090 and 4090 models; "more information and videos" can be accessed via 2080ti22g.com. The MSI GeForce RTX 2080 Ti AERO 11 GB model is not documented within TPU's GPU database, but its dual-slot custom cooling solution is also sported by the MSI RTX 2080 SUPER AERO 8 GB graphics card. The AERO's blower fan system creates a "mini-wind tunnel, pulling fresh air from inside the case and blowing it out the IO panel, and out of the system." The seller's asking price is $499 per unit, perhaps a little bit steep for used cards (potentially involved in mining activities), but customgpu_official seems to be well versed in repairs. Other eBay listings show non-upgraded MSI GeForce RTX 2080 Ti AERO cards selling in the region of $300 to $400. Custom GPU Upgrade and Repair's hype video proposes that their modified card offers great value, given that it sells for a third of the cost of a GeForce RTX 3090, but their eBay item description contradicts this claim: "only half price compared with GeForce RTX 3090 with almost the same GPU memory."

MSI Afterburner 4.6.6 Beta Ends Windows XP Support

The MSI Afterburner 4.6.6 Beta update was released three days ago—available to download through Guru3D's distribution section—its patch notes tease the exciting addition of "some future NVIDIA GPU PCI DeviceIDs to (our) hardware database." The forward facing nature of this software upgrade brings some unfortunate news for Windows XP operating system users—Beta version 4.6.6's top bullet point provides some reasoning: "Ported to VC++ 2022 compiler. Please take a note that due to this change MSI Afterburner will no longer be able to start under Windows XP. Please stay on the previous versions of the product if you need this OS support." Unwinder's software engineering team has traditionally stuck with the 2008 Visual C++ compiler, hence Afterburner's long history of supporting Windows XP.

The adoption of a more modern compiler has signaled the end for MSI's overclocking and hardware monitoring program on a legacy operating system. Developers largely moved on from XP-supporting endeavors around the mid-2010s—as pointed out by Tom's Hardware: "To get an idea on how late Afterburner is on dropping Windows XP, the last time we reported on any app ending support for the OS was in 2019, when Steam ended support on New Year's Day." Returning to the modern day—4.6.6 Beta's best-of-list mentions that RivaTuner Statistics Server is host to "more than 90 compatibility enhancements and changes"—v7.3.5 rolls out with NVIDIA Reflex and PresentMon integration, as well as programmable conditional layers support. The other headlining feature addition within Afterburner's latest pre-release guise is voltage control for AMD Radeon RX 7800 XT GPUs.

AMD ROCm 6.0 Adds Support for Radeon PRO W7800 & RX 7900 GRE GPUs

Building on our previously announced support of the AMD Radeon RX 7900 XT, XTX and Radeon PRO W7900 GPUs with AMD ROCm 5.7 and PyTorch, we are now expanding our client-based ML Development offering, both from the hardware and software side with AMD ROCm 6.0. Firstly, AI researchers and ML engineers can now also develop on Radeon PRO W7800 and on Radeon RX 7900 GRE GPUs. With support for such a broad product portfolio, AMD is helping the AI community to get access to desktop graphics cards at even more price points and at different performance levels.

Furthermore, we are complementing our solution stack with support for ONNX Runtime. ONNX, short for Open Neural Network Exchange, is an intermediary Machine Learning framework used to convert AI models between different ML frameworks. As a result, users can now perform inference on a wider range of source data on local AMD hardware. This also adds INT8 via MIGraphX—AMD's own graph inference engine—to the available data types (including FP32 and FP16). With AMD ROCm 6.0, we are continuing our support for the PyTorch framework bringing mixed precision with FP32/FP16 to Machine Learning training workflows.
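The choice of data type matters on desktop cards largely because it determines how much memory a model's weights occupy. As a rough illustration, the sketch below estimates weight storage for FP32, FP16 and INT8; the 7-billion-parameter model size is a hypothetical example, and the estimate covers weights only, excluding activations, caches and framework overhead.

```python
# Storage size of one value for each data type mentioned above.
BYTES_PER_VALUE = {"FP32": 4, "FP16": 2, "INT8": 1}

PARAMS = 7_000_000_000  # hypothetical model size, not from the article


def weights_gib(num_params: int, dtype: str) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return num_params * BYTES_PER_VALUE[dtype] / 2**30


for dtype in BYTES_PER_VALUE:
    print(f"{dtype}: {weights_gib(PARAMS, dtype):.2f} GiB")
```

Halving the width of each value halves the weight footprint, which is why lower-precision types make it easier to fit a given model into a card's VRAM.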

NVIDIA Introduces NVIDIA RTX 2000 Ada Generation GPU

Generative AI is driving change across industries—and to take advantage of its benefits, businesses must select the right hardware to power their workflows. The new NVIDIA RTX 2000 Ada Generation GPU delivers the latest AI, graphics and compute technology to compact workstations, offering up to 1.5x the performance of the previous-generation RTX A2000 12 GB in professional workflows. From crafting stunning 3D environments to streamlining complex design reviews to refining industrial designs, the card's capabilities pave the way for an AI-accelerated future, empowering professionals to achieve more without compromising on performance or capabilities. Modern multi-application workflows, such as AI-powered tools, multi-display setups and high-resolution content, put significant demands on GPU memory. With 16 GB of memory in the RTX 2000 Ada, professionals can tap the latest technologies and tools to work faster and better with their data.

Powered by NVIDIA RTX technology, the new GPU delivers impressive realism in graphics with NVIDIA DLSS, delivering ultra-high-quality, photorealistic ray-traced images more than 3x faster than before. In addition, the RTX 2000 Ada enables an immersive experience for enterprise virtual-reality workflows, such as for product design and engineering design reviews. With its blend of performance, versatility and AI capabilities, the RTX 2000 Ada helps professionals across industries achieve efficiencies. Architects and urban planners can use it to accelerate visualization workflows and structural analysis, enhancing design precision. Product designers and engineers using industrial PCs can iterate rapidly on product designs with fast, photorealistic rendering and AI-powered generative design. Content creators can edit high-resolution videos and images seamlessly, and use AI for realistic visual effects and content creation assistance. And in vital embedded applications and edge computing, the RTX 2000 Ada can power real-time data processing for medical devices, optimize manufacturing processes with predictive maintenance and enable AI-driven intelligence in retail environments.

AMD Develops ROCm-based Solution to Run Unmodified NVIDIA's CUDA Binaries on AMD Graphics

AMD has quietly funded an effort over the past two years to enable binary compatibility for NVIDIA CUDA applications on its ROCm stack. This allows CUDA software to run on AMD Radeon GPUs without adapting the source code. The project responsible is ZLUDA, which was initially developed to provide CUDA support on Intel graphics. The developer behind ZLUDA, Andrzej Janik, was contracted by AMD in 2022 to adapt his project for use on Radeon GPUs with HIP/ROCm. He spent two years bringing functional CUDA support to AMD's platform, allowing many real-world CUDA workloads to run without modification. AMD decided not to productize this effort for unknown reasons but did open-source it once funding ended per their agreement. Phoronix has since put AMD's ZLUDA implementation through a wide variety of benchmarks.

Benchmarks found that proprietary CUDA renderers and software worked on Radeon GPUs out-of-the-box with the drop-in ZLUDA library replacements. CUDA-optimized Blender 4.0 rendering now runs faster on AMD Radeon GPUs than the native ROCm/HIP port, reducing render times by around 10-20%, depending on the scene. The implementation is surprisingly robust, considering it was a single-developer project. However, there are some limitations: OptiX and PTX assembly code are not yet fully supported. Overall, though, testing showed very promising results. In Geekbench, CUDA-optimized binaries produce up to 75% better results than the generic OpenCL runtime. With the ZLUDA libraries handling API translation, unmodified CUDA binaries can now run directly on top of ROCm and Radeon GPUs. Strangely, the ZLUDA port targets AMD ROCm 5.7, not the newest 6.x versions. Only time will tell if AMD continues investing in this approach to simplify porting of CUDA software. However, the open-sourced project now enables anyone to contribute and help improve compatibility. For a complete review, check out the Phoronix tests.

Widespread GeForce RTX 4080 SUPER Card Shortage Reported in North America

NVIDIA's decision to shave $200 off its GeForce RTX 4080 GPU tier has caused a run on retail since the launch of SUPER variants late last month. VideoCardz has investigated an apparent North American supply shortage. The adjusted $999 base MSRP appears to be an irresistible prospect for discerning US buyers, and today's report explains how: "a week after its release, that GeForce RTX 4080 SUPER cards are not available at any major US retailer for online orders." At the time of writing, no $999 models are available to purchase via e-tailers (for delivery). BestBuy and Micro Center have a smattering of baseline MSRP cards (including the Founders Edition), but for in-store pickup only. Across the pond, AD103 SUPER's supply status is a bit different: "On the other hand, in Europe, the situation appears to be more favorable, with several retailers listing the cards at or near the MSRP of €1109."

The cheapest custom GeForce RTX 4080 SUPER SKU, at $1123, seems to be listed by Amazon.com. Almost all of Newegg's product pages are displaying an "Out of Stock" notice. ZOTAC GAMING's GeForce RTX 4080 SUPER Trinity OC White Edition model is on "back order" for $1049.99, while the only "in stock" option is MSI's GeForce RTX 4080 Super Expert card (at $1149.99). VideoCardz notes that GeForce RTX 4070 SUPER and RTX 4070 Ti SUPER models are in plentiful supply, which highlights a big contrast in market conditions for NVIDIA's latest Ada Lovelace families. The report also mentions an ongoing shortage of GeForce RTX 4080 (Non-SUPER) cards, going back weeks prior to the official January 31 rollout: "Similar to the RTX 4090, finding the RTX 4080 at its $1200 price point has proven challenging." Exact sales figures are not available to media outlets (it is unusual to see official metrics presented a week or two after a product's launch), so we will have to wait a little longer to find out whether demand has far outstripped supply in the USA.

IDC Forecasts Artificial Intelligence PCs to Account for Nearly 60% of All PC Shipments by 2027

A new forecast from International Data Corporation (IDC) shows shipments of artificial intelligence (AI) PCs - personal computers with specific system-on-a-chip (SoC) capabilities designed to run generative AI tasks locally - growing from nearly 50 million units in 2024 to more than 167 million in 2027. By the end of the forecast, IDC expects AI PCs will represent nearly 60% of all PC shipments worldwide.

"As we enter a new year, the hype around generative AI has reached a fever pitch, and the PC industry is running fast to capitalize on the expected benefits of bringing AI capabilities down from the cloud to the client," said Tom Mainelli, group vice president, Devices and Consumer Research. "Promises around enhanced user productivity via faster performance, plus lower inferencing costs, and the benefit of on-device privacy and security, have driven strong IT decision-maker interest in AI PCs. In 2024, we'll see AI PC shipments begin to ramp, and over the next few years, we expect the technology to move from niche to a majority."
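The headline figures in the IDC forecast imply a growth rate and a total market size that the summary does not state outright. The short Python sketch below works them out from the rounded numbers quoted above. Because IDC gives those numbers as "nearly" and "more than", the results are rough estimates, not IDC's own figures. The function and variable names are illustrative only.

```python
# Rounded figures quoted in the IDC forecast; all three are approximations
# ("nearly", "more than"), so the results below are rough estimates only.
AI_PC_UNITS_2024_M = 50.0    # "nearly 50 million" units in 2024
AI_PC_UNITS_2027_M = 167.0   # "more than 167 million" units in 2027
AI_PC_SHARE_2027 = 0.60      # "nearly 60%" of all PC shipments in 2027


def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns start into end."""
    return (end / start) ** (1.0 / years) - 1.0


def implied_total_market(segment_units: float, segment_share: float) -> float:
    """Return the total market size implied by a segment's units and share."""
    return segment_units / segment_share


cagr = compound_annual_growth_rate(
    AI_PC_UNITS_2024_M, AI_PC_UNITS_2027_M, 2027 - 2024
)
total_2027_m = implied_total_market(AI_PC_UNITS_2027_M, AI_PC_SHARE_2027)

print(f"Implied AI PC growth rate: {cagr:.1%} per year")
print(f"Implied total PC shipments in 2027: {total_2027_m:.0f} million units")
```

On these rounded inputs, AI PC shipments would need to grow by roughly 50% a year, and the 2027 PC market as a whole would come to somewhere around 280 million units.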

CPSC Demands a Recall of CableMod GPU Angled Adapters, Estimates $74.5K of Damaged Property

CableMod issued a statement just before the last Christmas holiday detailing a safety recall of 16-pin 12VHPWR angled adapters, versions 1.0 and 1.1. This announcement received widespread media coverage (at least in tech circles), but some unfortunate customers have not yet received the memo about faulty adapters: CableMod's 90° angled and 180° hard connectors can overheat and, in worst-case scenarios, actually melt. HotHardware, amusingly named given this context, was the first hardware news outlet to notice that the Consumer Product Safety Commission (CPSC) had published a "GPU Angled Adapter" recall notice to its website earlier today, under "Recall number 24-112."

The US government body's listing outlines the aforementioned hazardous conditions, along with an estimated count of 25,300 affected units. The CPSC's recommended "Remedy" advice is as follows: "Consumers should immediately stop using the recalled angled adapters and contact CableMod for instructions on how to safely remove their adapter from the GPU and for a full refund, including cost of shipping, or a $60 store credit for non-customized products, with free standard shipping. Consumers will be asked to destroy the adapter and upload a photo of the destroyed product to cablemod.com/adapterrecall/. The instructions on how to safely remove the adapter are also located on that site. Once destroyed, consumers should discard the adapter in accordance with local laws." The Safety Commission has also gathered customer reports on this matter: "The firm (CableMod Ltd., of China) has received 272 reports of the adapters becoming loose, overheating and melting into the GPU, with at least $74,500 in property damage claims in the United States. No injuries have been reported."