News Posts matching #GB200

NVIDIA Blackwell Platform Boosts Water Efficiency by Over 300x - "Chill Factor" for AI Infrastructure

Traditionally, data centers have relied on air cooling—where mechanical chillers circulate chilled air to absorb heat from servers, helping them maintain optimal conditions. But as AI models increase in size, and the use of AI reasoning models rises, maintaining those optimal conditions is not only getting harder and more expensive—but more energy-intensive. While data centers once operated at 20 kW per rack, today's hyperscale facilities can support over 135 kW per rack, making it an order of magnitude harder to dissipate the heat generated by high-density racks. To keep AI servers running at peak performance, a new approach is needed for efficiency and scalability.

One key solution is liquid cooling—by reducing dependence on chillers and enabling more efficient heat rejection, liquid cooling is driving the next generation of high-performance, energy-efficient AI infrastructure. The NVIDIA GB200 NVL72 and the NVIDIA GB300 NVL72 are rack-scale, liquid-cooled systems designed to handle the demanding tasks of trillion-parameter large language model inference. Their architecture is also specifically optimized for test-time scaling accuracy and performance, making it an ideal choice for running AI reasoning models while efficiently managing energy costs and heat.

Huawei CloudMatrix 384 System Outperforms NVIDIA GB200 NVL72

Huawei announced its CloudMatrix 384 system super node, which the company touts as its own domestic alternative to NVIDIA's GB200 NVL72 system, with more overall system performance but worse per-chip performance and higher power consumption. While NVIDIA's GB200 NVL72 uses 36 Grace CPUs paired with 72 "Blackwell" GPUs, the Huawei CloudMatrix 384 system employs 384 Huawei Ascend 910C accelerators to beat NVIDIA's GB200 NVL72 system. It takes roughly five times as many Ascend 910C accelerators to deliver nearly twice the GB200 NVL72 system's performance, which is not good on a per-accelerator basis, but excellent at the per-system level of deployment. SemiAnalysis argues that Huawei is a generation behind in chip performance but ahead of NVIDIA in scale-up system design and deployment.

When you look at individual chips, the Blackwell GPU in NVIDIA's GB200 NVL72 clearly outshines Huawei's Ascend 910C, delivering over three times the BF16 performance (2,500 TeraFLOPS vs. 780 TeraFLOPS), more memory per chip (192 GB vs. 128 GB), and higher memory bandwidth (8 TB/s vs. 3.2 TB/s). In other words, NVIDIA has the raw power and efficiency advantage at the chip level. But flip the switch to the system level, and Huawei's CloudMatrix CM384 takes the lead. It cranks out 1.7x the overall PetaFLOPS, packs in 3.6x the total HBM capacity, and supports over five times as many accelerators, along with the associated bandwidth, as NVIDIA's NVL72 cluster. However, that scalability does come with a trade-off, as Huawei's setup draws nearly four times as much total power. A single GB200 NVL72 draws 145 kW of power, while a single Huawei CloudMatrix 384 draws ~560 kW. So, NVIDIA is your go-to if you need peak efficiency in a single GPU. If you're building a massive AI supercluster where total throughput and interconnect speed matter most, Huawei's solution actually makes a lot of sense. Thanks to its all-to-all topology, Huawei has delivered an AI training and inference system worth purchasing. When SMIC, the maker of Huawei's chips, gets to a more advanced manufacturing node, the efficiency of these systems should also improve.
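The per-chip and per-system comparison above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in this article; the `RackSystem` class and `ratios` helper are illustrative names, not part of any vendor tool, and the aggregate HBM bandwidth and performance-per-kilowatt ratios are derived here from the quoted per-chip and per-system numbers rather than stated by the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackSystem:
    """Figures as quoted in the article (hypothetical container type)."""
    name: str
    chips: int
    bf16_tflops_per_chip: float
    hbm_gb_per_chip: float
    hbm_tbps_per_chip: float
    power_kw: float

    @property
    def bf16_pflops(self) -> float:
        # Aggregate BF16 throughput, converted from TFLOPS to PFLOPS.
        return self.chips * self.bf16_tflops_per_chip / 1000

    @property
    def hbm_gb(self) -> float:
        return self.chips * self.hbm_gb_per_chip

    @property
    def hbm_tbps(self) -> float:
        return self.chips * self.hbm_tbps_per_chip


def ratios(a: RackSystem, b: RackSystem) -> dict[str, float]:
    """Return a-over-b ratios for the headline system metrics."""
    return {
        "chips": a.chips / b.chips,
        "bf16_pflops": a.bf16_pflops / b.bf16_pflops,
        "hbm_capacity": a.hbm_gb / b.hbm_gb,
        "hbm_bandwidth": a.hbm_tbps / b.hbm_tbps,
        "power": a.power_kw / b.power_kw,
        "pflops_per_kw": (a.bf16_pflops / a.power_kw) / (b.bf16_pflops / b.power_kw),
    }


nvl72 = RackSystem("GB200 NVL72", 72, 2500, 192, 8.0, 145)
cm384 = RackSystem("CloudMatrix 384", 384, 780, 128, 3.2, 560)

for metric, value in ratios(cm384, nvl72).items():
    print(f"{metric:>14}: {value:.2f}x")
```

Computed this way, the chip count, PetaFLOPS, HBM capacity and power ratios land close to the rounded figures in the text, while performance per kilowatt favors the NVL72, which is the efficiency trade-off the article describes.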

Thousands of NVIDIA Grace Blackwell GPUs Now Live at CoreWeave

CoreWeave today became one of the first cloud providers to bring NVIDIA GB200 NVL72 systems online for customers at scale, and AI frontier companies Cohere, IBM and Mistral AI are already using them to train and deploy next-generation AI models and applications. CoreWeave, the first cloud provider to make NVIDIA Grace Blackwell generally available, has already shown incredible results in MLPerf benchmarks with NVIDIA GB200 NVL72 - a powerful rack-scale accelerated computing platform designed for reasoning and AI agents. Now, CoreWeave customers are gaining access to thousands of NVIDIA Blackwell GPUs.

"We work closely with NVIDIA to quickly deliver to customers the latest and most powerful solutions for training AI models and serving inference," said Mike Intrator, CEO of CoreWeave. "With new Grace Blackwell rack-scale systems in hand, many of our customers will be the first to see the benefits and performance of AI innovators operating at scale."

GPUs Could be Exempt from Massive Trump Tariffs Through USMCA Assembly Loophole

High-performance GPUs manufactured in Taiwan could now enter the US market tariff-free through a technical loophole in the United States-Mexico-Canada Agreement (USMCA), identified by research firm SemiAnalysis. Companies can route Taiwan-made GPUs through assembly facilities in Mexico and Canada, effectively circumventing the 32% import duty that would otherwise apply to direct shipments from Taiwan. The exemption hinges on a Most-Favored-Nation clause within the USMCA framework that specifically classifies digital processing units (HTS 8471.50), automatic data processing machine units (HTS 8471.80), and their associated components (HTS 8473.30) as "originating goods." This classification applies regardless of manufacturing origin, creating a duty-free pathway for NVIDIA HGX boards, GB200 baseboards, and RTX GPU cards that undergo final assembly in North American facilities.

The strategy capitalizes on two complementary policy mechanisms. First, President Trump's March 7 executive orders maintained existing USMCA exemptions, preserving the duty-free status for compliant goods from Canada and Mexico. Second, the USMCA's expanded definition of originating products creates a classification framework that treats assembled servers and related components as North American products despite their core manufacturing in Taiwan. For US technology firms, the additional logistical complexity of cross-border assembly operations is offset by eliminating substantial import duties on these high-value components. This practice mirrors established protocols in agricultural imports, where products like Mexican avocados gain preferential treatment under similar origin rules. The global supply chain is adapting quickly, especially in high-margin areas like GPUs, which power AI workloads. It remains to be seen how companies will set up manufacturing and logistics in this new tariff-driven era.

MLCommons Releases New MLPerf Inference v5.0 Benchmark Results

Today, MLCommons announced new results for its industry-standard MLPerf Inference v5.0 benchmark suite, which delivers machine learning (ML) system performance benchmarking in an architecture-neutral, representative, and reproducible manner. The results highlight that the AI community is focusing much of its attention and efforts on generative AI scenarios, and that the combination of recent hardware and software advances optimized for generative AI has led to dramatic performance improvements over the past year.

The MLPerf Inference benchmark suite, which encompasses both datacenter and edge systems, is designed to measure how quickly systems can run AI and ML models across a variety of workloads. The open-source and peer-reviewed benchmark suite creates a level playing field for competition that drives innovation, performance, and energy efficiency for the entire industry. It also provides critical technical information for customers who are procuring and tuning AI systems. This round of MLPerf Inference results also includes tests for four new benchmarks: Llama 3.1 405B, Llama 2 70B Interactive for low-latency applications, RGAT, and Automotive PointPainting for 3D object detection.

NVIDIA Blackwell Takes Pole Position in Latest MLPerf Inference Results

In the latest MLPerf Inference v5.0 benchmarks, which reflect some of the most challenging inference scenarios, the NVIDIA Blackwell platform set records - and marked NVIDIA's first MLPerf submission using the NVIDIA GB200 NVL72 system, a rack-scale solution designed for AI reasoning. Delivering on the promise of cutting-edge AI takes a new kind of compute infrastructure, called AI factories. Unlike traditional data centers, AI factories do more than store and process data - they manufacture intelligence at scale by transforming raw data into real-time insights. The goal for AI factories is simple: deliver accurate answers to queries quickly, at the lowest cost and to as many users as possible.

The complexity of pulling this off is significant and takes place behind the scenes. As AI models grow to billions and trillions of parameters to deliver smarter replies, the compute required to generate each token increases. This requirement reduces the number of tokens that an AI factory can generate and increases cost per token. Keeping inference throughput high and cost per token low requires rapid innovation across every layer of the technology stack, spanning silicon, network systems and software.

NVIDIA to Build Accelerated Quantum Computing Research Center

NVIDIA today announced it is building a Boston-based research center to provide cutting-edge technologies to advance quantum computing. The NVIDIA Accelerated Quantum Research Center, or NVAQC, will integrate leading quantum hardware with AI supercomputers, enabling what is known as accelerated quantum supercomputing. The NVAQC will help solve quantum computing's most challenging problems, ranging from qubit noise to transforming experimental quantum processors into practical devices.

Leading quantum computing innovators, including Quantinuum, Quantum Machines and QuEra Computing, will tap into the NVAQC to drive advancements through collaborations with researchers from leading universities, such as the Harvard Quantum Initiative in Science and Engineering (HQI) and the Engineering Quantum Systems (EQuS) group at the Massachusetts Institute of Technology (MIT).

Micron Innovates From the Data Center to the Edge With NVIDIA

Secular growth of AI is built on the foundation of high-performance, high-bandwidth memory solutions. These high-performing memory solutions are critical to unlock the capabilities of GPUs and processors. Micron Technology, Inc., today announced it is the world's first and only memory company shipping both HBM3E and SOCAMM (small outline compression attached memory module) products for AI servers in the data center. This extends Micron's industry leadership in designing and delivering low-power DDR (LPDDR) for data center applications.

Micron's SOCAMM, a modular LPDDR5X memory solution, was developed in collaboration with NVIDIA to support the NVIDIA GB300 Grace Blackwell Ultra Superchip. The Micron HBM3E 12H 36 GB is also designed into the NVIDIA HGX B300 NVL16 and GB300 NVL72 platforms, while the HBM3E 8H 24 GB is available for the NVIDIA HGX B200 and GB200 NVL72 platforms. The deployment of Micron HBM3E products in NVIDIA Hopper and NVIDIA Blackwell systems underscores Micron's critical role in accelerating AI workloads.

NVIDIA Accelerates Science and Engineering With CUDA-X Libraries Powered by GH200 and GB200 Superchips

Scientists and engineers of all kinds are equipped to solve tough problems a lot faster with NVIDIA CUDA-X libraries powered by NVIDIA GB200 and GH200 superchips. As announced today at the NVIDIA GTC global AI conference, developers can now take advantage of tighter automatic integration and coordination between CPU and GPU resources - enabled by CUDA-X working with these latest superchip architectures - resulting in up to 11x speedups for computational engineering tools and 5x larger calculations compared with using traditional accelerated computing architectures.

This greatly accelerates and improves workflows in engineering simulation, design optimization and more, helping scientists and researchers reach groundbreaking results faster. NVIDIA released CUDA in 2006, opening up a world of applications to the power of accelerated computing. Since then, NVIDIA has built more than 900 domain-specific NVIDIA CUDA-X libraries and AI models, making it easier to adopt accelerated computing and driving incredible scientific breakthroughs. Now, CUDA-X brings accelerated computing to a broad new set of engineering disciplines, including astronomy, particle physics, quantum physics, automotive, aerospace and semiconductor design.

Global Top 10 IC Design Houses See 49% YoY Growth in 2024, NVIDIA Commands Half the Market

TrendForce reveals that the combined revenue of the world's top 10 IC design houses reached approximately US$249.8 billion in 2024, marking a 49% YoY increase. The booming AI industry has fueled growth across the semiconductor sector, with NVIDIA leading the charge, posting an astonishing 125% revenue growth, widening its lead over competitors, and solidifying its dominance in the IC industry.

Looking ahead to 2025, advancements in semiconductor manufacturing will further enhance AI computing power, with LLMs continuing to emerge. Open-source models like DeepSeek could lower AI adoption costs, accelerating AI penetration from servers to personal devices. This shift positions edge AI devices as the next major growth driver for the semiconductor industry.

Meta Reportedly Reaches Test Phase with First In-house AI Training Chip

According to a Reuters technology report, Meta's engineering department is testing its "first in-house chip for training artificial intelligence systems." Two inside sources disclosed this significant development milestone, which involves a small-scale deployment of early samples. The owner of Facebook could ramp up production if the initial batches pass muster. Despite recently showcasing an open-architecture NVIDIA "Blackwell" GB200 system for enterprise, Meta leadership is reported to be pursuing proprietary solutions. Multiple big players in the field of artificial intelligence are attempting to break away from a total reliance on Team Green. Last month, press outlets concentrated on OpenAI's alleged finalization of an in-house design, with rumored involvement coming from Broadcom and TSMC.

One of the Reuters industry moles believes that Meta has signed up with TSMC; supposedly, the Taiwanese foundry was responsible for the production of test batches. Tom's Hardware reckons that Meta and Broadcom worked together on the tape-out of the social media giant's "first AI training accelerator." Development of the company's "Meta Training and Inference Accelerator" (MTIA) series stretches back a couple of years. According to Reuters, this multi-part project "had a wobbly start for years, and at one point scrapped a chip at a similar phase of development...Meta last year, started using an MTIA chip to perform inference, or the process involved in running an AI system as users interact with it, for the recommendation systems that determine which content shows up on Facebook and Instagram news feeds." Leadership is reportedly aiming to get custom silicon solutions up and running for AI training by next year. Past examples of MTIA hardware were deployed with open-source RISC-V cores (for inference tasks), but it is not clear whether this architecture will form the basis of Meta's latest AI chip design.

Insiders Predict Introduction of NVIDIA "Blackwell Ultra" GB300 AI Series at GTC, with Fully Liquid-cooled Clusters

Supply chain insiders believe that NVIDIA's "Blackwell Ultra" GB300 AI chip design will get a formal introduction at next week's GTC 2025 conference. Jensen Huang's keynote presentation is scheduled for Tuesday, March 18. Team Green's chief has already revealed a couple of Blackwell B300 series details to investors; a recent earnings call touched upon a launch window in the second half of 2025. Industry moles have put a spotlight on the GB300 GPU's allegedly energy-hungry nature. According to inside tracks, power consumption has "significantly" increased compared with a slightly older equivalent, NVIDIA's less refined "Blackwell" GB200 design.

A Taiwan Economic Daily news article predicts an upcoming "second cooling revolution," due to reports of "Blackwell Ultra" parts demanding greater heat dissipation solutions. Supply chain leakers have suggested effective countermeasures in the form of fully liquid-cooled systems: "not only will more water cooling plates be introduced, but the use of water cooling quick connectors will increase four times compared to GB200." The pre-Christmas 2024 news cycle proposed a 1400 W TDP rating. The "Taiwanese cooling giants" involved are expected to pull in tidy sums of money from the supply of optimal heat-dissipating gear, with local "water-cooling quick-connector" manufacturers also tipped to benefit greatly. The UDN report pulled quotes from a variety of regional cooling specialists; the consensus is that the partners involved are struggling to keep up with demand across GB200 and GB300 product lines.

Finally, Some Good News: GeForce RTX 5090 Supply to Increase in Coming Months

It would be safe to state that the NVIDIA GeForce RTX 5090 and RTX 5080 launch was anything but ideal. Gamers had to deal with wacky NVIDIA marketing material with absurd performance claims, followed by disappointing generational improvement for the RTX 5080, only to be left dealing with abysmal supply leading to obscene shortages and scalper-induced price inflation. However, it does seem like things are about to take a positive turn - NVIDIA is rumored to have ramped up production for its GB202 GPU, which the RTX 5090 is based on, according to a reliable source.

Spotted by VideoCardz, MEGAsizeGPU has claimed that the supply for the GeForce RTX 5090 GPU will soon be "stupidly high", which is absolute music to our ears. In a reply thread, the source further claimed that at least one AIB partner already has "tons of cards", which sure does paint a promising picture for the future. As such, the source expects that the supply will reach customers in about a month, which is to be expected since production has been cranked up only recently. Apparently, demand for the GB200 GPU has been lower than usual, forcing NVIDIA to switch to producing GeForce GPUs instead. Of course, the margins for the gaming GPUs are lower, but the production capacity has to go somewhere.

Supermicro Ramps Full Production of NVIDIA Blackwell Rack-Scale Solutions With NVIDIA HGX B200

Supermicro, Inc., a Total IT Solution Provider for AI/ML, HPC, Cloud, Storage, and 5G/Edge, is announcing full production availability of its end-to-end AI data center Building Block Solutions accelerated by the NVIDIA Blackwell platform. The Supermicro Building Block portfolio provides the core infrastructure elements necessary to scale Blackwell solutions with exceptional time to deployment. The portfolio includes a broad range of air-cooled and liquid-cooled systems with multiple CPU options. These include superior thermal design supporting traditional air cooling, liquid-to-liquid (L2L) and liquid-to-air (L2A) cooling. In addition, a full data center management software suite, rack-level integration (including full network switching and cabling), and cluster-level L12 solution validation can be delivered as a turn-key offering with global delivery, professional support, and service.

"In this transformative moment of AI, where scaling laws are pushing the limits of data center capabilities, our latest NVIDIA Blackwell-powered solutions, developed through close collaboration with NVIDIA, deliver outstanding computational power," said Charles Liang, president and CEO of Supermicro. "Supermicro's NVIDIA Blackwell GPU offerings in plug-and-play scalable units with advanced liquid cooling and air cooling are empowering customers to deploy an infrastructure that supports increasingly complex AI workloads while maintaining exceptional efficiency. This reinforces our commitment to providing sustainable, cutting-edge solutions that accelerate AI innovation."

ASUS AI POD With NVIDIA GB200 NVL72 Platform Ready to Ramp-Up Production for Scheduled Shipment in March

ASUS is proud to announce that ASUS AI POD, featuring the NVIDIA GB200 NVL72 platform, is ready to ramp up production for a scheduled shipping date of March 2025. ASUS remains dedicated to providing comprehensive end-to-end solutions and software services, encompassing everything from AI supercomputing to cloud services. With a strong focus on fostering AI adoption across industries, ASUS is positioned to empower clients in accelerating their time to market by offering a full spectrum of solutions.

Proof of concept, funded by ASUS
Honoring the commitment to delivering exceptional value to clients, ASUS is set to launch a proof of concept (POC) for the groundbreaking ASUS AI POD, powered by the NVIDIA Blackwell platform. This exclusive opportunity is now open to a select group of innovators who are eager to harness the full potential of AI computing. Innovators and enterprises can experience firsthand the full potential of AI and deep learning solutions at exceptional scale. To take advantage of this limited-time offer, please complete this survey at: forms.office.com/r/FrAbm5BfH2. The expert ASUS team of NVIDIA GB200 specialists will guide users through the next steps.

NVIDIA Revises "Blackwell" Architecture Production Roadmap for More Complex Packaging

According to a well-known industry analyst, Ming-Chi Kuo, NVIDIA has restructured its "Blackwell" architecture roadmap, emphasizing dual-die designs using CoWoS-L packaging technology. The new roadmap eliminates several single-die products that would have used CoWoS-S packaging, changing NVIDIA's manufacturing strategy. The 200 Series will exclusively use dual-die designs with CoWoS-L packaging, featuring the GB200 NVL72 and HGX B200 systems. Notably absent is the previously expected B200A single-die variant. The 300 Series will include both dual-die and single-die options, though NVIDIA and cloud providers are prioritizing the GB200 NVL72 dual-die system. Starting Q1 2025, NVIDIA will reduce H series production, which uses CoWoS-S packaging, while ramping up 200 Series production. This transition indicates significantly decreased demand for CoWoS-S capacity through 2025.

While B300 systems using single-die CoWoS-S are planned for 2026 mass production, the current focus remains on dual-die CoWoS-L products. From TSMC's perspective, the transition between Blackwell generations requires minimal process adjustments, as both use similar front-end-of-line processes with only back-end-of-line modifications needed. Supply chain partners heavily dependent on CoWoS-S production face significant impact, reflected in recent stock price corrections. However, NVIDIA maintains this change reflects product strategy evolution rather than market demand weakness. TSMC continues expanding CoWoS-R capacity while slowing CoWoS-S expansion, viewing AI and high-performance computing as sustained growth drivers despite these packaging technology transitions.

NVIDIA's GB200 "Blackwell" Racks Face Overheating Issues

NVIDIA's new GB200 "Blackwell" racks are running into trouble (again). Big cloud companies like Microsoft, Amazon, Google, and Meta Platforms are cutting back their orders because of heat problems, Reuters reports, quoting The Information. The first shipments of racks with Blackwell chips are getting too hot and have connection issues between chips, the report says. These tech hiccups have made some customers who ordered $10 billion or more worth of racks think twice about buying.

Some are putting off their orders until NVIDIA has better versions of the racks. Others are looking at buying older NVIDIA AI chips instead. For example, Microsoft planned to set up GB200 racks with no fewer than 50,000 Blackwell chips at one of its Phoenix sites. However, The Information reports that OpenAI has asked Microsoft to provide NVIDIA's older "Hopper" chips instead, pointing to delays linked to the Blackwell racks. NVIDIA's problems with its Blackwell GPUs housed in high-density racks are nothing new; in November 2024, Reuters, also referencing The Information, uncovered overheating issues in servers that housed 72 processors. NVIDIA has made several changes to its server rack designs to tackle these problems; however, it seems that the problem has not been entirely solved.

Gigabyte Demonstrates Omni-AI Capabilities at CES 2025

GIGABYTE Technology, internationally renowned for its R&D capabilities and a leading innovator in server and data center solutions, continues to lead technological innovation during this critical period of AI and computing advancement. With its comprehensive AI product portfolio, GIGABYTE will showcase its complete range of AI computing solutions at CES 2025, from data center infrastructure to IoT applications and personal computing, demonstrating how its extensive product line enables digital transformation across all sectors in this AI-driven era.

Powering AI from the Cloud
With AI Large Language Models (LLMs) now routinely featuring parameters in the hundreds of billions to trillions, robust training environments (data centers) have become a critical requirement in the AI race. GIGABYTE offers three distinctive solutions for AI infrastructure.

Gigabyte Expands Its Accelerated Computing Portfolio with New Servers Using the NVIDIA HGX B200 Platform

Giga Computing, a subsidiary of GIGABYTE and an industry leader in generative AI servers and advanced cooling technologies, announced new GIGABYTE G893 series servers using the NVIDIA HGX B200 platform. The launch of these flagship 8U air-cooled servers, the G893-SD1-AAX5 and G893-ZD1-AAX5, signifies a new architecture and platform change for GIGABYTE in the demanding world of high-performance computing and AI, setting new standards for speed, scalability, and versatility.

These servers join GIGABYTE's accelerated computing portfolio alongside the NVIDIA GB200 NVL72 platform, which is a rack-scale design that connects 36 NVIDIA Grace CPUs and 72 NVIDIA Blackwell GPUs. At CES 2025 (January 7-10), the GIGABYTE booth will display the NVIDIA GB200 NVL72, and attendees can engage in discussions about the benefits of GIGABYTE platforms with the NVIDIA Blackwell architecture.

NVIDIA GB300 "Blackwell Ultra" Will Feature 288 GB HBM3E Memory, 1400 W TDP

NVIDIA's "Blackwell" series is barely out, with B100, B200, and GB200 chips shipping to OEMs and hyperscalers, but the company is already setting its upgraded "Blackwell Ultra" plans in motion with the upcoming GB300 AI server. According to UDN, the next-generation NVIDIA system will be powered by the B300 GPU chip, operating at 1400 W and delivering a 1.5x improvement in FP4 performance per card compared to its B200 predecessor. One of the most notable upgrades is the memory configuration, with each GPU now sporting 288 GB of HBM3E memory, a substantial increase from the 192 GB of the GB200. The new design uses 12-layer HBM stacks, up from the GB200's 8-layer configuration. The system's cooling infrastructure has been completely reimagined, incorporating advanced water cooling plates and enhanced quick disconnects in the liquid cooling system.
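The memory figures above line up with the per-stack capacities quoted in the Micron item on this page (HBM3E 8H at 24 GB, HBM3E 12H at 36 GB). The sketch below is only a consistency check under the assumption that those stack sizes apply to these GPUs; the stack count per GPU is derived here, not stated in the report, and `stacks_per_gpu` is an illustrative helper name.

```python
def stacks_per_gpu(gpu_hbm_gb: int, stack_gb: int) -> int:
    """Derive how many HBM stacks a GPU would need for the quoted capacity."""
    stacks, remainder = divmod(gpu_hbm_gb, stack_gb)
    if remainder:
        raise ValueError("GPU capacity is not a whole number of stacks")
    return stacks


# Per-GPU capacity and layer count come from this report; per-stack capacity
# is an assumption taken from the Micron HBM3E figures quoted elsewhere on the page.
configs = {
    "GB200 (8-high)": {"gpu_hbm_gb": 192, "stack_gb": 24, "layers": 8},
    "GB300 (12-high)": {"gpu_hbm_gb": 288, "stack_gb": 36, "layers": 12},
}

for name, cfg in configs.items():
    stacks = stacks_per_gpu(cfg["gpu_hbm_gb"], cfg["stack_gb"])
    gb_per_layer = cfg["stack_gb"] / cfg["layers"]
    print(f"{name}: {stacks} stacks, {gb_per_layer:.0f} GB per layer")

capacity_gain = configs["GB300 (12-high)"]["gpu_hbm_gb"] / configs["GB200 (8-high)"]["gpu_hbm_gb"]
print(f"Capacity gain: {capacity_gain:.1f}x")
```

Under that assumption, the capacity increase comes entirely from taller stacks rather than from adding more of them.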

Networking capabilities have also seen a substantial upgrade, with the implementation of ConnectX 8 network cards replacing the previous ConnectX 7 generation, while optical modules have been upgraded from 800G to 1.6T, ensuring faster data transmission. Regarding power management and reliability, the GB300 NVL72 cabinet will standardize capacitor tray implementation, with an optional Battery Backup Unit (BBU) system. Each BBU module costs approximately $300 to manufacture, with a complete GB300 system's BBU configuration totaling around $1,500. The system's supercapacitor requirements are equally substantial, with each NVL72 rack requiring over 300 units, priced at $20-25 per unit during production due to their high-power nature. The GB300, carrying a Grace CPU and Blackwell Ultra GPU, also introduces LPCAMM on its computing boards, indicating that the LPCAMM memory standard is about to take over servers, not just laptops and desktops. We will have to wait for the official launch before seeing LPCAMM memory configurations.
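For a rough sense of scale, the quoted BBU and supercapacitor prices can be turned into per-rack figures. This is a back-of-the-envelope sketch using only the numbers above; the constant names are illustrative, the BBU module count is implied by the two quoted costs rather than stated, and because the report says "over 300" supercapacitors, the supercapacitor totals are lower bounds.

```python
BBU_MODULE_COST_USD = 300        # approximate manufacturing cost per module
BBU_SYSTEM_COST_USD = 1500       # approximate cost of a full BBU configuration
SUPERCAP_MIN_UNITS = 300         # "over 300 units" per NVL72 rack
SUPERCAP_UNIT_COST_USD = (20, 25)  # quoted price range per unit

# Module count implied by the two quoted BBU costs.
implied_bbu_modules = BBU_SYSTEM_COST_USD / BBU_MODULE_COST_USD

# Lower-bound supercapacitor spend per rack at each end of the price range.
supercap_cost_range_usd = tuple(
    SUPERCAP_MIN_UNITS * unit_cost for unit_cost in SUPERCAP_UNIT_COST_USD
)

print(f"Implied BBU modules per system: {implied_bbu_modules:.0f}")
print(
    "Supercapacitor cost per rack: at least "
    f"${supercap_cost_range_usd[0]:,} to ${supercap_cost_range_usd[1]:,}"
)
```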

NVIDIA's Next-Gen "Rubin" AI GPU Development 6 Months Ahead of Schedule: Report

The "Rubin" architecture succeeds NVIDIA's current "Blackwell," which powers the company's AI GPUs, as well as the upcoming GeForce RTX 50-series gaming GPUs. NVIDIA will likely not build gaming GPUs with "Rubin," just like it didn't with "Hopper," and for the most part, "Volta." NVIDIA's AI GPU product roadmap put out at SC'24 puts "Blackwell" firmly in charge of the company's AI GPU product stack throughout 2025, with "Rubin" only succeeding it in the following year, for a two-year run in the market, being capped off with a "Rubin Ultra" larger GPU slated for 2027. A new report by United Daily News (UDN), a Taiwan-based publication, says that the development of "Rubin" is running 6 months ahead of schedule.

Being 6 months ahead of schedule doesn't necessarily mean that the product will launch sooner. It would give NVIDIA headroom to get "Rubin" better evaluated in the industry, and make last-minute changes to the product if needed; or even advance the launch if it wants to. The first AI GPU powered by "Rubin" will feature 8-high HBM4 memory stacks. The company will also introduce the "Vera" CPU, the long-awaited successor to "Grace." It will also introduce the X1600 InfiniBand/Ethernet network processor. According to the SC'24 roadmap by NVIDIA, these three would've seen a 2026 launch. Then in 2027, the company would follow up with an even larger AI GPU based on the same "Rubin" architecture, codenamed "Rubin Ultra." This features 12-high HBM4 stacks. NVIDIA's current GB200 "Blackwell" is a tile-based GPU, with two dies that have full cache-coherence. "Rubin" is rumored to feature four tiles.

NVIDIA and Microsoft Showcase Blackwell Preview, Omniverse Industrial AI and RTX AI PCs at Microsoft Ignite

NVIDIA and Microsoft today unveiled product integrations designed to advance full-stack NVIDIA AI development on Microsoft platforms and applications. At Microsoft Ignite, Microsoft announced the launch of the first cloud private preview of the Azure ND GB200 v6 VM series, based on the NVIDIA Blackwell platform. The Azure ND GB200 v6 will be a new AI-optimized virtual machine (VM) series and combines the NVIDIA GB200 NVL72 rack design with NVIDIA Quantum InfiniBand networking.

In addition, Microsoft revealed that Azure Container Apps now supports NVIDIA GPUs, enabling simplified and scalable AI deployment. Plus, the NVIDIA AI platform on Azure includes new reference workflows for industrial AI and an NVIDIA Omniverse Blueprint for creating immersive, AI-powered visuals. At Ignite, NVIDIA also announced multimodal small language models (SLMs) for RTX AI PCs and workstations, enhancing digital human interactions and virtual assistants with greater realism.

NVIDIA Prepares GB200 NVL4: Four "Blackwell" GPUs and Two "Grace" CPUs in a 5,400 W Server

At SC24, NVIDIA announced its latest compute-dense AI accelerator in the form of the GB200 NVL4, a single-server solution that expands the company's "Blackwell" series portfolio. The new platform combines four "Blackwell" GPUs and two "Grace" CPUs on a single board. The GB200 NVL4 boasts remarkable specifications for a single-server system, including 768 GB of HBM3E memory across its four Blackwell GPUs, delivering a combined memory bandwidth of 32 TB/s. The system's two Grace CPUs have 960 GB of LPDDR5X memory, making it a powerhouse for demanding AI workloads. A key feature of the NVL4 design is its NVLink interconnect technology, which enables communication between all processors on the board. This integration is important for maintaining optimal performance across the system's multiple processing units, especially during large training runs or when running inference on a multi-trillion-parameter model.

Performance comparisons with previous generations show significant improvements, with NVIDIA claiming the GB200 GPUs deliver 2.2x faster overall performance and 1.8x faster training compared to their GH200 NVL4 predecessor. The system's power consumption reaches 5,400 watts, double the 2,700-watt requirement of the GB200 NVL2 model, its smaller sibling that features two GPUs instead of four. NVIDIA is working closely with OEM partners to bring various Blackwell solutions to market, including the DGX B200, GB200 Grace Blackwell Superchip, GB200 Grace Blackwell NVL2, GB200 Grace Blackwell NVL4, and GB200 Grace Blackwell NVL72. Fitting 5,400 W of TDP in a single server will require liquid cooling for optimal performance, and the GB200 NVL4 is expected to go inside server racks for hyperscaler customers, which usually have custom liquid cooling systems in their data centers.
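
As a rough sanity check on the figures quoted above, the short Python sketch below divides the system-level numbers by GPU count. The variable names are illustrative only, and splitting board power evenly per GPU is a simplifying assumption, since the 5,400 W and 2,700 W figures also cover the Grace CPUs and other board components.

```python
# Figures quoted in the article for the GB200 NVL4 and GB200 NVL2.
# Dictionary keys are illustrative names, not NVIDIA terminology.
NVL4 = {"gpus": 4, "hbm3e_gb": 768, "hbm_bandwidth_tbs": 32, "power_w": 5400}
NVL2 = {"gpus": 2, "power_w": 2700}


def per_gpu(total: float, gpus: int) -> float:
    """Naively split a system-level figure evenly across GPUs."""
    return total / gpus


# HBM3E capacity and bandwidth per Blackwell GPU on the NVL4 board.
hbm_per_gpu_gb = per_gpu(NVL4["hbm3e_gb"], NVL4["gpus"])
bandwidth_per_gpu_tbs = per_gpu(NVL4["hbm_bandwidth_tbs"], NVL4["gpus"])

# System power ratio between the four-GPU and two-GPU boards.
power_ratio = NVL4["power_w"] / NVL2["power_w"]

# Assumption: board power divided evenly per GPU (CPUs are included in the total).
nvl4_power_per_gpu_w = per_gpu(NVL4["power_w"], NVL4["gpus"])
nvl2_power_per_gpu_w = per_gpu(NVL2["power_w"], NVL2["gpus"])

print(f"HBM3E per GPU:        {hbm_per_gpu_gb:.0f} GB")
print(f"Bandwidth per GPU:    {bandwidth_per_gpu_tbs:.0f} TB/s")
print(f"NVL4 / NVL2 power:    {power_ratio:.1f}x")
print(f"Power per GPU (NVL4): {nvl4_power_per_gpu_w:.0f} W")
print(f"Power per GPU (NVL2): {nvl2_power_per_gpu_w:.0f} W")
```

On these numbers, doubling the GPU count from NVL2 to NVL4 doubles the power budget, so the per-GPU share of board power is the same for both systems.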

Supermicro Delivers Direct-Liquid-Optimized NVIDIA Blackwell Solutions

Supermicro, Inc., a Total IT Solution Provider for AI, Cloud, Storage, and 5G/Edge, is announcing the highest-performing SuperCluster, an end-to-end AI data center solution featuring the NVIDIA Blackwell platform for the era of trillion-parameter-scale generative AI. The new SuperCluster will significantly increase the number of NVIDIA HGX B200 8-GPU systems in a liquid-cooled rack, resulting in a large increase in GPU compute density compared to Supermicro's current industry-leading liquid-cooled NVIDIA HGX H100 and H200-based SuperClusters. In addition, Supermicro is enhancing the portfolio of its NVIDIA Hopper systems to address the rapid adoption of accelerated computing for HPC applications and mainstream enterprise AI.

"Supermicro has the expertise, delivery speed, and capacity to deploy the largest liquid-cooled AI data center projects in the world, containing 100,000 GPUs, which Supermicro and NVIDIA contributed to and recently deployed," said Charles Liang, president and CEO of Supermicro. "These Supermicro SuperClusters reduce power needs due to DLC efficiencies. We now have solutions that use the NVIDIA Blackwell platform. Using our Building Block approach allows us to quickly design servers with NVIDIA HGX B200 8-GPU, which can be either liquid-cooled or air-cooled. Our SuperClusters provide unprecedented density, performance, and efficiency, and pave the way toward even more dense AI computing solutions in the future. The Supermicro clusters use direct liquid cooling, resulting in higher performance, lower power consumption for the entire data center, and reduced operational expenses."

ASRock Rack Brings End-to-End AI and HPC Server Portfolio to SC24

ASRock Rack Inc., a leading innovative server company, today announces its presence at SC24, held at the Georgia World Congress Center in Atlanta from November 18-21. At booth #3609, ASRock Rack will showcase a comprehensive high-performance portfolio of server boards, systems, and rack solutions with NVIDIA accelerated computing platforms, helping address the needs of enterprises, organizations, and data centers.

Artificial intelligence (AI) and high-performance computing (HPC) continue to reshape technology. ASRock Rack is presenting a complete suite of solutions spanning edge, on-premises, and cloud environments, engineered to meet the demands of AI and HPC. The 2U short-depth MECAI, incorporating the NVIDIA GH200 Grace Hopper Superchip, is developed to supercharge accelerated computing and generative AI in space-constrained environments. The 4U10G-TURIN2 and 4UXGM-GNR2, supporting ten and eight NVIDIA H200 NVL PCIe GPUs respectively, aim to help enterprises and researchers tackle AI and HPC challenges with enhanced performance and greater energy efficiency. NVIDIA H200 NVL is ideal for lower-power, air-cooled enterprise rack designs that require flexible configurations, delivering acceleration for AI and HPC workloads regardless of size.