News Posts matching #Grace CPU

Demand for NVIDIA's Blackwell Platform Expected to Boost TSMC's CoWoS Total Capacity by Over 150% in 2024

NVIDIA's next-gen Blackwell platform, which includes B-series GPUs and integrates NVIDIA's own Grace Arm CPU in models such as the GB200, represents a significant development. TrendForce points out that the GB200 and its predecessor, the GH200, both feature a combined CPU+GPU solution, primarily equipped with the NVIDIA Grace CPU and H200 GPU. However, the GH200 accounted for only approximately 5% of NVIDIA's high-end GPU shipments. The supply chain has high expectations for the GB200, with projections suggesting that its shipments could exceed a million units in 2025, potentially accounting for 40 to 50% of NVIDIA's high-end GPU market.

Although NVIDIA plans to launch products such as the GB200 and B100 in the second half of this year, upstream wafer packaging will need to adopt more complex and high-precision CoWoS-L technology, making the validation and testing process time-consuming. Additionally, more time will be required to optimize the B-series for AI server systems in aspects such as network communication and cooling performance. It is anticipated that the GB200 and B100 products will not see significant production volumes until 4Q24 or 1Q25.

Unwrapping the NVIDIA B200 and GB200 AI GPU Announcements

NVIDIA on Monday, at the 2024 GTC conference, unveiled the "Blackwell" B200 and GB200 AI GPUs. These are designed to offer an incredible 5X gain in AI inferencing performance over the current-gen "Hopper" H100, and come with four times the on-package memory. The B200 "Blackwell" is the largest chip physically possible using existing foundry tech, according to its makers. The chip packs an astonishing 208 billion transistors, and is made up of two chiplets, which by themselves are the largest chips possible.

Each chiplet is built on the TSMC N4P foundry node, the most advanced 4 nm-class node from the Taiwanese foundry, and carries 104 billion transistors. The two chiplets are tightly coupled to each other through a 10 TB/s custom interconnect, which provides enough bandwidth and low enough latency for the two to maintain cache coherency (i.e. address each other's memory as if it were their own). Each of the two "Blackwell" chiplets has a 4096-bit memory bus, and is wired to 96 GB of HBM3E spread across four 24 GB stacks, which totals 192 GB for the B200 package. The GPU has a staggering 8 TB/s of memory bandwidth on tap. The B200 package features a 1.8 TB/s NVLink interface for host connectivity, and for connectivity to another B200 chip.
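For a quick sanity check, the per-chiplet and per-package figures above add up as follows (a minimal arithmetic sketch; the per-pin data rate is merely implied by the quoted totals, not a figure NVIDIA has stated):

```python
# Quick arithmetic check of the B200 memory figures quoted above.
stacks_per_chiplet = 4            # four HBM3E stacks per chiplet
capacity_per_stack_gb = 24        # 24 GB per stack
chiplets = 2

per_chiplet_gb = stacks_per_chiplet * capacity_per_stack_gb   # 96 GB
package_gb = per_chiplet_gb * chiplets                        # 192 GB

bus_width_bits = 4096 * chiplets                              # 8192-bit combined bus
bandwidth_tbps = 8.0                                          # quoted 8 TB/s for the package
pin_rate_gbps = bandwidth_tbps * 1e12 * 8 / bus_width_bits / 1e9

print(f"HBM3E per chiplet: {per_chiplet_gb} GB, package total: {package_gb} GB")
print(f"Combined bus: {bus_width_bits} bits, implied ~{pin_rate_gbps:.1f} Gb/s per pin")
```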

NVIDIA Launches Blackwell-Powered DGX SuperPOD for Generative AI Supercomputing at Trillion-Parameter Scale

NVIDIA today announced its next-generation AI supercomputer—the NVIDIA DGX SuperPOD powered by NVIDIA GB200 Grace Blackwell Superchips—for processing trillion-parameter models with constant uptime for superscale generative AI training and inference workloads.

Featuring a new, highly efficient, liquid-cooled rack-scale architecture, the new DGX SuperPOD is built with NVIDIA DGX GB200 systems and provides 11.5 exaflops of AI supercomputing at FP4 precision and 240 terabytes of fast memory—scaling to more with additional racks.
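For a rough sense of scale, the headline figures can be divided down per GPU. The sketch below assumes the commonly cited configuration of eight DGX GB200 systems with 72 Blackwell GPUs each; those counts are an assumption, not part of the announcement above:

```python
# Rough per-GPU breakdown of the DGX SuperPOD headline figures.
# ASSUMPTION: eight DGX GB200 systems with 72 Blackwell GPUs each (not stated above).
systems = 8
gpus_per_system = 72
total_gpus = systems * gpus_per_system                     # 576 GPUs

total_fp4_exaflops = 11.5
fast_memory_tb = 240

per_gpu_pflops = total_fp4_exaflops * 1000 / total_gpus    # ~20 PFLOPS FP4 per GPU
per_gpu_memory_gb = fast_memory_tb * 1000 / total_gpus     # ~417 GB of fast memory per GPU

print(f"{total_gpus} GPUs -> ~{per_gpu_pflops:.0f} PFLOPS FP4 and "
      f"~{per_gpu_memory_gb:.0f} GB of fast memory per GPU")
```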

NVIDIA Blackwell Platform Arrives to Power a New Era of Computing

Powering a new era of computing, NVIDIA today announced that the NVIDIA Blackwell platform has arrived—enabling organizations everywhere to build and run real-time generative AI on trillion-parameter large language models at up to 25x less cost and energy consumption than its predecessor.

The Blackwell GPU architecture features six transformative technologies for accelerated computing, which will help unlock breakthroughs in data processing, engineering simulation, electronic design automation, computer-aided drug design, quantum computing and generative AI—all emerging industry opportunities for NVIDIA.

Dell Exec Confirms NVIDIA "Blackwell" B100 Doesn't Need Liquid Cooling

NVIDIA's next-generation AI GPU, the B100 "Blackwell," is now in the hands of the company's biggest ecosystem partners and customers for evaluation, and one of them is Dell. Jeff Clarke, the OEM giant's chief operating officer, speaking to industry analysts in an investor teleconference, said that he is excited about the upcoming B100 and B200 chips from NVIDIA. B100 is the codename for the AI GPU NVIDIA is designing for PCIe add-in cards and the SXM socket, meant for systems powered by x86 CPUs such as the AMD EPYC or Intel Xeon Scalable. The B200 is its variant for machines powered by NVIDIA's in-house Arm-based processors, such as the successor to its Grace CPU and to its combination of that CPU with an AI GPU, the Grace Hopper (GH200).

Perhaps the most interesting remark by Clarke about the B100 is that he doesn't think it needs liquid cooling, and can make do with high-airflow cooling like the H100. "We're excited about what happens at the B100 and the B200, and we think that's where there's actually another opportunity to distinguish engineering confidence. Our characterization in the thermal side, you really don't need direct-liquid cooling to get to the energy density of 1000 W per GPU. That happens next year with the B200," he said. NVIDIA is planning a 2024 debut for "Blackwell" in the AI GPU space with the B100, with the B200 slated for 2025, possibly alongside a new CPU.
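To illustrate why 1,000 W per GPU sits near the limit of air cooling, the airflow needed to carry that heat away can be estimated from the basic heat-capacity relation (heat = mass flow x specific heat x temperature rise). The sketch below assumes a 15 K air temperature rise and sea-level air properties; none of these figures come from Dell or NVIDIA:

```python
# Back-of-the-envelope airflow needed to remove 1,000 W from one GPU with air.
# ASSUMPTIONS: 15 K allowable air temperature rise, sea-level air properties.
heat_w = 1000.0        # heat to remove, watts
delta_t = 15.0         # allowable air temperature rise, kelvin
cp_air = 1005.0        # specific heat of air, J/(kg*K)
rho_air = 1.2          # density of air, kg/m^3

mass_flow = heat_w / (cp_air * delta_t)     # kg/s of air
volume_flow = mass_flow / rho_air           # m^3/s
cfm = volume_flow * 2118.88                 # convert to cubic feet per minute

print(f"Roughly {volume_flow * 1000:.0f} L/s (~{cfm:.0f} CFM) of airflow per GPU")
```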

NVIDIA GH200 72-core Grace CPU Benched Against AMD Threadripper 7000 Series

GPTshop.ai is building prototypes of its "ultimate high-end desktop supercomputer," running the NVIDIA GH200 "Grace" CPU for AI and HPC workloads. Michael Larabel—founder and principal author of Phoronix—was first allowed remote access to a GPTshop.ai GH200 576 GB workstation-converted model in early February, for the purpose of benchmarking it against systems based on AMD EPYC Zen 4 and Intel Xeon Emerald Rapids processors. Larabel noted that "it was a very interesting battle" that demonstrated the capabilities of the 72 Arm Neoverse V2 cores in Grace: "With this GPTshop.ai GH200 system actually being in workstation form, I also ran some additional benchmarks looking at the CPU capabilities of the GH200 compared to AMD Ryzen Threadripper 7000 series workstations."

Larabel had on-site access to two different Threadripper systems—a Hewlett-Packard (HP) Z6 G5 A workstation and a System76 Thelio Major semi-custom build. No comparable Intel Xeon W hardware was within reach, so the Team Green desktop supercomputer was only pitted against AMD HEDT processors. The HP review sample was configured with an AMD Ryzen Threadripper PRO 7995WX 96-core / 192-thread Zen 4 processor, 8 x 16 GB DDR5-5200 memory, and an NVIDIA RTX A4000 GPU. Larabel said that it was an "all around nice high-end AMD workstation." The System76 Thelio Major was specced with an AMD Ryzen Threadripper 7980X processor, "the top-end non-PRO SKU." It is a 64-core / 128-thread part, working alongside 4 x 32 GB DDR5-4800 memory and a Radeon PRO W7900 graphics card.
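When comparing such dissimilar platforms, the first step is simply recording what each host exposes. A small, generic Python sketch like the one below runs on both the aarch64 GH200 and the x86-64 Threadripper systems under Linux; it is not part of Phoronix's own tooling:

```python
# Minimal host inventory before a cross-architecture CPU benchmark run (Linux).
import os
import platform

print("Architecture:", platform.machine())   # 'aarch64' on the GH200, 'x86_64' on Threadripper
print("Kernel:", platform.release())
print("Logical CPUs:", os.cpu_count())        # 72 on Grace, 128/192 with SMT on the Threadrippers

# Model names as reported by the kernel; some Arm systems omit this field.
with open("/proc/cpuinfo") as f:
    models = {line.split(":", 1)[1].strip() for line in f if line.lower().startswith("model name")}
print("CPU model(s):", ", ".join(sorted(models)) or "not reported")
```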

ASRock Rack Offers World's Smallest NVIDIA Grace Hopper Superchip Server, Other Innovations at MWC 2024

ASRock Rack is a subsidiary of ASRock that deals with servers, workstations, and other data-center hardware, and it enjoys not just the enormous brand trust of ASRock, but also the firm hand of parent company and OEM giant Pegatron. At the 2024 Mobile World Congress, ASRock Rack introduced several new server innovations relevant to the AI edge and 5G cellular carrier industries. A star attraction here is the new ASRock Rack MECAI-GH200, claimed to be the world's smallest server powered by the NVIDIA GH200 Grace Hopper Superchip.

The ASRock Rack MECAI-GH200 executes one of NVIDIA's original design goals behind the Grace Hopper Superchip—AI deployments in an edge environment. The GH200 module combines an NVIDIA Grace CPU with a Hopper AI GPU, and a performance-optimized NVLink interconnect between them. The CPU features 72 Arm Neoverse V2 cores and 480 GB of LPDDR5X memory, while the Hopper GPU has 132 SMs with 528 Tensor Cores, and 96 GB of HBM3 memory across a 6144-bit memory interface. Given that the newer HBM3E version of the GH200 won't come out before Q2-2024, this has to be the version with HBM3. What makes the MECAI-GH200 the world's smallest GH200 server is its compact 2U form factor—competing solutions tend to be 3U or larger.
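The GPU-side numbers in that description are internally consistent, as the short check below shows; the four-Tensor-Core-per-SM and 1024-bit-per-HBM3-stack figures are general background facts rather than part of ASRock Rack's announcement:

```python
# Consistency check of the GH200 GPU-side figures quoted above.
sms = 132
tensor_cores_per_sm = 4        # Hopper has four 4th-gen Tensor Cores per SM (background fact)
print("Tensor Cores:", sms * tensor_cores_per_sm)          # 528, matches the quoted figure

hbm_bus_bits = 6144
bits_per_hbm3_stack = 1024     # standard HBM3 stack interface width (background fact)
stacks = hbm_bus_bits // bits_per_hbm3_stack               # 6 stacks
print("HBM3 stacks:", stacks, "->", 96 // stacks, "GB per stack")
```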

Arm Launches Next-Generation Neoverse CSS V3 and N3 Designs for Cloud, HPC, and AI Acceleration

Last year, Arm introduced its Neoverse Compute Subsystem (CSS) for the N2 and V2 series of data center processors, providing a reference platform for the development of efficient Arm-based chips. Major cloud service providers like AWS with Graviton 4 and Trainium 2, Microsoft with Cobalt 100 and Maia 100, and even NVIDIA with the Grace CPU and BlueField DPUs are already utilizing custom Arm server CPU and accelerator designs based on the CSS foundation in their data centers. The CSS allows hyperscalers to optimize Arm processor designs specifically for their workloads, focusing on efficiency rather than outright performance. Today, Arm has unveiled the next-generation CSS N3 and V3 for even greater efficiency and AI inferencing capabilities. The N3 design provides up to 32 high-efficiency cores per die with improved branch prediction and larger caches to boost AI performance by 196%, while the V3 design scales up to 64 cores and is 50% faster overall than previous generations.

Both the N3 and V3 leverage advanced features like DDR5, PCIe 5.0, CXL 3.0, and chiplet architecture, continuing Arm's push to make chiplets the standard for data center and cloud architectures. The chiplet approach enables customers to connect their own accelerators and other chiplets to the Arm cores via UCIe interfaces, reducing costs and time-to-market. Looking ahead, Arm has a clear roadmap for its Neoverse platform. The upcoming CSS V4 "Adonis" and N4 "Dionysus" designs will build on the improvements in the N3 and V3, advancing Arm's goal of greater efficiency and performance using optimized chiplet architectures. As more major data center operators introduce custom Arm-based designs, the Neoverse CSS aims to provide a flexible, efficient foundation to power the next generation of cloud computing.

NVIDIA Accelerates Quantum Computing Exploration at Australia's Pawsey Supercomputing Centre

NVIDIA today announced that Australia's Pawsey Supercomputing Research Centre will add the NVIDIA CUDA Quantum platform accelerated by NVIDIA Grace Hopper Superchips to its National Supercomputing and Quantum Computing Innovation Hub, furthering its work driving breakthroughs in quantum computing.

Researchers at the Perth-based center will leverage CUDA Quantum - an open-source hybrid quantum computing platform that features powerful simulation tools, and capabilities to program hybrid CPU, GPU and QPU systems - as well as the NVIDIA cuQuantum software development kit of optimized libraries and tools for accelerating quantum computing workflows. The NVIDIA Grace Hopper Superchip - which combines the NVIDIA Grace CPU and Hopper GPU architectures - provides extreme performance to run high-fidelity and scalable quantum simulations on accelerators and seamlessly interface with future quantum hardware infrastructure.
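For context on what programming against CUDA Quantum looks like, below is a minimal Bell-state sample using the cudaq Python package. It is a generic illustration rather than code from the Pawsey deployment, and the exact API should be checked against NVIDIA's current documentation:

```python
# Minimal CUDA Quantum (cudaq) sketch: prepare a Bell state and sample it.
# Generic illustration only; not code from the Pawsey deployment.
import cudaq

@cudaq.kernel
def bell():
    qubits = cudaq.qvector(2)      # allocate two qubits
    h(qubits[0])                   # put qubit 0 into superposition
    x.ctrl(qubits[0], qubits[1])   # entangle qubit 1 with qubit 0 (CNOT)
    mz(qubits)                     # measure both qubits

# Runs on CPU/GPU simulator backends today; the same kernel can target QPUs later.
counts = cudaq.sample(bell, shots_count=1000)
print(counts)                      # expect roughly even '00' and '11' counts
```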

AWS and NVIDIA Partner to Deliver 65 ExaFLOP AI Supercomputer, Other Solutions

Amazon Web Services, Inc. (AWS), an Amazon.com, Inc. company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced an expansion of their strategic collaboration to deliver the most-advanced infrastructure, software and services to power customers' generative artificial intelligence (AI) innovations. The companies will bring together the best of NVIDIA and AWS technologies—from NVIDIA's newest multi-node systems featuring next-generation GPUs, CPUs and AI software, to AWS Nitro System advanced virtualization and security, Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability—that are ideal for training foundation models and building generative AI applications.

The expanded collaboration builds on a longstanding relationship that has fueled the generative AI era by offering early machine learning (ML) pioneers the compute performance required to advance the state-of-the-art in these technologies.

GIGABYTE Announces New Direct Liquid Cooling (DLC) Multi-Node Servers Ahead of SC23

GIGABYTE Technology and Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, server motherboards, and workstations, today announced direct liquid cooling (DLC) multi-node servers for the NVIDIA Grace CPU and NVIDIA Grace Hopper Superchip, in addition to a DLC-ready Intel-based server for the NVIDIA HGX H100 8-GPU platform and a high-density server for AMD EPYC 9004 processors. For the ultimate in efficiency, there is also a new 12U single-phase immersion tank. All of these products will be at GIGABYTE booth #355 at SC23.

The just-announced high-density CPU servers include the Intel Xeon-based H263-S63-LAN1 and the AMD EPYC-based H273-Z80-LAN1. These 2U 4-node servers employ DLC for all eight CPUs, and despite the dense computing, CPU performance achieves its full potential. In August, GIGABYTE announced new servers for the NVIDIA HGX H100 GPU, and it now adds a DLC version to the G593 series, the G593-SD0-LAX1, for the NVIDIA HGX H100 8-GPU.

Supermicro Starts Shipments of NVIDIA GH200 Grace Hopper Superchip-Based Servers

Supermicro, Inc., a Total IT Solution manufacturer for AI, Cloud, Storage, and 5G/Edge, is announcing one of the industry's broadest portfolios of new GPU systems based on the NVIDIA reference architecture, featuring the latest NVIDIA GH200 Grace Hopper and NVIDIA Grace CPU Superchip. The new modular architecture is designed to standardize AI infrastructure and accelerated computing in compact 1U and 2U form factors while providing ultimate flexibility and expansion ability for current and future GPUs, DPUs, and CPUs. Supermicro's advanced liquid-cooling technology enables very high-density configurations, such as a 1U 2-node configuration with 2 NVIDIA GH200 Grace Hopper Superchips integrated with a high-speed interconnect. Supermicro can deliver thousands of rack-scale AI servers per month from facilities worldwide and ensures Plug-and-Play compatibility.

"Supermicro is a recognized leader in driving today's AI revolution, transforming data centers to deliver the promise of AI to many workloads," said Charles Liang, president and CEO of Supermicro. "It is crucial for us to bring systems that are highly modular, scalable, and universal for rapidly evolving AI technologies. Supermicro's NVIDIA MGX-based solutions show that our building-block strategy enables us to bring the latest systems to market quickly and are the most workload-optimized in the industry. By collaborating with NVIDIA, we are helping accelerate time to market for enterprises to develop new AI-enabled applications, simplifying deployment and reducing environmental impact. The range of new servers incorporates the latest industry technology optimized for AI, including NVIDIA GH200 Grace Hopper Superchips, BlueField, and PCIe 5.0 EDSFF slots."

Gigabyte Shows AI/HPC and Data Center Servers at Computex

GIGABYTE is exhibiting cutting-edge technologies and solutions at COMPUTEX 2023, presenting the theme "Future of COMPUTING". From May 30th to June 2nd, GIGABYTE is showcasing over 110 products that are driving future industry transformation, demonstrating the emerging trends of AI technology and sustainability, on the 1st floor, Taipei Nangang Exhibition Center, Hall 1.

GIGABYTE and its subsidiary, Giga Computing, are introducing unparalleled AI/HPC server lineups, leading the era of exascale supercomputing. One of the stars is the industry's first NVIDIA-certified HGX H100 8-GPU SXM5 server, the G593-SD0. Equipped with 4th Gen Intel Xeon Scalable processors and GIGABYTE's industry-leading thermal design, the G593-SD0 can handle extremely intensive workloads from generative AI and deep learning model training within a density-optimized 5U server chassis, making it a top choice for data centers aiming for AI breakthroughs. In addition, GIGABYTE is debuting AI computing servers supporting the NVIDIA Grace CPU and Grace Hopper Superchips. The high-density servers are accelerated with NVLink-C2C technology on the Arm Neoverse V2 platform, setting a new standard for AI/HPC computing efficiency and bandwidth.

Giga Computing Goes Big with Green Computing and HPC and AI at Computex

Giga Computing, a subsidiary of GIGABYTE and an industry leader in high-performance servers, server motherboards, and workstations, today announced a major presence at Computex 2023, held May 30 to June 2, with a GIGABYTE booth that inspires while showcasing more than fifty servers that span GIGABYTE's comprehensive enterprise portfolio, including green computing solutions that feature liquid-cooled servers and immersion cooling technology. The international computer expo attracts over 100,000 visitors annually, and GIGABYTE will be ready with a spacious and attractive booth to draw in curious minds, staffed with plenty of knowledgeable people to answer questions about how its products are being utilized today.

The slogan for Computex 2023 is "Together we create." And just like parts that make a whole, GIGABYTE's slogan of "Future of COMPUTING" embodies all of its distinct computing products, from consumer to enterprise applications. For the enterprise business unit, there will be sections with the themes "Win Big with AI HPC," "Advance Data Centers," and "Embrace Sustainability." Each theme will show off cutting-edge technologies that span x86 and Arm platforms, with great attention placed on solutions that address the challenges that come with more powerful computing.

NVIDIA Grace Drives Wave of New Energy-Efficient Arm Supercomputers

NVIDIA today announced a supercomputer built on the NVIDIA Grace CPU Superchip, adding to a wave of new energy-efficient supercomputers based on the Arm Neoverse platform. The Isambard 3 supercomputer to be based at the Bristol & Bath Science Park, in the U.K., will feature 384 Arm-based NVIDIA Grace CPU Superchips to power medical and scientific research, and is expected to deliver 6x the performance and energy efficiency of Isambard 2, placing it among Europe's most energy-efficient systems.

It will achieve about 2.7 petaflops of FP64 peak performance and consume less than 270 kilowatts of power, ranking it among the world's three greenest non-accelerated supercomputers. The project is being led by the University of Bristol, as part of the research consortium the GW4 Alliance, together with the universities of Bath, Cardiff and Exeter.
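Those two figures imply an efficiency number that falls straight out of the division, as sketched below:

```python
# Implied FP64 energy efficiency of Isambard 3 from the figures quoted above.
peak_pflops_fp64 = 2.7      # quoted FP64 peak, petaflops
power_kw = 270.0            # quoted upper bound on power draw

gflops_per_watt = (peak_pflops_fp64 * 1e6) / (power_kw * 1e3)
print(f"At least {gflops_per_watt:.0f} GFLOPS/W FP64")      # ~10 GFLOPS/W
```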

NVIDIA Grace CPU Paves Fast Lane to Energy-Efficient Computing for Every Data Center

In tests of real workloads, the NVIDIA Grace CPU Superchip scored 2x performance gains over x86 processors at the same power envelope across major data center CPU applications. That opens up a whole new set of opportunities. It means data centers can handle twice as much peak traffic. They can slash their power bills by as much as half. They can pack more punch into the confined spaces at the edge of their networks - or any combination of the above.

Data center managers need these options to thrive in today's energy-efficient era. Moore's law is effectively dead. Physics no longer lets engineers pack more transistors in the same space at the same power. That's why new x86 CPUs typically offer gains over prior generations of less than 30%. It's also why a growing number of data centers are power capped. With the added threat of global warming, data centers don't have the luxury of expanding their power, but they still need to respond to the growing demands for computing.

NVIDIA Grace CPU Specs Remind Us Why Intel Never Shared x86 with the Green Team

NVIDIA designed the Grace CPU, a processor in the classical sense, to replace the Intel Xeon or AMD EPYC processors it was having to cram into its pre-built HPC compute servers for serial-processing roles, mainly because the half-a-dozen GPU HPC processors in those machines need to be interconnected by a CPU. The company studied the CPU-level limitations and bottlenecks not just with I/O, but also with the machine architecture, and realized its compute servers need a CPU purpose-built for the role, with an architecture that's heavily optimized for NVIDIA's APIs. Thus, the NVIDIA Grace CPU was born.

This is NVIDIA's first outing with a CPU whose processing footprint rivals server processors from Intel and AMD. Built on the TSMC N4 (4 nm EUV) silicon fabrication process, it is a monolithic chip that's deployed with an H100 HPC processor on a single board that NVIDIA calls a "Superchip." A board with a Grace and an H100 makes up a "Grace Hopper" Superchip, while a board with two Grace CPUs makes a Grace CPU Superchip. Each Grace CPU contains a 900 GB/s coherent switching fabric, which has seven times the bandwidth of PCI-Express 5.0 x16. This is key to connecting the companion H100 processor, or neighboring Superchips on the node, with coherent memory access.
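The "seven times the bandwidth of PCI-Express 5.0 x16" claim lines up with the usual bidirectional figure for that link, as the quick check below shows; the PCIe signalling rate is general background, not part of NVIDIA's materials:

```python
# Sanity check of the "seven times PCI-Express 5.0 x16" bandwidth claim.
nvlink_c2c_gbs = 900.0                     # GB/s quoted for the Grace fabric

pcie5_gtps_per_lane = 32.0                 # PCIe 5.0 signalling rate per lane (background fact)
lanes = 16
pcie5_one_way_gbs = pcie5_gtps_per_lane * lanes / 8    # ~64 GB/s per direction (encoding overhead ignored)
pcie5_bidir_gbs = pcie5_one_way_gbs * 2                # ~128 GB/s both directions

print(f"NVLink-C2C vs PCIe 5.0 x16 (bidirectional): {nvlink_c2c_gbs / pcie5_bidir_gbs:.1f}x")
```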

Taiwan's Tech Titans Adopt World's First NVIDIA Grace CPU-Powered System Designs

NVIDIA today announced that Taiwan's leading computer makers are set to release the first wave of systems powered by the NVIDIA Grace CPU Superchip and Grace Hopper Superchip for a wide range of workloads spanning digital twins, AI, high performance computing, cloud graphics and gaming. Dozens of server models from ASUS, Foxconn Industrial Internet, GIGABYTE, QCT, Supermicro and Wiwynn are expected starting in the first half of 2023. The Grace-powered systems will join x86 and other Arm-based servers to offer customers a broad range of choice for achieving high performance and efficiency in their data centers.

"A new type of data center is emerging—AI factories that process and refine mountains of data to produce intelligence—and NVIDIA is working closely with our Taiwan partners to build the systems that enable this transformation," said Ian Buck, vice president of Hyperscale and HPC at NVIDIA. "These new systems from our partners, powered by our Grace Superchips, will bring the power of accelerated computing to new markets and industries globally."

NVIDIA Claims Grace CPU Superchip is 2X Faster Than Intel Ice Lake

When NVIDIA announced its Grace CPU Superchip, the company officially showed its efforts to create an HPC-oriented processor to compete with Intel and AMD. The Grace CPU Superchip combines two Grace CPU modules that use the NVLink-C2C technology to deliver 144 Arm v9 cores and 1 TB/s of memory bandwidth. Each core is an Arm Neoverse N2 "Perseus" design, configured to achieve the highest throughput and bandwidth. As far as performance is concerned, the only detail NVIDIA provides on its website is the estimated SPECrate 2017_int_base score of over 740. Thanks to the colleagues over at Tom's Hardware, we have another performance figure to look at.

NVIDIA has made a slide comparing it with Intel's Ice Lake server processors. One Grace CPU Superchip was compared to two Xeon Platinum 8360Y Ice Lake CPUs configured in a dual-socket server node. The Grace CPU Superchip outperformed the Ice Lake configuration by two times and provided 2.3 times the efficiency in the WRF simulation. This HPC application is CPU-bound, allowing the new Grace CPU to show off. This is all thanks to the Arm v9 Neoverse N2 cores pairing efficiency with outstanding performance. NVIDIA made a graph showcasing all the HPC applications running on Arm today, with many more to come, which you can see below. Remember that NVIDIA provides this information, so we have to wait for the 2023 launch to see it in action.
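Taken together, the two quoted ratios also imply how the power draw of the two nodes must have compared, as the simple arithmetic below shows; this is an inference from the slide's numbers, not a measured figure:

```python
# Implied power ratio from the quoted WRF results: 2x performance at 2.3x efficiency.
performance_ratio = 2.0    # Grace CPU Superchip vs dual Xeon Platinum 8360Y
efficiency_ratio = 2.3     # performance-per-watt advantage

power_ratio = performance_ratio / efficiency_ratio
print(f"Implied Grace node power draw: {power_ratio:.0%} of the Ice Lake node")   # ~87%
```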

NVIDIA Unveils Grace CPU Superchip with 144 Cores and 1 TB/s Bandwidth

NVIDIA has today announced its Grace CPU Superchip, a monstrous design focused on heavy HPC and AI processing workloads. Previously, Team Green teased an in-house developed CPU that is supposed to go into servers and create an entirely new segment for the company. Today, we got a more detailed look at the plan with the Grace CPU Superchip. The Superchip package combines two Grace processors, each containing 72 cores. These cores are based on the Arm v9 instruction set architecture, and the two CPUs total 144 cores in the Superchip module. They are surrounded by a yet-undisclosed amount of LPDDR5X memory with ECC, running at 1 TB/s of total bandwidth.

The NVIDIA Grace CPU Superchip uses the NVLink-C2C cache-coherent interconnect, which delivers 900 GB/s of bandwidth, seven times more than the PCIe 5.0 protocol. The company targets a two-fold performance-per-watt improvement over today's CPUs and wants to bring efficiency and performance together. We have some preliminary benchmark information provided by NVIDIA. In the SPECrate2017_int_base integer benchmark, the Grace CPU Superchip scores over 740 points, which is only a simulated result for now. This means that the performance target is not finalized yet, teasing a higher number in the future. The company expects to ship the Grace CPU Superchip in the first half of 2023, with an already supported ecosystem of software, including the NVIDIA RTX, HPC, NVIDIA AI, and NVIDIA Omniverse software stacks and platforms.
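Dividing the quoted totals across the 144 cores gives some ballpark per-core figures, useful only as rough context since SPECrate results depend on the whole system (memory, compiler and all):

```python
# Ballpark per-core figures from the quoted Grace CPU Superchip totals.
cores = 144
estimated_specrate = 740        # estimated SPECrate2017_int_base
memory_bw_gbs = 1000.0          # 1 TB/s total LPDDR5X bandwidth

print(f"~{estimated_specrate / cores:.1f} SPECrate2017_int_base points per core")
print(f"~{memory_bw_gbs / cores:.1f} GB/s of memory bandwidth per core")
```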