
CoreWeave Launches Debut Wave of NVIDIA GB200 NVL72-based Cloud Instances

T0@st

News Editor
AI reasoning models and agents are set to transform industries, but delivering their full potential at scale requires massive compute and optimized software. The "reasoning" process involves multiple models and generates many additional tokens, so it demands infrastructure that combines high-speed communication, memory and compute to deliver real-time, high-quality results. To meet this demand, CoreWeave has launched NVIDIA GB200 NVL72-based instances, becoming the first cloud service provider to make the NVIDIA Blackwell platform generally available. Each rack links 72 NVIDIA Blackwell GPUs and 36 NVIDIA Grace CPUs over rack-scale NVIDIA NVLink, and NVIDIA Quantum-2 InfiniBand networking scales clusters to as many as 110,000 GPUs, giving these instances the scale and performance needed to build and deploy the next generation of AI reasoning models and agents.

NVIDIA GB200 NVL72 on CoreWeave
NVIDIA GB200 NVL72 is a liquid-cooled, rack-scale solution with a 72-GPU NVLink domain, which enables the 72 GPUs to act as a single massive GPU. NVIDIA Blackwell features many technological breakthroughs that accelerate inference token generation, boosting performance while reducing service costs. For example, fifth-generation NVLink enables 130 TB/s of GPU bandwidth in one 72-GPU NVLink domain, and the second-generation Transformer Engine enables FP4 for faster AI performance while maintaining high accuracy.

CoreWeave's portfolio of managed cloud services is purpose-built for Blackwell. CoreWeave Kubernetes Service optimizes workload orchestration by exposing NVLink domain IDs, so that workloads can be scheduled efficiently within the same rack. Slurm on Kubernetes (SUNK) supports the topology block plug-in, enabling intelligent workload distribution across GB200 NVL72 racks. In addition, CoreWeave's Observability Platform provides real-time insights into NVLink performance, GPU utilization and temperatures.
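To illustrate why exposing NVLink domain IDs matters for scheduling, here is a minimal Python sketch of rack-aware placement. It is not CoreWeave's scheduler: the node names, the domain IDs and the `pick_domain` helper are all hypothetical, and the only idea taken from the article is that a job should land on nodes sharing one NVLink domain.

```python
from collections import defaultdict

# Hypothetical inventory: (node name, NVLink domain ID, free GPUs).
# Names and IDs are made up for illustration, not real CoreWeave labels.
NODES = [
    ("node-a1", "rack-a", 4),
    ("node-a2", "rack-a", 4),
    ("node-b1", "rack-b", 4),
    ("node-b2", "rack-b", 2),
    ("node-b3", "rack-b", 4),
]


def pick_domain(nodes, gpus_needed):
    """Pick one NVLink domain that can host the whole job.

    Returns (domain_id, node_names) for the domain with the fewest free
    GPUs that still fits the request, or None if no single domain fits.
    """
    free = defaultdict(int)
    members = defaultdict(list)
    for name, domain, gpus in nodes:
        free[domain] += gpus
        members[domain].append(name)

    candidates = [d for d in free if free[d] >= gpus_needed]
    if not candidates:
        return None  # the job would have to span racks

    # Tightest fit first, domain ID as a deterministic tie-breaker.
    best = min(candidates, key=lambda d: (free[d], d))
    return best, members[best]


if __name__ == "__main__":
    for request in (8, 10, 16):
        print(request, pick_domain(NODES, request))
```

Choosing the tightest-fitting domain is one possible policy; it leaves larger domains free for larger jobs. A real scheduler would also weigh CPU, memory and tenancy constraints.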




CoreWeave's GB200 NVL72 instances feature NVIDIA Quantum-2 InfiniBand networking that delivers 400 Gb/s of bandwidth per GPU for clusters of up to 110,000 GPUs. NVIDIA BlueField-3 DPUs also provide accelerated multi-tenant cloud networking, high-performance data access and GPU compute elasticity for these instances.
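The figures quoted in the article can be put side by side with some back-of-the-envelope arithmetic. The Python sketch below uses only numbers from the text (130 TB/s per 72-GPU NVLink domain, 400 Gb/s of InfiniBand per GPU, clusters of up to 110,000 GPUs). It assumes decimal units and an even split of NVLink bandwidth across GPUs, so the per-GPU values are simple averages rather than measured throughput.

```python
import math

# Figures quoted in the article.
GPUS_PER_RACK = 72            # GPUs in one NVLink domain
NVLINK_DOMAIN_TB_S = 130      # TB/s across the whole domain
IB_PER_GPU_GBIT_S = 400       # Gb/s of InfiniBand per GPU
MAX_CLUSTER_GPUS = 110_000    # largest cluster size mentioned

# Assumption: NVLink bandwidth is shared evenly by the 72 GPUs.
nvlink_per_gpu_tb_s = NVLINK_DOMAIN_TB_S / GPUS_PER_RACK

# 8 bits per byte: convert Gb/s to GB/s.
ib_per_gpu_gb_s = IB_PER_GPU_GBIT_S / 8

# Assumption: decimal units, 1 TB = 1000 GB.
nvlink_to_ib_ratio = nvlink_per_gpu_tb_s * 1000 / ib_per_gpu_gb_s

# Racks needed to reach the maximum cluster size, rounding up.
racks_needed = math.ceil(MAX_CLUSTER_GPUS / GPUS_PER_RACK)

if __name__ == "__main__":
    print(f"NVLink per GPU:      {nvlink_per_gpu_tb_s:.2f} TB/s")
    print(f"InfiniBand per GPU:  {ib_per_gpu_gb_s:.0f} GB/s")
    print(f"NVLink / InfiniBand: {nvlink_to_ib_ratio:.1f}x")
    print(f"Racks for max size:  {racks_needed}")
```

Under these assumptions, each GPU has far more bandwidth to its rack neighbours than to the rest of the cluster, which is why keeping a job's most communication-heavy traffic inside one NVLink domain is attractive.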



Full-Stack Accelerated Computing Platform for Enterprise AI
NVIDIA's full-stack AI platform pairs cutting-edge software with Blackwell-powered infrastructure to help enterprises build fast, accurate and scalable AI agents.

NVIDIA Blueprints provides pre-defined, customizable, ready-to-deploy reference workflows to help developers create real-world applications. NVIDIA NIM is a set of easy-to-use microservices designed for secure, reliable deployment of high-performance AI models for inference. NVIDIA NeMo includes tools for training, customization and continuous improvement of AI models for modern enterprise use cases. Enterprises can use NVIDIA Blueprints, NIM and NeMo to build and fine-tune models for their specialized AI agents.



Bringing Next-Generation AI to the Cloud
The general availability of NVIDIA GB200 NVL72-based instances on CoreWeave marks the latest step in the companies' collaboration, which focuses on bringing current accelerated computing solutions to the cloud. With the launch of these instances, enterprises now have access to the scale and performance needed to power the next wave of AI reasoning models and agents.

The NVIDIA Blueprints, NIM and NeMo software components, all part of the NVIDIA AI Enterprise software platform, are key enablers of agentic AI at scale and can readily be deployed on CoreWeave.
