Tuesday, March 14th 2023

Microsoft Azure Announces New Scalable Generative AI VMs Featuring NVIDIA H100

Microsoft Azure announced its new ND H100 v5 virtual machine, which pairs Intel's Sapphire Rapids Xeon Scalable processors with NVIDIA's Hopper H100 GPUs and NVIDIA's Quantum-2 CX7 InfiniBand interconnect. Inside each physical machine sit eight H100s (presumably the SXM5 variant, packing a whopping 132 SMs and 528 4th-generation Tensor Cores) tied together by NVLink 4.0 with 3.6 TB/s of bisectional bandwidth. Outside each local machine sits a network of thousands more H100s connected via 400 Gb/s Quantum-2 CX7 InfiniBand, which Microsoft says delivers 3.2 Tb/s per VM for on-demand scaling to accelerate the largest AI training workloads.
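For a sense of how software exploits that topology, here is a minimal, hypothetical sketch of a PyTorch DistributedDataParallel job with one process per GPU; the model, batch size, and hyperparameters are placeholders rather than anything Microsoft or NVIDIA specifies. NCCL discovers the fabric on its own, routing intra-VM gradient traffic over NVLink and inter-VM traffic over InfiniBand:

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def main():
        # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE; launch one
        # process per GPU, i.e. eight per ND H100 v5 VM.
        dist.init_process_group(backend="nccl")
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        model = torch.nn.Linear(4096, 4096).cuda(local_rank)  # placeholder model
        ddp_model = DDP(model, device_ids=[local_rank])
        opt = torch.optim.AdamW(ddp_model.parameters(), lr=1e-4)

        for _ in range(10):
            x = torch.randn(32, 4096, device=f"cuda:{local_rank}")
            loss = ddp_model(x).square().mean()  # dummy objective
            opt.zero_grad()
            loss.backward()  # all-reduce rides NVLink within a VM, InfiniBand across VMs
            opt.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()

Launched with something like "torchrun --nnodes=<number of VMs> --nproc_per_node=8 train.py", no topology-specific code is needed; that transparency is what makes the non-blocking fat-tree design matter for scaling.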

Generative AI solutions like ChatGPT have accelerated demand for multi-ExaOP cloud services that can handle large training sets and take advantage of the latest development tools. Azure's new ND H100 v5 VMs offer that capability to organizations of any size, from smaller startups to larger companies implementing large-scale AI training deployments. While Microsoft is not making any direct performance claims, NVIDIA has advertised the H100 as running up to 30x faster than the preceding Ampere architecture currently offered in the ND A100 v4 VMs.
Microsoft Azure provides the following technical specifications for the new VMs (a quick sanity check of the aggregate figures follows the list):
  • 8x NVIDIA H100 Tensor Core GPUs interconnected via next gen NVSwitch and NVLink 4.0
  • 400 Gb/s NVIDIA Quantum-2 CX7 InfiniBand per GPU with 3.2 Tb/s per VM in a non-blocking fat-tree network
  • NVSwitch and NVLink 4.0 with 3.6 TB/s bisectional bandwidth between 8 local GPUs within each VM
  • 4th Gen Intel Xeon Scalable processors
  • PCIe Gen5 host-to-GPU interconnect with 64 GB/s bandwidth per GPU
  • 16 channels of 4800 MT/s DDR5 DIMMs
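The headline figures in that list hold up to some quick back-of-the-envelope arithmetic. The sketch below assumes a 64-bit (8-byte) data path per DDR5 channel and ignores PCIe encoding overhead; the derivations are ours, not Microsoft's:

    gpus_per_vm = 8
    ib_per_gpu_gbit = 400                  # Gb/s of InfiniBand per GPU
    print(gpus_per_vm * ib_per_gpu_gbit / 1000, "Tb/s per VM")            # 3.2

    pcie_gen5_gt = 32                      # GT/s per PCIe Gen5 lane
    lanes = 16
    print(pcie_gen5_gt * lanes / 8, "GB/s per GPU over PCIe")             # 64.0

    ddr5_mt = 4800                         # MT/s per DDR5 channel
    channels = 16
    print(ddr5_mt * channels * 8 / 1000, "GB/s system memory bandwidth")  # 614.4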
Judging by what we know of NVIDIA Hopper, this likely means Microsoft is either filling its own racks with DGX H100 systems or deploying NVIDIA's DGX SuperPOD, which stacks the DGX H100s five-high and as many as 16 across for a total of 640 GPUs packing 337,920 Tensor Cores. Don't forget that each DGX H100 also contains two Intel Xeon Scalable processors. Since Microsoft has already specified that its systems use Intel's latest Sapphire Rapids Xeons, which can feature as many as 60 cores each, there are potentially 9,600 x86 cores available to help feed those massive GPUs.
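Spelling out that math with the per-system figures NVIDIA publishes for the DGX H100:

    dgx_systems = 5 * 16               # five-high, as many as sixteen across
    gpus = dgx_systems * 8             # eight H100s per DGX H100
    tensor_cores = gpus * 528          # 528 4th-gen Tensor Cores per SXM5 H100
    xeon_cores = dgx_systems * 2 * 60  # two 60-core Sapphire Rapids Xeons each

    print(dgx_systems, gpus, tensor_cores, xeon_cores)  # 80 640 337920 9600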

Microsoft Azure has opened up the preview of the ND H100 v5 VM service, and you can sign up to request access.
Source: Microsoft

5 Comments on Microsoft Azure Announces New Scalable Generative AI VMs Featuring NVIDIA H100

#1
Denver
Wow, Xeon still exists and Microsoft insists on using it for some random and unknown reason lol
#2
Easo
Denver: Wow, Xeon still exists and Microsoft insists on using it for some random and unknown reason lol
Uhhh... what now? Is this supposed to be a joke?
#3
Jism
Buying large quantities of CPUs kind of guarantees you a discount as well. Intel is known for it.
#5
Easo
Denver: www.google.com/amp/s/www.hardwaretimes.com/amds-96-core-epyc-genoa-cpu-is-over-70-faster-than-intels-xeon-sapphire-rapids-flagship-in-2s-mode/amp/

Considering the Xeon loses in pretty much every possible way, I think it's a pretty good joke. Maybe Intel has returned to the strategy of giving generous discounts, I hope it doesn't suffer any more lawsuits. :p
Every major purchase at that level gets bulk discounts. You can also be sure that this deal was made quite some time in advance, likely even before that Epyc came out. Also... let's be honest, a 70% difference screams cherry-picking at best.