
NVIDIA Ships World's Most Advanced AI System — NVIDIA DGX A100

btarunr

Editor & Senior Moderator
Staff member
NVIDIA today unveiled NVIDIA DGX A100, the third generation of the world's most advanced AI system, delivering 5 petaflops of AI performance and consolidating the power and capabilities of an entire data center into a single flexible platform for the first time. Immediately available, DGX A100 systems have begun shipping worldwide, with the first order going to the U.S. Department of Energy's (DOE) Argonne National Laboratory, which will use the cluster's AI and computing power to better understand and fight COVID-19.

"NVIDIA DGX A100 is the ultimate instrument for advancing AI," said Jensen Huang, founder and CEO of NVIDIA. "NVIDIA DGX is the first AI system built for the end-to-end machine learning workflow - from data analytics to training to inference. And with the giant performance leap of the new DGX, machine learning engineers can stay ahead of the exponentially growing size of AI models and data."



DGX A100 systems integrate eight of the new NVIDIA A100 Tensor Core GPUs, which provide 320 GB of memory for training the largest AI datasets, along with the latest high-speed NVIDIA Mellanox HDR 200 Gbps interconnects.

Multiple smaller workloads can be accelerated by partitioning the DGX A100 into as many as 56 instances per system, using the A100 multi-instance GPU feature. Combining these capabilities enables enterprises to optimize computing power and resources on demand to accelerate diverse workloads, including data analytics, training and inference, on a single, fully integrated, software-defined platform.
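The 56-instance figure follows from the per-system numbers quoted above. Here is a minimal Python sketch of that arithmetic, assuming memory and instances are split evenly across the eight GPUs. The class and field names are made up for illustration and are not part of any NVIDIA API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DgxA100Spec:
    """Hypothetical container for the per-system figures quoted in the article."""

    gpus: int = 8                   # eight A100 GPUs per system
    total_gpu_memory_gb: int = 320  # total GPU memory per system
    max_mig_instances: int = 56     # maximum multi-instance GPU partitions per system

    def memory_per_gpu_gb(self) -> float:
        # Assumption: memory is divided evenly across the GPUs.
        return self.total_gpu_memory_gb / self.gpus

    def mig_instances_per_gpu(self) -> int:
        # Assumption: instances are divided evenly across the GPUs.
        return self.max_mig_instances // self.gpus


spec = DgxA100Spec()
print(spec.memory_per_gpu_gb())      # 40.0
print(spec.mig_instances_per_gpu())  # 7
```

Under those assumptions, each GPU carries 40 GB and can be split into at most seven instances, which gives the 56 per system.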

Immediate DGX A100 Adoption, Support
A number of the world's largest companies, service providers and government agencies have placed initial orders for the DGX A100, with the first systems delivered to Argonne earlier this month.

"We're using America's most powerful supercomputers in the fight against COVID-19, running AI models and simulations on the latest technology available, like the NVIDIA DGX A100," said Rick Stevens, associate laboratory director for Computing, Environment and Life Sciences at Argonne. "The compute power of the new DGX A100 systems coming to Argonne will help researchers explore treatments and vaccines and study the spread of the virus, enabling scientists to do years' worth of AI-accelerated work in months or days."

The University of Florida will be the first institution of higher learning in the U.S. to receive DGX A100 systems, which it will deploy to infuse AI across its entire curriculum to foster an AI-enabled workforce.

"The University of Florida has a vision to be a national leader in artificial intelligence, and NVIDIA is an incredibly valuable partner in our quest to do so," said University of Florida President Kent Fuchs. "Across disciplines, our new NVIDIA DGX A100 systems will position our researchers to solve some of our world's most pressing challenges and equip an entire generation of students with the skills that will revolutionize the future workforce."

Among other early adopters are:
  • The Center for Biomedical AI - at the University Medical Center Hamburg-Eppendorf, Germany - will leverage DGX A100 to advance clinical decision support and process optimization.
  • Chulalongkorn University - Thailand's top research-intensive university - will use DGX A100 to accelerate its pioneering research such as Thai natural language processing, automatic speech recognition, computer vision and medical imaging.
  • Element AI - a Montreal-based developer of AI-powered solutions and services - is deploying DGX A100 to accelerate performance and feature optimization for its Orkestrator GPU scheduler to meet growing AI training and application demands.
  • German Research Center for Artificial Intelligence (DFKI) will use the DGX A100 systems to further accelerate its research on new deep learning methods and their explainability while significantly reducing space and energy consumption.
  • Harrison.ai - a Sydney-based healthcare AI company - will deploy Australia's first DGX A100 systems to accelerate the development of its AI-as-medical-device.
  • The UAE Artificial Intelligence Office - first in the Middle East to deploy the new DGX A100 - is building a national infrastructure to accelerate AI research, development and adoption across the public and private sector.
  • VinAI Research - Vietnam's leading AI research lab, based in Hanoi and Ho Chi Minh City - will use DGX A100 to conduct high-impact research and accelerate the application of AI.
Thousands of previous-generation DGX systems are in use around the globe by a wide range of public and private organizations. Among them are some of the world's leading businesses, including automakers, healthcare providers, retailers, financial institutions and logistics companies that are pushing AI forward across their industries.

NVIDIA Builds Next-Gen 700 Petaflops DGX SuperPOD
NVIDIA also revealed its next-generation DGX SuperPOD, a cluster of 140 DGX A100 systems capable of achieving 700 petaflops of AI computing power. Combining 140 DGX A100 systems with Mellanox HDR 200 Gbps InfiniBand interconnects, NVIDIA built the DGX SuperPOD AI supercomputer for internal research in areas such as conversational AI, genomics and autonomous driving.
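The 700-petaflop figure is simply the per-system number multiplied out. The rough sketch below assumes peak AI performance adds up linearly across systems, which ignores interconnect and scaling overheads. The function names are hypothetical.

```python
import math

PETAFLOPS_PER_DGX_A100 = 5  # peak AI performance per system, as quoted in the article


def cluster_petaflops(systems: int) -> int:
    """Aggregate peak AI petaflops, assuming linear scaling across systems."""
    return systems * PETAFLOPS_PER_DGX_A100


def systems_needed(target_petaflops: float) -> int:
    """Smallest number of systems whose combined peak meets the target."""
    return math.ceil(target_petaflops / PETAFLOPS_PER_DGX_A100)


print(cluster_petaflops(140))  # 700
print(systems_needed(700))     # 140
```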

The cluster is one of the world's fastest AI supercomputers - achieving a level of performance that previously required thousands of servers. The enterprise-ready architecture and performance of the DGX A100 enabled NVIDIA to build the system in less than a month, rather than the months or years of planning and procurement of specialized components previously required to deliver these supercomputing capabilities.

To help customers build their own A100-powered data centers, NVIDIA has released a new DGX SuperPOD reference architecture. It gives customers a blueprint that follows the same design principles and best practices NVIDIA used to build its DGX A100-based AI supercomputing cluster.

DGXpert Program, DGX-Ready Software
NVIDIA also launched the NVIDIA DGXpert program, which brings DGX customers together with the company's AI experts, and the NVIDIA DGX-Ready Software program, which helps customers take advantage of certified, enterprise-grade software for AI workflows.

DGXperts are AI-fluent specialists who can help guide clients on AI deployments, from planning to implementation to ongoing optimization. These individuals can help DGX A100 customers build and maintain state-of-the-art AI infrastructure.

The NVIDIA DGX-Ready Software program helps customers quickly identify and take advantage of NVIDIA-tested third-party MLOps software that can help them increase data science productivity, accelerate AI workflows and improve accessibility and utilization of AI infrastructure. The first program partners certified by NVIDIA are Allegro AI, cnvrg.io, Core Scientific, Domino Data Lab, Iguazio and Paperspace.

DGX A100 Technical Specifications
  • Eight NVIDIA A100 Tensor Core GPUs, delivering 5 petaflops of AI power, with 320 GB of total GPU memory and 12.4 TB per second of memory bandwidth.
  • Six NVIDIA NVSwitch interconnect fabrics with third-generation NVIDIA NVLink technology for 4.8 TB per second of bi-directional bandwidth.
  • Nine Mellanox ConnectX-6 HDR 200 Gb per second network interfaces, offering a total of 3.6 Tb per second of bi-directional bandwidth.
  • Mellanox In-Network Computing and network acceleration engines such as RDMA, GPUDirect and Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) to enable the highest performance and scalability.
  • 15 TB Gen4 NVMe internal storage, which is 2x faster than Gen3 NVMe SSDs.
  • NVIDIA DGX software stack, which includes optimized software for AI and data science workloads, delivering maximized performance and enabling enterprises to achieve a faster return on their investment in AI infrastructure.
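The bandwidth figures in the list above can be cross-checked with a little arithmetic. This is a small sketch, assuming "bi-directional" means both directions are counted and that the aggregate memory and NVLink bandwidth is shared evenly by the eight GPUs. The constant and function names are illustrative only.

```python
GPUS = 8
GPU_MEMORY_BANDWIDTH_TB_S = 12.4  # aggregate GPU memory bandwidth, TB per second
NVLINK_BANDWIDTH_TB_S = 4.8       # aggregate bi-directional NVLink bandwidth, TB per second
NETWORK_INTERFACES = 9
INTERFACE_SPEED_GB_S = 200        # per interface, per direction, gigabits per second


def network_bidirectional_tb_s() -> float:
    # Nine interfaces, 200 Gb/s each, counted in both directions, converted to Tb/s.
    return NETWORK_INTERFACES * INTERFACE_SPEED_GB_S * 2 / 1000


def per_gpu(aggregate: float) -> float:
    # Assumption: the aggregate figure is shared evenly by all GPUs.
    return aggregate / GPUS


print(f"Network: {network_bidirectional_tb_s():.1f} Tb/s")                    # Network: 3.6 Tb/s
print(f"Memory per GPU: {per_gpu(GPU_MEMORY_BANDWIDTH_TB_S):.2f} TB/s")       # Memory per GPU: 1.55 TB/s
print(f"NVLink per GPU: {per_gpu(NVLINK_BANDWIDTH_TB_S):.1f} TB/s")           # NVLink per GPU: 0.6 TB/s
```

Note the unit difference: the network figure is in terabits per second, while the memory and NVLink figures are in terabytes per second.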

A single rack of five DGX A100 systems replaces a data center of AI training and inference infrastructure, with 1/20th the power consumed, 1/25th the space and 1/10th the cost.

Availability
NVIDIA DGX A100 systems start at $199,000 and are shipping now through NVIDIA Partner Network resellers worldwide. Storage technology providers DDN Storage, Dell Technologies, IBM, NetApp, Pure Storage and Vast plan to integrate DGX A100 into their offerings, including those based on the NVIDIA DGX POD and DGX SuperPOD reference architectures.

NVIDIA DGX-Ready Data Center partners offer colocation services in more than 122 locations across 26 countries to help customers seeking cost-effective facilities to host their DGX infrastructure. Customers can take advantage of these services to house and access DGX A100 infrastructure inside validated, world-class data center facilities.

 
Also interesting is that these DGX A100s use dual AMD EPYC 7742 processors rather than Intel chips.
 
But can it run Crysis Remastered?
 