Monday, May 4th 2020

NERSC Finalizes Contract for Perlmutter Supercomputer Powered by AMD Milan and NVIDIA Volta-Successor

Press Release
The National Energy Research Scientific Computing Center (NERSC), the mission high-performance computing facility for the U.S. Department of Energy's Office of Science, has moved another step closer to making Perlmutter - its next-generation GPU-accelerated supercomputer - available to the science community in 2020.

In mid-April, NERSC finalized its contract with Cray - which was acquired by Hewlett Packard Enterprise (HPE) in September 2019 - for the new system, a Cray Shasta supercomputer that will feature 24 cabinets and provide 3-4 times the capability of NERSC's current supercomputer, Cori. Perlmutter will be deployed at NERSC in two phases: the first set of 12 cabinets, featuring GPU-accelerated nodes, will arrive in late 2020; the second set, featuring CPU-only nodes, will arrive in mid-2021. A 35-petabyte all-flash Lustre-based file system using HPE's ClusterStor E1000 hardware will also be deployed in late 2020.

Since announcing Perlmutter in October 2018, NERSC has been working to fine-tune science applications for GPU technologies and prepare users for the more than 6,000 next-generation NVIDIA GPU processors that will power Perlmutter alongside the heterogeneous system's AMD CPUs. Nearly half of the workload currently running at NERSC is poised to take advantage of GPU acceleration, and NERSC has played a key role in helping the broader scientific community leverage GPU capabilities for their simulation, data processing, and machine learning workloads.

At the core of these efforts is the NERSC Exascale Science Applications Program (NESAP). NESAP partnerships allow projects to collaborate with NERSC and HPC vendors by providing access to early hardware, prototype software tools for performance analysis and optimization, and special training. Over the last 18 months, NESAP teams have been working with NERSC staff and NVIDIA and Cray engineers to accelerate as many codes as possible and ensure that the scientific community can hit the ground running when Perlmutter comes online.

For example, using the NVIDIA Volta GPU processors currently available in Cori, NERSC has been helping users add GPU acceleration to a number of applications and optimize GPU-accelerated code where it already exists, noted Jack Deslippe, who leads NERSC's Application Performance Group.

"We are excited about the progress our applications teams are making optimizing their codes for current and upcoming GPUs," Deslippe said. "Across all of our science areas we are seeing applications where a V100 GPU on Cori is outperforming a CPU Cori node by 5x or greater. These performance gains are the result of work being done by tightly coupled teams of engineers from the applications, NERSC, Cray, and NVIDIA. The enthusiasm for GPUs we are seeing from these teams is encouraging and contagious."

As part of NESAP, in February 2019 NERSC and Cray also began hosting a series of GPU hackathons to help these teams gain knowledge and expertise about GPU programming and apply that knowledge as they port their scientific applications to GPUs. The fifth of 12 scheduled GPU hackathons was held in March at Berkeley Lab.

"These hands-on events are helping ensure that NESAP codes and the broader NERSC workload will be ready to take advantage of the GPUs when Perlmutter arrives," said Brian Friesen, an Application Performance Specialist at NERSC who leads the hackathons. "In some cases, NESAP teams have achieved significant speedups to their applications or key kernels by participating in a hackathon. In other cases, teams have developed proof-of-concept GPU programming methods that will enable them to port their full applications to GPUs."

Meanwhile, NERSC and NVIDIA are collaborating on innovative software tools for Perlmutter's GPU processors, with early versions being tested on the Volta GPUs in Cori.

"Giving our users access to the very latest in GPU-accelerated technology this year is an important step towards ensuring that our users remain productive and are able to utilize the systems to prepare for the Exascale era. Our efforts in getting our diverse user base familiar with the new technology has been very encouraging and we look forward to Perlmutter delivering a highly capable user resource for workloads in simulation, learning and data analysis," said Sudip Dosanjh, NERSC Director.

Located at Lawrence Berkeley National Laboratory, NERSC is a DOE Office of Science user facility.