News Posts matching #HPC


Samsung Develops Industry's First High Bandwidth Memory with AI Processing Power

Samsung Electronics, the world leader in advanced memory technology, today announced that it has developed the industry's first High Bandwidth Memory (HBM) integrated with artificial intelligence (AI) processing power—the HBM-PIM. The new processing-in-memory (PIM) architecture brings powerful AI computing capabilities inside high-performance memory, to accelerate large-scale processing in data centers, high performance computing (HPC) systems and AI-enabled mobile applications.

Kwangil Park, senior vice president of Memory Product Planning at Samsung Electronics stated, "Our groundbreaking HBM-PIM is the industry's first programmable PIM solution tailored for diverse AI-driven workloads such as HPC, training and inference. We plan to build upon this breakthrough by further collaborating with AI solution providers for even more advanced PIM-powered applications."

HPE Develops New Spaceborne Computer-2 Computing System for the International Space Station

Hewlett Packard Enterprise (HPE) today announced it is accelerating space exploration and increasing self-sufficiency for astronauts by enabling real-time data processing with advanced commercial edge computing in space for the first time. Astronauts and space explorers aboard the International Space Station (ISS) will speed time-to-insight from months to minutes on various experiments in space, from processing medical imaging and DNA sequencing to unlocking key insights from volumes of remote sensors and satellites, using HPE's Spaceborne Computer-2 (SBC-2), an edge computing system.

Spaceborne Computer-2 is scheduled to launch into orbit on the 15th Northrop Grumman Resupply Mission to the Space Station (NG-15) on February 20 and will be available for use on the International Space Station for the next two to three years. The NG-15 spacecraft has been named the "S.S. Katherine Johnson" in honor of Katherine Johnson, the famed Black female NASA mathematician who was critical to the early success of the space program.

Intel Xe HPC Multi-Chip Module Pictured

Intel SVP for architecture, graphics, and software, Raja Koduri, tweeted the first picture of the Xe HPC scalar compute processor multi-chip module, with its large IHS off. It reveals two large main logic dies built on the 7 nm silicon fabrication process from a third-party foundry. The Xe HPC processor will be targeted at supercomputing and AI-ML applications, so the main logic dies are expected to be large arrays of execution units, spread across what appear to be eight clusters, surrounded by ancillary components such as memory controllers and interconnect PHYs.

There appear to be two kinds of on-package memory on the Xe HPC. The first kind is HBM stacks (from either the HBM2E or HBM3 generation), serving as the main high-speed memory; while the other is a mystery for now. This could either be another class of DRAM, serving a serial processing component on the main logic die; or a non-volatile memory, such as 3D XPoint or NAND flash (likely the former), providing fast persistent storage close to the main logic dies. There appear to be four HBM-class stacks per logic die (so 4096-bit per die and 8192-bit per package), and one die of this secondary memory per logic die.
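The quoted bus widths follow from HBM's fixed 1024-bit interface per stack; a quick sanity check of the arithmetic (Python):

```python
# Each HBM stack exposes a 1024-bit interface (HBM2/HBM2E/HBM3 alike).
BITS_PER_STACK = 1024

def total_bus_width(stacks: int) -> int:
    """Aggregate memory bus width across `stacks` HBM stacks."""
    return stacks * BITS_PER_STACK

per_die = total_bus_width(4)      # four stacks per logic die -> 4096-bit
per_package = total_bus_width(8)  # two dies, eight stacks -> 8192-bit

print(per_die, per_package)  # 4096 8192
```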

Tachyum Prodigy Software Emulation Systems Now Available for Pre-Order

Tachyum Inc. today announced that it is signing early adopter customers for the software emulation system for its Prodigy Universal Processor. Customers may begin the process of native software development (i.e., using the Prodigy Instruction Set Architecture) and porting applications to run on Prodigy. Prodigy software emulation systems will be available at the end of January 2021.

Customers and partners can use Prodigy's software emulation for evaluation, development and debug, and with it, they can begin to transition existing applications that demand high performance and low power to run optimally on Prodigy processors. Pre-built systems include a Prodigy emulator, native Linux, toolchains, compilers, user mode applications, x86, ARM and RISC-V emulators. Software updates will be issued as needed.

NVIDIA is Preparing Co-Packaged Photonics for NVLink

During its GPU Technology Conference (GTC) in China, Bill Dally, NVIDIA's chief scientist and SVP of research, presented many interesting things about how the company plans to push the future of HPC, AI, graphics, healthcare, and edge computing. Mr. Dally laid out NVIDIA's research efforts and the future vision for its products. Among the most interesting items was a plan to ditch standard electrical data transfer and use the speed of light to scale and advance node communication. The new technology, utilizing optical data transfer, is expected to reduce the power required for data movement by a significant amount.

The company's proposed plan is to use an optical NVLink equivalent. While the current NVLink 2.0 chip consumes eight picojoules per bit (8 pJ/b) and can send signals only about 0.3 meters without repeaters, the optical replacement is capable of sending data anywhere from 20 to 100 meters while consuming half the power (4 pJ/b). NVIDIA has conceptualized a system with four GPUs in a tray, all connected by light. To power such a setup, lasers produce 8-10 wavelengths, each modulated at 25 Gbit/s using ring resonators. On the receiving side, ring resonators pick off the individual wavelengths and pass them to photodetectors. This technique enables fast data transfer over long distances.
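Those per-bit energy and per-wavelength rate figures imply concrete link budgets; a rough sketch (Python), taking the quoted numbers at face value:

```python
GBITS_PER_WAVELENGTH = 25  # Gbit/s per wavelength, as presented

def link_bandwidth_gbps(wavelengths: int) -> int:
    """Aggregate link rate in Gbit/s for a given wavelength count."""
    return wavelengths * GBITS_PER_WAVELENGTH

def transfer_power_watts(gbps: float, pj_per_bit: float) -> float:
    """Power needed to move `gbps` Gbit/s at a given energy cost per bit."""
    return gbps * 1e9 * pj_per_bit * 1e-12

low, high = link_bandwidth_gbps(8), link_bandwidth_gbps(10)  # 200-250 Gbit/s
electrical = transfer_power_watts(low, 8.0)  # NVLink 2.0 at 8 pJ/b, ~1.6 W
optical = transfer_power_watts(low, 4.0)     # optical link at 4 pJ/b, ~0.8 W

print(low, high, electrical, optical)
```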

Intel and Argonne Developers Carve Path Toward Exascale 

Intel and Argonne National Laboratory are collaborating on the co-design and validation of exascale-class applications using graphics processing units (GPUs) based on Intel Xe-HP microarchitecture and Intel oneAPI toolkits. Developers at Argonne are tapping into Intel's latest programming environments for heterogeneous computing to ensure scientific applications are ready for the scale and architecture of the Aurora supercomputer at deployment.

"Our close collaboration with Argonne is enabling us to make tremendous progress on Aurora, as we seek to bring exascale leadership to the United States. Providing developers early access to hardware and software environments will help us jumpstart the path toward exascale so that researchers can quickly start taking advantage of the system's massive computational resources." -Trish Damkroger, Intel vice president and general manager of High Performance Computing.

Arm Based Fugaku Supercomputer Retains #1 Top500 Spot

Fugaku—the Arm technology-based supercomputer jointly developed by RIKEN and Fujitsu—was awarded the number one spot on the Top500 list for the second time in a row. This achievement further highlights the rapidly evolving demands of high-performance computing (HPC) that Arm technology uniquely addresses through the unmatched combination of power efficiency, performance, and scalability.

In addition to the great work RIKEN and Fujitsu have done, we're seeing more adoption of Arm-based solutions across our ecosystem. ETRI, the national computing institute of the Republic of Korea, recently announced plans to adopt the upcoming Neoverse V1 (formerly code-named Zeus) CPU design, which features Arm Scalable Vector Extensions (SVE), for its K-AB21 system. ETRI has set a goal of 16 teraflops per CPU and 1,600 teraflops per rack for AB21 (which stands for 'Artificial Brain 21'), while reducing power consumption by 60% compared to its target.

TOP500 Expands Exaflops Capacity Amidst Low Turnover

The 56th edition of the TOP500 saw the Japanese Fugaku supercomputer solidify its number one status in a list that reflects a flattening performance growth curve. Although two new systems managed to make it into the top 10, the full list recorded the smallest number of new entries since the project began in 1993.

The entry level to the list moved up to 1.32 petaflops on the High Performance Linpack (HPL) benchmark, a small increase from 1.23 petaflops recorded in the June 2020 rankings. In a similar vein, the aggregate performance of all 500 systems grew from 2.22 exaflops in June to just 2.43 exaflops on the latest list. Likewise, average concurrency per system barely increased at all, growing from 145,363 cores six months ago to 145,465 cores in the current list.
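The flattening growth curve is visible directly in those figures; a quick check of the six-month growth rates (Python, using only the numbers quoted above):

```python
def growth_pct(old: float, new: float) -> float:
    """Percentage growth from `old` to `new`."""
    return (new - old) / old * 100

entry = growth_pct(1.23, 1.32)        # entry-level HPL score, petaflops
aggregate = growth_pct(2.22, 2.43)    # aggregate list performance, exaflops
cores = growth_pct(145_363, 145_465)  # average cores per system

# prints: entry 7.3%, aggregate 9.5%, cores 0.07%
print(f"entry {entry:.1f}%, aggregate {aggregate:.1f}%, cores {cores:.2f}%")
```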

NVIDIA Announces Mellanox InfiniBand for Exascale AI Supercomputing

NVIDIA today introduced the next generation of NVIDIA Mellanox 400G InfiniBand, giving AI developers and scientific researchers the fastest networking performance available to take on the world's most challenging problems.

As computing requirements continue to grow exponentially in areas such as drug discovery, climate research and genomics, NVIDIA Mellanox 400G InfiniBand is accelerating this work through a dramatic leap in performance offered on the world's only fully offloadable, in-network computing platform. The seventh generation of Mellanox InfiniBand provides ultra-low latency and doubles data throughput with NDR 400 Gb/s and adds new NVIDIA In-Network Computing engines to provide additional acceleration.

AMD Wins Contract for European LUMI Supercomputer: 552 petaflop/s Powered by Epyc, AMD Instinct

AMD has won a contract to empower the LUMI supercomputer, designed for the EuroHPC Joint Undertaking (EuroHPC JU) in conjunction with 10 European countries. The contract will see AMD provide both the CPU and GPU innards of LUMI, which is set to be populated with next-generation AMD Epyc CPUs and AMD Instinct GPUs. The supercomputer, which is set to enter operation next year, will deliver an estimated 552 petaflop/s - higher than the world's current fastest supercomputer, Japan's Fugaku, an Arm-powered affair that reaches a peak performance of 513 petaflop/s.

The contract for LUMI's construction has been won by Hewlett Packard Enterprise (HPE), which will provide an HPE Cray EX supercomputer powered by the aforementioned AMD hardware. LUMI has an investment cost set at 200 million euros, covering hardware, installation, and the foreseeable lifetime of its operation. This design win marks another big contract for AMD, which was all but absent from the supercomputing space until the launch, and subsequent iterations, of its Zen architecture and the latest generations of its Instinct HPC accelerators.

Los Alamos National Laboratory Deploys HPE Cray EX 'Chicoma' Supercomputer Powered by AMD EPYC Processors

Los Alamos National Laboratory has completed the installation of a next-generation high performance computing platform, with aim to enhance its ongoing R&D efforts in support of the nation's response to COVID-19. Named Chicoma, the new platform is poised to demonstrate Hewlett Packard Enterprise's new HPE Cray EX supercomputer architecture for solving complex scientific problems.

"As extensive social and economic impacts from COVID-19 continue to grip the nation, Los Alamos scientists are actively engaged in a number of critical research efforts ranging from therapeutics design to epidemiological modeling," said Irene Qualters, Associate Laboratory Director for Simulation and Computing at Los Alamos. "High Performance Computing is playing a critical role by allowing scientists to model the complex phenomena involved in viral evolution and propagation."

NVIDIA and Atos Team Up to Build World's Fastest AI Supercomputer

NVIDIA today announced that the Italian inter-university consortium CINECA—one of the world's most important supercomputing centers—will use the company's accelerated computing platform to build the world's fastest AI supercomputer.

The new "Leonardo" system, built with Atos, is expected to deliver 10 exaflops of FP16 AI performance to enable advanced AI and HPC converged application use cases. Featuring nearly 14,000 NVIDIA Ampere architecture-based GPUs and NVIDIA Mellanox HDR 200 Gb/s InfiniBand networking, Leonardo will propel Italy as the global leader in AI and high performance computing research and innovation.

Marvell Launches Industry's First Native NVMe RAID Accelerator

Marvell (NASDAQ: MRVL) today introduced the industry's first native NVMe RAID 1 accelerator, a state-of-the-art technology for virtualized, multi-tenant cloud and enterprise data center environments which demand optimized reliability, efficiency, and performance. Hewlett Packard Enterprise (HPE) is the first of Marvell's partners to support the new accelerator in the HPE NS204i-p NVMe OS Boot Device offered on select HPE ProLiant servers and HPE Apollo systems.

As the industry transitions from legacy SAS and SATA to NVMe SSDs, Marvell's offering helps data centers fast-track the move to higher performance flash storage. The innovative accelerator lowers data center total cost of ownership (TCO) by offloading RAID 1 processing from costly and precious server CPU resources, maximizing application processing performance. IT organizations can now deploy a "plug-and-play," NVMe-based OS boot solution, like the HPE NS204i-p NVMe OS Boot Device, that protects the integrity of flash data storage while delivering an optimized, application-level user experience.

Los Alamos National Laboratory Announces new Intel-based Supercomputer Called Crossroads

The Alliance for Computing at Extreme Scale (ACES), a partnership between Los Alamos National Laboratory and Sandia National Laboratories, announced the details of a $105 million contract awarded to Hewlett Packard Enterprise (HPE) to deliver Crossroads, a next-generation supercomputer to be sited at Los Alamos.

"This machine will advance our ability to study the most complex physical systems for science and national security. We look forward to its arrival and deployment," said Jason Pruet, Los Alamos' Program Director for the Advanced Simulating and Computing (ASC) Program.

Arm Announces Next-Generation Neoverse V1 and N2 Cores

Ten years ago, Arm set its sights on deploying its compute-efficient technology in the data center with a vision towards a changing landscape that would require a new approach to infrastructure compute.

That decade-long effort to lay the groundwork for a more efficient infrastructure was realized when we announced Arm Neoverse, a new compute platform that would deliver 30% year-over-year performance improvements through 2021. The unveiling of our first two platforms, Neoverse N1 and E1, was significant and important. Not only because Neoverse N1 shattered our performance target by nearly 2x to deliver 60% more performance when compared to Arm's Cortex-A72 CPU, but because we were beginning to see real demand for more choice and flexibility in this rapidly evolving space.

KIOXIA Bolsters NVMe-oF Ecosystem with Ethernet SSD Storage

Direct-attached performance from network-attached devices is no longer a thing of storage architects' dreams. KIOXIA America, Inc. (formerly Toshiba Memory America, Inc.), is now sampling Ethernet SSDs to select partners and customers interested in validating the benefits of Ethernet attached storage to their existing Ethernet (RoCEv2) networks. KIOXIA has been working in collaboration with key industry players Marvell, Foxconn-Ingrasys and Accton to bring groundbreaking Ethernet Bunch of Flash (EBOF) technology solutions to market - and this announcement is pivotal to that endeavor.

In an ongoing quest to contain explosive amounts of data, storage capacity and bandwidth must continue to grow while processing time must decrease. An EBOF system addresses these challenges through an Ethernet fabric that can scale flash and optimally disaggregate storage from compute. The EBOF storage solution bypasses the cost, complexity, and system limitations inherent with standard JBOF storage systems, which typically include a CPU, DRAM, HBA, and switch. This accelerates applications and workloads where disaggregated low-latency, high bandwidth, highly available storage is needed - bringing greatly improved performance and lower total cost of ownership to edge, enterprise and cloud data centers.

Rambus Advances HBM2E Performance to 4.0 Gbps for AI/ML Training Applications

Rambus Inc. (NASDAQ: RMBS), a premier silicon IP and chip provider making data faster and safer, today announced it has achieved a record 4 Gbps performance with the Rambus HBM2E memory interface solution consisting of a fully-integrated PHY and controller. Paired with the industry's fastest HBM2E DRAM from SK hynix operating at 3.6 Gbps, the solution can deliver 460 GB/s of bandwidth from a single HBM2E device. This performance meets the terabyte-scale bandwidth needs of accelerators targeting the most demanding AI/ML training and high-performance computing (HPC) applications.
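The 460 GB/s figure follows directly from HBM2E's 1024-bit per-device interface; a minimal check of the arithmetic (Python):

```python
BUS_WIDTH_BITS = 1024  # interface width of a single HBM2E device

def hbm_bandwidth_gbs(data_rate_gbps: float) -> float:
    """Per-device bandwidth in GB/s from a per-pin data rate in Gbit/s."""
    return data_rate_gbps * BUS_WIDTH_BITS / 8

print(hbm_bandwidth_gbs(3.6))  # 460.8 GB/s, matching the quoted ~460 GB/s
```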

"With this achievement by Rambus, designers of AI and HPC systems can now implement systems using the world's fastest HBM2E DRAM running at 3.6 Gbps from SK hynix," said Uksong Kang, vice president of product planning at SK hynix. "In July, we announced full-scale mass-production of HBM2E for state-of-the-art computing applications demanding the highest bandwidth available."

GIGABYTE, Northern Data AG and AMD Join Forces to Drive HPC Mega-Project

GIGABYTE Technology, an industry leader in high-performance servers and workstations, today is announcing a partnership with Northern Data AG to create an HPC mega-project with computing power of around 3.1 exaflops. GIGABYTE will supply GPU-based server systems equipped with proven AMD EPYC processors and AMD Radeon Instinct accelerators from technology partner AMD, a leading provider of high-performance computing and graphics technologies, to Northern Data.

Northern Data is developing a distributed computing cluster based on the hardware at locations in Norway, Sweden and Canada, which in its final stage of deployment will provide FP32 computing power of around 3.1 exaflops (3.1 million teraflops), alongside 274.54 petaflops FP64. The world's fastest supercomputer, the Japanese "Fugaku" (Fujitsu), has a calculation power of 1.07 exaflops FP32 and 415.3 petaflops FP64, whereas the second fastest, the US supercomputer "Summit" (IBM), has a calculation power of 0.414 exaflops FP32 and 148.0 petaflops FP64.
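The unit conversions in those comparisons are easy to mix up; a small sketch (Python) relating the quoted FP32 figures:

```python
def exa_to_tera(exaflops: float) -> float:
    """1 exaflop = 10^6 teraflops."""
    return exaflops * 1_000_000

teraflops = exa_to_tera(3.1)  # the quoted "3.1 million teraflops"

# FP32 figures quoted above, in exaflops:
cluster, fugaku, summit = 3.1, 1.07, 0.414
print(teraflops, cluster / fugaku)  # the cluster is ~2.9x Fugaku's FP32 rate
```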

NVIDIA Fully Absorbs Mellanox Technologies, Now Called NVIDIA Networking

NVIDIA over the weekend formally renamed Mellanox Technologies to NVIDIA Networking. The graphics and scalar computing giant had acquired Mellanox in April 2020, in a deal valued at $7 billion. It is expected that the NVIDIA corporate identity will cover all Mellanox products, including NICs, switches, and interconnect solutions targeted at large-scale data-centers and HPC environments. Mellanox website now defaults to NVIDIA, with the announcement banner "Mellanox Technologies is now NVIDIA Networking." With the acquisition of Mellanox, a potential bid for Softbank's Arm Holdings, and market leadership in the scalar compute industry, NVIDIA moves close to becoming an end-to-end enterprise solution provider.

NVIDIA Ampere GA102-300-A1 GPU Die Pictured

Here's the first picture of an NVIDIA "Ampere" GA102 GPU die. This is the largest client-segment implementation of the "Ampere" architecture by NVIDIA, targeting the gaming (GeForce) and professional-visualization (Quadro) market segments. The "Ampere" architecture itself debuted earlier this year with the A100 Tensor Core scalar processor that's winning hearts and minds in the HPC community faster than ice cream on a dog day afternoon. There's no indication of die-size, but considering how tiny the 10.3 billion-transistor AMD "Navi 10" die is, the GA102 could come with a massive transistor count if its die is as big as that of the TU102. The GPU in the picture is also a qualification sample, and was probably pictured off a prototype graphics card. Powering the GeForce RTX 3090, the GA102-300 is expected to feature a CUDA core count of 5,248. According to VideoCardz, there's a higher trim of this silicon, the GA102-400, which could make it to NVIDIA's next halo product under the TITAN brand.

Intel Releases mOS - Custom Operating System for HPC

Intel has been focusing its resources on data center and high-performance computing lately, and the company has made some interesting products. Today, Intel released its latest creation, the mOS operating system. Created as a research project, mOS is an OS built for extreme-scale HPC systems, meaning it is aimed at hyperscalers and the like. The goal of mOS is to deliver a high-performance environment for software, with low noise, scalability, and lightweight kernels (LWK) managing the system. Being based on the Linux kernel, the OS is essentially another distribution; however, it has been modified to best fit the HPC ecosystem. mOS is still in the pre-alpha phase, but it builds on lightweight kernel concepts already proven on supercomputers like ASCI Red and IBM Blue Gene. Intel is aiming to develop a stable release by the time the Aurora exascale system is ready, so it can deploy mOS there.

GIGABYTE and Northern Data AG Agree to Develop HPC Data Centers

GIGABYTE Technology, an industry leader in high-performance servers and workstations, today is announcing a partnership with Germany-based Northern Data AG, one of the world's largest providers of High-Performance Computing (HPC) solutions, to develop distributed computing in Northern Data's data centers. These data centers, with GIGABYTE-built HPC servers, will be located in the Nordics and managed by Northern Data.

GIGABYTE has more than 30 years of engineering expertise and success stories in the development and production of server solutions covering a myriad of uses, from AI servers to GPU-dense servers to scientific computing servers and more. The success of GIGABYTE's servers and workstations is attributable to the quality and resiliency inherent in its development and manufacturing processes, and to listening to customer needs to produce best-fit solutions.

Intel Optane Persistent Memory DAOS Solution Sets New World Record

Intel Optane persistent memory (PMem), in combination with Intel's open-source distributed asynchronous object storage (DAOS) solution, has set a new world record, soaring to the top of the Virtual Institute for I/O IO-500 list. With just 30 servers of Intel Optane PMem, Intel's DAOS solution beat today's best supercomputers and now ranks No. 1 for file system performance worldwide. These results validate the solution as the highest-performing distributed storage available today. They also demonstrate how Intel is changing the storage paradigm by giving customers the persistence of disk storage with the fine-grained, low-latency data access of memory in its Intel Optane PMem products.

"The recent IO-500 results for DAOS demonstrate the continuing maturity of the software's functionality enabled by a well-managed code development and testing process. The collaborative development program will continue to deliver additional capabilities for DAOS in support of Argonne's upcoming exascale system, Aurora," said Gordon McPheeters, HPC systems administration specialist at the Argonne Leadership Computing Facility.

GIGABYTE Announces G242-Z11 HPC Node with PCIe 4.0

GIGABYTE Technology, an industry leader in high-performance servers and workstations, today announced the launch of the GIGABYTE G242-Z11 with PCIe 4.0, which adds to an already extensive line of G242 series servers designed for AI, deep learning, data analytics, and scientific computing. High-speed interfaces such as Ethernet, InfiniBand, and PCI Express rely on fast data transfer, and PCIe 3.0 can pose a bottleneck in some servers. With the expansion of the AMD EPYC family of processors comes PCIe Gen 4.0, which helps servers avoid bottlenecking high-bandwidth applications. The 2nd Gen AMD EPYC 7002 processors added PCIe Gen 4.0, and GIGABYTE has introduced an ever-evolving line of servers to accommodate the latest technology.

The G242-Z11 caters to the capabilities of 2nd Gen AMD EPYC 7002 series processors. The G242-Z11 is built around a single AMD EPYC processor, including the new 280 W 64-core (128-thread) AMD EPYC 7H12. Besides a high core count, the 7002 series has 128 PCIe lanes and natively supports PCIe Gen 4.0, which offers double the speed and bandwidth of PCIe 3.0: 16 GT/s per lane, for a total bandwidth of 64 GB/s on an x16 link. As for memory, the G242-Z11 supports 8-channel DDR4 with room for up to 8 DIMMs. In this 1-DIMM-per-channel configuration, it can support up to 2 TB of memory at speeds up to 3200 MHz.
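Those headline numbers can be sanity-checked with a little arithmetic; a minimal sketch (Python) - note the 128b/130b encoding overhead and the 256 GB-per-DIMM figure behind the 2 TB ceiling are inferences, not numbers from the announcement:

```python
# PCIe 4.0 signals at 16 GT/s per lane; usable data rate is reduced by
# 128b/130b line encoding (an inference -- not stated in the announcement).
GT_PER_LANE = 16
ENCODING = 128 / 130

def pcie_gbs_per_direction(lanes: int) -> float:
    """Usable PCIe 4.0 bandwidth in GB/s, one direction."""
    return lanes * GT_PER_LANE * ENCODING / 8

x16_per_dir = pcie_gbs_per_direction(16)  # ~31.5 GB/s each way
x16_bidir = 2 * x16_per_dir               # ~63 GB/s, the "64 GB/s" headline

# The 2 TB ceiling implies 256 GB per DIMM across 8 slots (an inference).
max_memory_tb = 8 * 256 / 1024

print(round(x16_bidir, 1), max_memory_tb)
```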

Chenbro Unveils 2U 8-Bay Rack Mount Server for Data Center

Chenbro has launched the RB23708, a Level 6, 2U rackmount server barebone designed for mission-critical, storage-focused applications in Data Center and HPC Enterprise. The RB23708 is pre-integrated with an Intel Server Board S2600WFTR that supports up to two 2nd Generation Intel Xeon Scalable "Cascade Lake" Processors.

The RB23708 is an easy-to-use barebones server solution that pre-integrates a 2-socket Intel Server Board to ensure a flexible, scalable design with mission-critical reliability. Notably, it offers Apache Pass, IPMI 2.0 & Redfish compliance, and includes Intel RSTe/Intel VROC options, providing an ideal solution for hosting Video, IMS, SaaS and similar storage-focused applications.