News Posts matching #HPC


Infortrend Launches U.2 NVMe Scale-out NAS Solution

Infortrend Technology, Inc., the industry-leading enterprise storage provider, has launched a U.2 SSD solution for its EonStor CS scale-out NAS. The new all-flash CS 4014U targets workloads that demand high throughput and low latency, such as media & entertainment (M&E), HPC, and Big Data.

EonStor CS is a scale-out NAS system that expands capacity and increases performance linearly as nodes are added. CS provides complete data protection and high availability to avoid data loss and system downtime caused by disk damage or system failures. Each node of the CS 4014U model can be fitted with 14 U.2 SSDs, and a 5-node cluster can reach 20 GB/s throughput.
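
The linear-scaling claim can be turned into a rough estimate. The sketch below is only an illustration: it assumes throughput grows strictly in proportion to node count, and the helper name projected_throughput is hypothetical, not an Infortrend tool or API.

```python
# Rough scale-out estimate, ASSUMING perfectly linear scaling with node count.
CLUSTER_NODES = 5      # cluster size quoted in the article
CLUSTER_GBPS = 20.0    # GB/s quoted for that cluster

# Implied per-node contribution under the linear assumption.
per_node_gbps = CLUSTER_GBPS / CLUSTER_NODES


def projected_throughput(nodes: int) -> float:
    """Hypothetical helper: estimated cluster throughput in GB/s."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes * per_node_gbps


if __name__ == "__main__":
    for n in (1, 5, 8):
        print(f"{n} node(s): ~{projected_throughput(n):.0f} GB/s")
```

Real clusters rarely scale perfectly, so figures beyond the quoted 5-node data point should be read as an upper bound.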

AMD MI200 "Aldebaran" Memory Size of 128GB Per Package Confirmed

The 128 GB per-package memory size of AMD's upcoming Instinct MI200 HPC accelerator was confirmed in a document released by the Pawsey Supercomputing Centre, a Perth, Australia-based supercomputing facility that's popular with mineral prospecting companies located there. The centre is currently working on Setonix, a 50-petaFLOP supercomputer being put together by Hewlett Packard Enterprise, which combines over 750 next-generation "Aldebaran" GPUs (referenced only as "AMD MI-Next GPUs") and over 200,000 AMD EPYC "Milan" processor cores (the actual processor package count would be lower, and depend on the various core configs the builder is using).

The Pawsey document mentions 128 GB as the per-GPU memory. This corresponds with the rumored per-package memory of "Aldebaran." Recently imagined by Locuza_, an enthusiast who specializes in annotation of logic silicon dies, "Aldebaran" is a multi-chip module of two logic dies and eight HBM2E stacks. Each of the two logic dies, or chiplets, has 8,192 CDNA2 stream processors that add up to 16,384 on the package; and each of the two dies is wired to four HBM2E stacks over a 4096-bit memory bus. These are 128 Gbit (16 GB) stacks, so we have 64 GB of memory per logic die, and 128 GB on the package. Find other drool-worthy specs of the Pawsey Setonix in the screengrab below.
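
The capacity arithmetic in the paragraph above can be checked with a few lines of Python. This is a minimal sketch that uses only the figures quoted in the text; the variable names are illustrative.

```python
# Memory-capacity arithmetic for the rumored "Aldebaran" package,
# using only the figures quoted in the article.
GBIT_PER_STACK = 128       # HBM2E stack density, in gigabits
STACKS_PER_DIE = 4         # stacks wired to each logic die
DIES_PER_PACKAGE = 2       # logic dies (chiplets) per package
BUS_BITS_PER_DIE = 4096    # memory bus width per logic die

gb_per_stack = GBIT_PER_STACK // 8                 # 8 bits per byte
gb_per_die = gb_per_stack * STACKS_PER_DIE
gb_per_package = gb_per_die * DIES_PER_PACKAGE
bus_bits_per_stack = BUS_BITS_PER_DIE // STACKS_PER_DIE

if __name__ == "__main__":
    print(f"{gb_per_stack} GB per stack")
    print(f"{gb_per_die} GB per logic die")
    print(f"{gb_per_package} GB per package")
    print(f"{bus_bits_per_stack}-bit interface per stack")
```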

AMD CDNA2 "Aldebaran" MI200 HPC Accelerator with 256 CU (16,384 cores) Imagined

AMD Instinct MI200 will be an important product for the company in the HPC and AI supercomputing market. It debuts the CDNA2 compute architecture, and is based on a multi-chip module (MCM) codenamed "Aldebaran." PC enthusiast Locuza, who conjures highly detailed architecture based on public information, imagined what "Aldebaran" could look like. The MCM contains two logic dies, and eight HBM2E stacks. Each of the two dies has a 4096-bit HBM2E interface, which talks to 64 GB of memory (128 GB per package). A silicon interposer provides microscopic wiring among the ten dies.

Each of the two logic dies, or chiplets, has eight shader engines with 16 compute units (CU) each. The CDNA2 compute unit is capable of full-rate FP64, packed FP32 math, and Matrix Engines V2 (fixed function hardware for matrix multiplication, accelerating DNN building, training, and AI inference). With 128 CUs per chiplet, assuming the CDNA2 CU has 64 stream processors, one arrives at 8,192 SP. Two such dies add up to a whopping 16,384, more than three times that of the "Navi 21" RDNA2 silicon. Each die further features its own independent PCIe interface, and XGMI (AMD's rival to CXL), an interconnect designed for high-density HPC scenarios. A rudimentary VCN (Video CoreNext) component is also present. It's important to note here that neither the CDNA2 CU nor the "Aldebaran" MCM itself doubles as a GPU, since both lack much of the hardware needed for graphics processing. The MI200 is expected to launch later this year.
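
The stream-processor count follows from simple multiplication. The sketch below starts from the 128 CUs per chiplet and the assumed 64 stream processors per CU stated above. The "Navi 21" figure of 5,120 stream processors (80 CUs of 64 each) is not from this article and is included only to check the "more than three times" comparison.

```python
# Stream-processor arithmetic for the imagined "Aldebaran" layout.
CUS_PER_DIE = 128          # compute units per chiplet, per the article
SP_PER_CU = 64             # ASSUMED, as the article itself notes
DIES_PER_PACKAGE = 2
NAVI21_SP = 5120           # outside figure: 80 CUs x 64 SPs on "Navi 21"

sp_per_die = CUS_PER_DIE * SP_PER_CU
sp_per_package = sp_per_die * DIES_PER_PACKAGE
cus_per_package = CUS_PER_DIE * DIES_PER_PACKAGE
ratio_vs_navi21 = sp_per_package / NAVI21_SP

if __name__ == "__main__":
    print(f"{sp_per_die} SPs per die")
    print(f"{sp_per_package} SPs and {cus_per_package} CUs per package")
    print(f"{ratio_vs_navi21:.1f}x the SP count of Navi 21")
```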

AMD Leads High Performance Computing Towards Exascale and Beyond

At this year's International Supercomputing 2021 digital event, AMD (NASDAQ: AMD) is showcasing momentum for its AMD EPYC processors and AMD Instinct accelerators across the High Performance Computing (HPC) industry. The company also outlined updates to the ROCm open software platform and introduced the AMD Instinct Education and Research (AIER) initiative. The latest Top500 list showcased the continued growth of AMD EPYC processors for HPC systems. AMD EPYC processors power nearly 5x more systems compared to the June 2020 list, and more than double the number of systems compared to November 2020. In addition, AMD EPYC processors power half of the 58 new entries on the June 2021 list.

"High performance computing is critical to addressing the world's biggest and most important challenges," said Forrest Norrod, senior vice president and general manager, data center and embedded systems group, AMD. "With our AMD EPYC processor family and Instinct accelerators, AMD continues to be the partner of choice for HPC. We are committed to enabling the performance and capabilities needed to advance scientific discoveries, break the exascale barrier, and continue driving innovation."

Certain Intel Xeon "Sapphire Rapids" SKUs Come with On-Package HBM

Intel today, in its 2021 International Supercomputing Conference presentation, revealed that certain next-generation Xeon "Sapphire Rapids" SKUs come with on-package high-bandwidth memory (HBM). Given the context of its presentation, these could be special SKUs designed for high-density HPC setups, in which the processor package includes a certain amount of "PMEM" (package memory), besides the processor's 8-channel DDR5 memory interface.

The size of the HBM PMEM, and its position in the memory hierarchy, were detailed, too. Given its high-density applications, PMEM may not serve as a victim cache for the processor, but rather be capable of serving as main memory, with none of the DDR5 DRAM channels populated with DIMMs. On machines with DIMMs, the PMEM will serve as a victim cache for the processor's on-die last-level cache, accelerating the memory I/O. "The next-generation of Intel Xeon Scalable processors (code-named "Sapphire Rapids") will offer integrated High Bandwidth Memory (HBM), providing a dramatic boost in memory bandwidth and a significant performance improvement for HPC applications that operate memory bandwidth-sensitive workloads. Users can power through workloads using just High Bandwidth Memory or in combination with DDR5," says Intel.

New Intel XPU Innovations Target HPC and AI

At the 2021 International Supercomputing Conference (ISC), Intel is showcasing how the company is extending its lead in high performance computing (HPC) with a range of technology disclosures, partnerships and customer adoptions. Intel processors are the most widely deployed compute architecture in the world's supercomputers, enabling global medical discoveries and scientific breakthroughs. Intel is announcing advances in its Xeon processor for HPC and AI as well as innovations in memory, software, exascale-class storage, and networking technologies for a range of HPC use cases.

"To maximize HPC performance we must leverage all the computer resources and technology advancements available to us," said Trish Damkroger, vice president and general manager of High Performance Computing at Intel. "Intel is the driving force behind the industry's move toward exascale computing, and the advancements we're delivering with our CPUs, XPUs, oneAPI Toolkits, exascale-class DAOS storage, and high-speed networking are pushing us closer toward that realization."

NVIDIA and Global Partners Launch New HGX A100 Systems to Accelerate Industrial AI and HPC

NVIDIA today announced it is turbocharging the NVIDIA HGX AI supercomputing platform with new technologies that fuse AI with high performance computing, making supercomputing more useful to a growing number of industries.

To accelerate the new era of industrial AI and HPC, NVIDIA has added three key technologies to its HGX platform: the NVIDIA A100 80 GB PCIe GPU, NVIDIA NDR 400G InfiniBand networking, and NVIDIA Magnum IO GPUDirect Storage software. Together, they provide the extreme performance to enable industrial HPC innovation.

Intel Makes Changes to Executive Team, Raja Koduri Promoted

Intel CEO Pat Gelsinger announced the addition of two new technology leaders to its executive leadership team, as well as several changes to Intel business units. Current Intel executives Sandra Rivera and Raja Koduri will each take on new senior leadership roles, and technology industry veterans Nick McKeown and Greg Lavender will join the company.

"Since re-joining Intel, I have been impressed with the depth of talent and incredible innovation throughout the company, but we must move faster to fulfill our ambitions," said Gelsinger. "By putting Sandra, Raja, Nick and Greg - with their decades of technology expertise - at the forefront of some of our most essential work, we will sharpen our focus and execution, accelerate innovation, and unleash the deep well of talent across the company."

Tachyum Receives Prodigy FPGA DDR-IO Motherboard to Create Full System Emulation

Tachyum Inc. today announced that it has taken delivery of an IO motherboard for its Prodigy Universal Processor hardware emulator from manufacturing. This provides the company with a complete system prototype integrating CPU, memory, PCI Express, networking and BMC management subsystems when connected to the previously announced field-programmable gate array (FPGA) emulation system board.

The Tachyum Prodigy FPGA DDR-IO Board connects to the Prodigy FPGA CPU Board to provide memory and IO connectivity for the FPGA-based CPU tiles. The fully functional Prodigy emulation system is now ready for further build out, including Linux boot and incorporation of additional test chips. It is available to customers to perform early testing and software development prior to a full four-socket reference design motherboard, which is expected to be available Q4 2021.

AMD Instinct MI200 "Aldebaran" to Launch Later This Year

AMD's next-generation HPC accelerator card, the Instinct MI200, is expected to launch later this year. CEO Dr Lisa Su, speaking at a financial event hosted by JPMorgan, stated that the company would launch the next generation of its CDNA architecture this year. The card debuts the company's new CDNA2 compute architecture, and is on its way to supercomputers already announced. The Instinct MI200 HPC accelerator card is based on the new "Aldebaran" compute accelerator package, a multi-chip module that combines not just compute silicon and memory dies, but multiple compute dies.

Intel "Sapphire Rapids" Xeon Processors Use "Golden Cove" CPU Cores, Company Clarifies in Linux Kernel Dev E-Mail Chain

Intel's upcoming Xeon "Sapphire Rapids" processors, which debut in the second half of 2021, will feature up to 80 "Golden Cove" CPU cores, and not the previously rumored "Willow Cove." This was clarified by an Intel developer in a Linux Kernel code e-mail chain. "Golden Cove" CPU cores are more advanced than the "Willow Cove" cores found in current-generation Intel products, such as the client "Tiger Lake" processors. Intel stated that "Golden Cove" introduces an IPC gain over "Willow Cove" (expressed as "ST perf"), increased AI inference performance from an updated GNA component, "network and 5G perf," which is possibly some form of network stack acceleration, and additional security features.

Over in the client segment, the 12th Gen Core "Alder Lake" processor debuts a client variant of "Golden Cove." The "Alder Lake-S" silicon features eight "Golden Cove" cores serving as the "big" performance cores, next to eight "little" low-power "Gracemont" cores. The client and server implementations of "Golden Cove" could differ mainly in the ISA, with the client chip receiving slimmed-down AVX-512 and DLBoost instruction sets that keep only client-relevant instructions. The server variant, in addition to being optimized for a high core-count multi-core topology, could feature a more substantial AVX-512 and DLBoost implementation relevant for HPC use-cases.

AMD EPYC 7003 Processors to Power Singapore's Fastest Supercomputer

AMD announced that AMD EPYC 7003 Series processors will be used to power a new supercomputer for the National Supercomputing Centre (NSCC) Singapore, the national high-performance computing (HPC) resource center dedicated to supporting science and engineering computing needs.

The system will be based on the HPE Cray EX supercomputer and will use a combination of the EPYC 7763 and EPYC 75F3 processors. The supercomputer is planned to be fully operational by 2022 and is expected to have a peak theoretical performance of 10 petaFLOPS, 8x faster than NSCC's existing pool of HPC resources. Researchers will use the system to advance scientific research across biomedicine, genomics, diseases, climate, and more.

UK Competition Regulator Probes AMD's Buyout of Xilinx

The British competition regulator, the Competition and Markets Authority (CMA), on Monday launched an enquiry into the ramifications of AMD's buy-out of FPGA maker Xilinx. The agency is soliciting opinions from the public on whether the $35 billion all-stock purchase will make goods and services less competitive for the UK. Unlike NVIDIA's Arm buyout, the Xilinx acquisition is seeing no opposition from tech giants. The Register notes that AMD could combine Xilinx's FPGAs with its x86 CPU and RDNA SIMD to create highly customizable HPC accelerators. AMD president Dr Lisa Su said "By combining our world-class engineering team and deep domain expertise, we will create an industry leader with the vision, talent and scale to define the future of high performance computing."

Samsung Unveils Industry-First Memory Module Incorporating New CXL Interconnect

Samsung Electronics Co., Ltd., the world leader in advanced memory technology, today unveiled the industry's first memory module supporting the new Compute Express Link (CXL) interconnect standard. Integrated with Samsung's Double Data Rate 5 (DDR5) technology, this CXL-based module will enable server systems to significantly scale memory capacity and bandwidth, accelerating artificial intelligence (AI) and high-performance computing (HPC) workloads in data centers.

The rise of AI and big data has been fueling the trend toward heterogeneous computing, where multiple processors work in parallel to process massive volumes of data. CXL—an open, industry-supported interconnect based on the PCI Express (PCIe) 5.0 interface—enables high-speed, low latency communication between the host processor and devices such as accelerators, memory buffers and smart I/O devices, while expanding memory capacity and bandwidth well beyond what is possible today. Samsung has been collaborating with several data center, server and chipset manufacturers to develop next-generation interface technology since the CXL consortium was formed in 2019.

Intel Ponte Vecchio GPU Scores Another Win in Leibniz Supercomputing Centre

Today, Lenovo, in partnership with Intel, announced that the Leibniz Supercomputing Centre (LRZ) is building a supercomputer powered by Intel's next-generation technologies. Specifically, the supercomputer will use Intel's Sapphire Rapids CPUs in combination with the highly-teased Ponte Vecchio GPUs to power the applications running at LRZ. Along with the various processors, LRZ will also deploy Intel Optane persistent memory to process the huge amount of data it holds and continues to produce. The integration of HPC and AI processing will be enabled by an expansion of LRZ's current supercomputer, SuperMUC-NG, which will receive an upgrade in 2022 featuring both Sapphire Rapids and Ponte Vecchio.

Raja Koduri, Intel's graphics guru, teased on Twitter that this supercomputer installation will combine Sapphire Rapids, Ponte Vecchio, Optane, and oneAPI in one machine. The system will use over one petabyte of Distributed Asynchronous Object Storage (DAOS) based on Optane technologies. Mr. Koduri also teased some Ponte Vecchio eye candy, a GIF of tiles combining to form a GPU, which you can check out here. You can also see some pictures of Ponte Vecchio below.

Samsung Announces Availability of Its Next Generation 2.5D Integration Solution I-Cube4 for High-Performance Applications

Samsung Electronics Co., Ltd., a world leader in advanced semiconductor technology, today announced the immediate availability of its next-generation 2.5D packaging technology Interposer-Cube4 (I-Cube4), leading the evolution of chip packaging technology once again. Samsung's I-Cube™ is a heterogeneous integration technology that horizontally places one or more logic dies (CPU, GPU, etc.) and several High Bandwidth Memory (HBM) dies on top of a silicon interposer, making multiple dies operate as a single chip in one package.

Samsung's new I-Cube4, which incorporates four HBMs and one logic die, was developed in March as the successor of I-Cube2. From high-performance computing (HPC) to AI, 5G, cloud and large data center applications, I-Cube4 is expected to bring another level of fast communication and power efficiency between logic and memory through heterogeneous integration.

Arm Announces Neoverse N2 and V1 Server Platforms

The demands of data center workloads and internet traffic are growing exponentially, and new solutions are needed to keep up with these demands while reducing the current and anticipated growth of power consumption. But the variety of workloads and applications being run today means the traditional one-size-fits-all approach to computing is not the answer. The industry demands flexibility; design freedom to achieve the right level of compute for the right application.

As Moore's Law comes to an end, solution providers are seeking specialized processing. Enabling specialized processing has been a focal point since the inception of our Neoverse line of platforms, and we expect these latest additions to accelerate this trend.

Foundry Revenue Projected to Reach Historical High of US$94.6 Billion in 2021 Thanks to High 5G/HPC/End-Device Demand, Says TrendForce

As the global economy enters the post-pandemic era, technologies including 5G, WiFi6/6E, and HPC (high-performance computing) have been advancing rapidly, in turn bringing about a fundamental, structural change in the semiconductor industry as well, according to TrendForce's latest investigations. While the demand for certain devices such as notebook computers and TVs underwent a sharp uptick due to the onset of the stay-at-home economy, this demand will return to pre-pandemic levels once the pandemic has been brought under control as a result of the global vaccination drive. Nevertheless, the worldwide shift to next-gen telecommunication standards has brought about a replacement demand for telecom and networking devices, and this demand will continue to propel the semiconductor industry, resulting in high capacity utilization rates across the major foundries. As certain foundries continue to expand their production capacities this year, TrendForce expects total foundry revenue to reach a historical high of US$94.6 billion this year, an 11% growth YoY.

Intel CEO on NVIDIA CPUs: They Are Responding to Us

NVIDIA recently announced Grace, the company's first standalone CPU, which will come out as a product in 2023. NVIDIA has designed Grace on the Arm ISA, likely Armv9, to represent a new way that data centers are built and to deliver a whole new level of HPC and AI performance. However, the data center CPU space is considered one of the hardest markets to enter. The market is largely a duopoly between Intel and AMD, which supply x86 processors to server vendors. In the past few years, few Arm CPUs have managed to enter the data center space; NVIDIA, however, is aiming to deliver much more performance and grab a bigger piece of the market.

As a self-proclaimed leader in AI, Intel is facing hard competition from NVIDIA in the coming years. In an interview with Fortune, Intel's new CEO Pat Gelsinger talked about NVIDIA and how the company sees the competition between the two. Mr. Gelsinger claims that Intel is a leader in CPUs that feature AI acceleration built into the chip, and that the company is not playing defense, but rather offense against NVIDIA. You can check out the whole quote from the interview below.

KIOXIA PCIe 4.0 NVMe SSDs Now Qualified for NVIDIA Magnum IO GPUDirect Storage

KIOXIA today announced that its lineup of CM6 Series PCIe 4.0 enterprise NVMe SSDs has been successfully tested and certified to support NVIDIA's Magnum IO GPUDirect Storage. Modern AI and data science applications are synonymous with massive datasets - as are the storage requirements that go along with them. Part of the NVIDIA Magnum IO subsystem designed for GPU-accelerated compute environments, NVIDIA Magnum IO GPUDirect Storage allows the GPU to bypass the CPU and communicate directly with NVMe SSD storage. This improves overall system performance while reducing the impact on host CPU and memory resources. Through rigorous testing conducted by NVIDIA, KIOXIA's CM6 drives have been confirmed to meet the demanding storage requirements of GPU-intensive applications.

"Large AI/ML, HPC modeling and data analytics datasets need to be moved and processed in real-time, pushing performance requirements through the roof," said Neville Ichhaporia, vice president, SSD marketing and product management for KIOXIA America, Inc. "By delivering speeds up to 16.0 gigatransfers per second throughput per lane, our CM6 Series SSDs enable NVIDIA's Magnum IO GPUDirect Storage to work with increasingly large and distributed datasets, thereby improving overall application performance and providing a path to scaling dataset sizes even further."
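
For a sense of scale, the quoted 16.0 GT/s per lane can be converted into usable bandwidth. The sketch below relies on the 128b/130b line encoding that PCIe 4.0 uses and, as an assumption not stated in the article, a four-lane (x4) link. It ignores packet and protocol overhead, so real-world drive throughput will be lower.

```python
# Convert a PCIe transfer rate (GT/s per lane) into raw link bandwidth.
GT_PER_S_PER_LANE = 16.0   # quoted in the article (PCIe 4.0 rate)
PAYLOAD_BITS = 128         # 128b/130b encoding: 128 payload bits...
ENCODED_BITS = 130         # ...carried in every 130 bits on the wire
LANES = 4                  # ASSUMPTION: x4 link, not stated in the article


def lane_gbytes_per_s(gt_per_s: float) -> float:
    """Raw one-direction bandwidth of a single lane, in GB/s."""
    return gt_per_s * PAYLOAD_BITS / ENCODED_BITS / 8


per_lane = lane_gbytes_per_s(GT_PER_S_PER_LANE)
per_link = per_lane * LANES

if __name__ == "__main__":
    print(f"~{per_lane:.2f} GB/s per lane, ~{per_link:.2f} GB/s over x{LANES}")
```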

OpenFive Tapes Out SoC for Advanced HPC/AI Solutions on TSMC 5 nm Technology

OpenFive, a leading provider of customizable, silicon-focused solutions with differentiated IP, today announced the successful tape out of a high-performance SoC on TSMC's N5 process, with integrated IP solutions targeted for cutting edge High Performance Computing (HPC)/AI, networking, and storage solutions.

The SoC features an OpenFive High Bandwidth Memory (HBM3) IP subsystem and D2D I/Os, as well as a SiFive E76 32-bit CPU core. The HBM3 interface supports 7.2 Gbps speeds allowing high throughput memories to feed domain-specific accelerators in compute-intensive applications including HPC, AI, Networking, and Storage. OpenFive's low-power, low-latency, and highly scalable D2D interface technology allows for expanding compute performance by connecting multiple dice together using an organic substrate or a silicon interposer in a 2.5D package.
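
The 7.2 Gbps figure is a per-pin data rate, so per-stack bandwidth depends on the interface width. As a rough sketch, assuming the 1024-bit data interface that HBM stacks conventionally use (the article does not state the width of OpenFive's subsystem):

```python
# Peak per-stack bandwidth implied by a per-pin HBM data rate.
GBPS_PER_PIN = 7.2         # quoted in the article
INTERFACE_BITS = 1024      # ASSUMPTION: conventional HBM stack width


def stack_gbytes_per_s(gbps_per_pin: float, width_bits: int) -> float:
    """Peak bandwidth of one stack in GB/s (8 bits per byte)."""
    return gbps_per_pin * width_bits / 8


peak_bandwidth = stack_gbytes_per_s(GBPS_PER_PIN, INTERFACE_BITS)

if __name__ == "__main__":
    print(f"~{peak_bandwidth:.1f} GB/s peak per stack")
```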

NVIDIA Announces Grace CPU for Giant AI and High Performance Computing Workloads

NVIDIA today announced its first data center CPU, an Arm-based processor that will deliver 10x the performance of today's fastest servers on the most complex AI and high performance computing workloads.

The result of more than 10,000 engineering years of work, the NVIDIA Grace CPU is designed to address the computing requirements for the world's most advanced applications—including natural language processing, recommender systems and AI supercomputing—that analyze enormous datasets requiring both ultra-fast compute performance and massive memory. It combines energy-efficient Arm CPU cores with an innovative low-power memory subsystem to deliver high performance with great efficiency.

Intel Announces 10 nm Third Gen Xeon Scalable Processors "Ice Lake"

Intel today launched its most advanced, highest performance data center platform optimized to power the industry's broadest range of workloads—from the cloud to the network to the intelligent edge. New 3rd Gen Intel Xeon Scalable processors (code-named "Ice Lake") are the foundation of Intel's data center platform, enabling customers to capitalize on some of the most significant business opportunities today by leveraging the power of AI.

New 3rd Gen Intel Xeon Scalable processors deliver a significant performance increase compared with the prior generation, with an average 46% improvement on popular data center workloads. The processors also add new and enhanced platform capabilities including Intel SGX for built-in security, and Intel Crypto Acceleration and Intel DL Boost for AI acceleration. These new capabilities, combined with Intel's broad portfolio of Intel Select Solutions and Intel Market Ready Solutions, enable customers to accelerate deployments across cloud, AI, enterprise, HPC, networking, security and edge applications.

Raja Koduri Teases "Petaflops in Your Palm" Intel Xe-HPC Ponte Vecchio GPU

Raja Koduri of Intel today posted an interesting video on his Twitter account. Showing one of the greatest engineering marvels Intel has ever created, Mr. Koduri teased what is to come when the company launches the Xe-HPC Ponte Vecchio graphics card designed for high-performance computing workloads. Showcased today was the "petaflops in your palm" chip, designed to run AI workloads with a petaflop of computing power. With over 100 billion transistors, the chip uses as many as 47 tiles combined in the most advanced packaging technology ever created by Intel. They call them "magical tiles", and they bring logic, memory, and I/O controllers, all built using different semiconductor nodes.

Mr. Koduri also pointed out that the chip was born only two years after the concept, which is an awesome achievement, given that research on new silicon takes years. The chip will be the heart of many systems that require massive computational power, especially for workloads such as AI. With a claimed capability of a quadrillion floating-point operations per second (one petaflop), the chip will be a true monster. So far we don't know other details, like the floating-point precision at which it reaches one petaflop or the total power consumption of those 47 tiles, so we have to wait for more details.

Intel to Launch 3rd Gen Intel Xeon Scalable Portfolio on April 6

Intel today revealed that it will launch its 3rd Generation Xeon Scalable processor series at an online event titled "How Wonderful Gets Done 2021," on April 6, 2021. This will be one of the first major media events headed by Intel's new CEO, Pat Gelsinger. Besides the processor launch, Intel is expected to detail many of its advances in the enterprise space, particularly in the areas of 5G infrastructure rollout, edge computing, and AI/HPC. The 3rd Gen Xeon Scalable processors are based on the new 10 nm "Ice Lake-SP" silicon, heralding the company's first CPU core IPC gain in the server space since 2015. The processors also introduce new I/O capabilities, such as PCI-Express 4.0.
Copyright © 2004-2021 www.techpowerup.com. All rights reserved.
All trademarks used are properties of their respective owners.