News Posts matching "HPC"


NVIDIA Launches Maxed-out GP102 Based Quadro P6000

Late last week, NVIDIA announced the TITAN X Pascal, its fastest consumer graphics offering, targeted at gamers and PC enthusiasts. The TITAN X Pascal's reign as the fastest single-GPU graphics card could be short-lived, however, as NVIDIA has announced a Quadro product based on the same "GP102" silicon that maxes out its on-die resources. The new Quadro P6000, announced at SIGGRAPH alongside the GP104-based Quadro P5000, features all 3,840 CUDA cores physically present on the chip.

Besides its 3,840 CUDA cores, the P6000 offers up to 12 TFLOP/s of FP32 (single-precision floating-point) performance. The card also features 24 GB of GDDR5X memory across the chip's 384-bit wide memory interface. The Quadro P5000, on the other hand, features 2,560 CUDA cores, up to 8.9 TFLOP/s of FP32 performance, and 16 GB of GDDR5X memory across a 256-bit wide memory interface. It's interesting to note that neither card features full FP64 (double-precision) machinery; that is relegated to NVIDIA's HPC product line, the Tesla P-series.
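For readers who want to check the headline numbers: peak FP32 throughput is conventionally CUDA cores × 2 FLOPs per cycle (one fused multiply-add) × clock speed. A minimal sketch that backs out the boost clocks implied by the quoted figures (the clocks themselves are derived, not stated in the announcement):

```python
# Peak FP32 TFLOP/s = CUDA cores * 2 FLOPs/cycle (FMA) * clock (GHz) / 1000.
def implied_boost_clock_ghz(peak_tflops, cuda_cores):
    """Back out the boost clock implied by a quoted peak-FP32 figure."""
    return peak_tflops * 1e12 / (cuda_cores * 2) / 1e9

# Quadro P6000: 12 TFLOP/s over 3,840 cores -> ~1.56 GHz
print(f"P6000 implied boost: ~{implied_boost_clock_ghz(12.0, 3840):.2f} GHz")
# Quadro P5000: 8.9 TFLOP/s over 2,560 cores -> ~1.74 GHz
print(f"P5000 implied boost: ~{implied_boost_clock_ghz(8.9, 2560):.2f} GHz")
```

Both figures land in the range typical of Pascal boost clocks, which suggests the quoted peaks are boost-clock numbers rather than base-clock ones.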

NVIDIA Accelerates Volta to May 2017?

Following the surprise TITAN X Pascal launch slated for 2nd August, it looks like NVIDIA's product-development cycle is running on steroids, with reports emerging that the company is accelerating the debut of its next-generation "Volta" architecture to May 2017, on the sidelines of next year's GTC. The architecture was originally scheduled to make its debut in 2018.

Much like "Pascal," the "Volta" architecture could first debut with HPC products, before moving on to the consumer graphics segment. NVIDIA could also retain the 16 nm FinFET+ process at TSMC for Volta. Stacked on-package memory such as HBM2 could be more readily available by 2017, and could hit sizable volumes towards the end of the year, making it ripe for implementation in high-volume consumer products.


Source: WCCFTech

NVIDIA Announces a PCI-Express Variant of its Tesla P100 HPC Accelerator

NVIDIA announced a PCI-Express add-on card variant of its Tesla P100 HPC accelerator at the 2016 International Supercomputing Conference, held in Frankfurt, Germany. The card is about 30 cm long, two slots thick, and of standard height, and is designed for multi-slot PCIe servers. The company had introduced the Tesla P100 earlier this year, in April, with a dense mezzanine form-factor variant for servers with NVLink.

The PCIe variant of the P100 offers slightly lower performance than the NVLink variant because of lower clock speeds, although the core configuration of the GP100 silicon remains unchanged. It offers FP64 (double-precision floating-point) performance of 4.70 TFLOP/s, FP32 (single-precision) performance of 9.30 TFLOP/s, and FP16 performance of 18.7 TFLOP/s, compared to the NVLink variant's 5.3 TFLOP/s, 10.6 TFLOP/s, and 21 TFLOP/s, respectively. The card comes in two sub-variants based on memory: a 16 GB variant with 720 GB/s of memory bandwidth and 4 MB of L2 cache, and a 12 GB variant with 548 GB/s and 3 MB of L2 cache. Both sub-variants feature 3,584 CUDA cores based on the "Pascal" architecture, and a core clock speed of 1300 MHz.
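The three throughput figures follow GP100's 1:2:4 FP64:FP32:FP16 rate ratio. A quick sketch reproducing them from the quoted core count and clock (the 2-FLOPs-per-cycle FMA convention is an assumption of the standard peak-throughput formula, not stated in the post):

```python
# Peak throughput sketch for the PCIe Tesla P100, from the quoted specs.
CUDA_CORES = 3584
CLOCK_GHZ = 1.300

fp32 = CUDA_CORES * 2 * CLOCK_GHZ / 1000  # TFLOP/s; 2 FLOPs/cycle per core (FMA)
fp64 = fp32 / 2                           # GP100 runs FP64 at half the FP32 rate
fp16 = fp32 * 2                           # ...and FP16 at twice the FP32 rate
print(f"FP32 ~{fp32:.2f}, FP64 ~{fp64:.2f}, FP16 ~{fp16:.1f} TFLOP/s")
```

This yields roughly 9.32, 4.66, and 18.6 TFLOP/s, close to the quoted 9.30/4.70/18.7 figures (the small FP64 discrepancy suggests NVIDIA's quoted numbers are independently rounded).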

NVIDIA "Pascal" GP100 Silicon Detailed

The upcoming "Pascal" GPU architecture from NVIDIA is shaping up to be a pixel-crunching monstrosity. Introduced as more of a number-cruncher in its Tesla P100 unveiling at GTC 2016, we got our hands on the block diagram of the "GP100" silicon that drives it. To begin with, the GP100 is a multi-chip module, much like AMD's "Fiji," consisting of a large GPU die, four memory stacks, and a silicon interposer acting as a substrate for the GPU and memory stacks, letting NVIDIA run microscopic wires between the two. The GP100 features a 4096-bit wide HBM2 memory interface, with typical memory bandwidths of up to 1 TB/s. On the P100, the memory ticks at 720 GB/s.

At the top level of its hierarchy, the GP100 is structured much like other NVIDIA GPUs, with the exception of two key interfaces: bus and memory. A PCI-Express gen 3.0 x16 host interface connects the GPU to the system, while the GigaThread Engine distributes workloads between six graphics processing clusters (GPCs). Eight memory controllers make up the 4096-bit wide HBM2 memory interface, and a new "High-speed Hub" component wires out four NVLink ports. At this point it's not known whether the 80 GB/s (per-direction) throughput figure applies to each port individually, or to all four ports put together.
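The "up to 1 TB/s" and 720 GB/s figures fall directly out of the bus width. A minimal sketch of the arithmetic, assuming HBM2's rated 2 Gbps-per-pin ceiling (a JEDEC figure, not stated in the post):

```python
# Total HBM2 interface bandwidth on GP100: the 4096-bit bus is 4 stacks * 1024 bits.
# GB/s = bus width in bits * per-pin data rate (Gbps) / 8 bits-per-byte.
def interface_bandwidth_gbs(bus_width_bits, pin_rate_gbps):
    return bus_width_bits * pin_rate_gbps / 8

# At HBM2's rated 2 Gbps/pin: ~1 TB/s.
print(f"{interface_bandwidth_gbs(4096, 2.0):.0f} GB/s at 2 Gbps/pin")
# The P100's quoted 720 GB/s implies a lower effective pin rate:
print(f"{720 * 8 / 4096:.2f} Gbps/pin on the P100")
```

So the P100 runs its memory at roughly 1.4 Gbps per pin, well under the interface's ceiling.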

AMD FirePro S9300 x2 Server GPU Helps Create Largest Map of the Universe

AMD today announced that researchers at the Canadian Hydrogen Intensity Mapping Experiment (CHIME) will harness the AMD FirePro S9300 x2 Server GPU, the world's fastest single-precision GPU accelerator, to analyze extraordinary amounts of data and help create a new, very detailed 3D map of the largest volume of the Universe ever observed. Rather than using traditional dish-shaped telescopes, CHIME consists of four 100-metre-long cylindrical reflectors, which cover an area larger than five professional hockey rinks and gather signals for the critical computational analyses performed by the AMD FirePro S9300 x2 GPU cluster.

The CHIME project was created to investigate the discovery that the expansion of the Universe is speeding up rather than slowing down. Using consumer technologies similar to those found in common radio receivers, the telescope collects radio waves that have travelled through space for up to 11 billion years and feeds them into a massive supercomputer powered by a series of AMD FirePro S9300 x2 GPUs. The intense number crunching required to map the Universe's expansion in this way was previously cost-prohibitive, but is now being enabled by AMD FirePro GPUs. The anticipated results will help create a highly-detailed map showing the intensity of the hydrogen radiation from billions of galaxies, which will help scientists understand the accelerating expansion of the Universe.

SK Hynix to Ship 4GB HBM2 Stacks by Q3-2016

Korean DRAM and NAND flash giant SK Hynix will be ready to ship its 4 GB stacked second-generation high-bandwidth memory (HBM2) chips in Q3 2016. These packages are made up of four 1 GB dies, with per-pin bandwidths of 1 Gbps, 1.6 Gbps, and 2 Gbps, working out to per-stack bandwidths of 128 GB/s, 204 GB/s, and 256 GB/s, respectively.
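The per-stack figures follow from HBM2's 1024-bit-wide stack interface (a JEDEC spec detail, not stated above). A quick sketch of the conversion:

```python
# HBM2 per-stack bandwidth: each stack exposes a 1024-bit (1024-pin) data bus,
# so bandwidth (GB/s) = 1024 pins * per-pin rate (Gbps) / 8 bits-per-byte.
def stack_bandwidth_gbs(pin_rate_gbps, pins=1024):
    return pins * pin_rate_gbps / 8

for rate in (1.0, 1.6, 2.0):
    print(f"{rate} Gbps/pin -> {stack_bandwidth_gbs(rate):.1f} GB/s per stack")
```

The 1.6 Gbps grade works out to 204.8 GB/s, which SK Hynix rounds down to the quoted 204 GB/s.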

These chips will target applications such as graphics cards, network infrastructure, HPC, and servers. The company is also designing 8 GB stacks, made up of eight 1 GB dies, aimed at HPC and server applications. In addition, it is offering cost-effective 2 GB, 2-die stacks for graphics cards, which could prove particularly important for the standard's competition against GDDR5X, particularly in mid-range and performance-segment graphics cards.

Source: Golem.de

AMD GPUOpen Fuels Seismic Supercomputing Efforts for Geoscience Leader CGG

AMD today announced that CGG, a pioneering global geophysical services and equipment company, has deployed AMD FirePro S9150 server GPUs to accelerate its geoscience oil and gas research efforts, harnessing more than 1 PetaFLOPS of GPU processing power. Employing AMD's HPC GPU Computing software tools available on GPUOpen.com, CGG rapidly converted its in-house NVIDIA CUDA code to OpenCL for seismic data processing running on an AMD FirePro S9150 GPU production cluster, enabling fast, cost-effective GPU-powered research.

"The installation of the large AMD GPU production cluster at CGG is a great example of how AMD's technology prowess in both HPC GPU hardware and in open source software tools combined to deliver incredible results," said Brian Reeves, senior director, product management, AMD Professional Graphics. "Energy research is a demanding and time-intensive task that stands to realize significant benefits from the use of GPU computation. GPUOpen software tools coupled with AMD FirePro S-series hardware enables an efficient and cost-effective solution for rapid data processing that can drive tremendous competitive advantage."

Samsung Begins Mass-Producing 4-Gigabyte HBM2 Memory Stacks

Samsung Electronics Co., Ltd., announced today that it has begun mass producing the industry's first 4-gigabyte (GB) DRAM package based on the second-generation High Bandwidth Memory (HBM2) interface, for use in high performance computing (HPC), advanced graphics and network systems, as well as enterprise servers. Samsung's new HBM solution will offer unprecedented DRAM performance - more than seven times faster than the current DRAM performance limit, allowing faster responsiveness for high-end computing tasks including parallel computing, graphics rendering and machine learning.

"By mass producing next-generation HBM2 DRAM, we can contribute much more to the rapid adoption of next-generation HPC systems by global IT companies," said Sewon Chun, senior vice president, Memory Marketing, Samsung Electronics. "Also, in using our 3D memory technology here, we can more proactively cope with the multifaceted needs of global IT, while at the same time strengthening the foundation for future growth of the DRAM market."

The newly introduced 4GB HBM2 DRAM, which uses Samsung's most efficient 20-nanometer process technology and advanced HBM chip design, satisfies the need for high performance, energy efficiency, reliability and small dimensions making it well suited for next-generation HPC systems and graphics cards.

Seagate Engineers Tiered Archive System for HPC Storage

Seagate Technology plc today introduced the ClusterStor A200, a first-of-its-kind tiered archive storage system for high-performance computing (HPC). The A200 helps reduce storage and operational costs by up to 50 percent compared to tier-one storage platforms.

The A200 was designed to complement the rest of Seagate's ClusterStor family of scale-out storage systems. It allows customers to non-disruptively migrate designated data off of the performance-optimized, primary storage tiers while keeping it online for fast retrieval. This avoids a common problem in shared HPC environments where the organization is forced to choose between having all of the data available to make the best analysis versus the time required to retrieve data from tape. Performance of the primary storage is often improved by migrating data and freeing up space for more efficient data layout. The pre-configured ClusterStor A200 solution includes an automatic policy-driven hierarchical storage management (HSM) system and near limitless scale-out capacity.

NVIDIA GPUs to Accelerate Microsoft Azure

NVIDIA today announced that Microsoft will offer NVIDIA GPU-enabled professional graphics applications and accelerated computing capabilities to customers worldwide through its cloud platform, Microsoft Azure. Deploying the latest version of NVIDIA GRID in its new N-Series virtual machine offering, Azure is the first cloud computing platform to provide NVIDIA GRID 2.0 virtualized graphics for enterprise customers.

For the first time, businesses will have the ability to deploy NVIDIA Quadro-grade professional graphics applications and accelerated computing on-premises, in the cloud through Azure, or via a hybrid of the two using both Windows and Linux virtual machines. Azure will also offer customers supercomputing-class performance, with the addition of the NVIDIA Tesla Accelerated Computing Platform's flagship Tesla K80 GPU accelerators, for the most computationally demanding data center and high performance computing (HPC) applications.

AMD and Dell Support Bioinformatics Studies at University of Warsaw in Poland

AMD today unveiled an innovation in heterogeneous high-performance computing (HPC), delivering more than 1.5 petaFLOPS of AMD FirePro S9150 server GPU performance to the Next Generation Sequencing Centre (NGSC) at the Centre of New Technologies, University of Warsaw, in support of bioinformatics research related to next-generation sequencing (NGS) studies. The new ORION cluster features 150 Dell PowerEdge R730 servers, each with two AMD FirePro S9150 server GPUs, for a total peak GPU performance of 1.52 petaFLOPS single-precision and 0.76 petaFLOPS double-precision. The energy-efficient cluster enables high-speed, efficient calculations on genomic data, applicable to a range of genomics and bioinformatics studies, using a fast and power-efficient OpenCL implementation for research applications.
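The cluster totals can be reproduced from the announcement's own numbers, assuming the S9150's published 5.07 TFLOP/s single-precision peak per GPU (a figure from AMD's spec sheet, not stated in the text):

```python
# ORION cluster peak arithmetic: 150 servers * 2 S9150 GPUs per server.
SERVERS = 150
GPUS_PER_SERVER = 2
S9150_FP32_TFLOPS = 5.07  # per-GPU single-precision peak (AMD spec, assumption)

gpus = SERVERS * GPUS_PER_SERVER
fp32_pflops = gpus * S9150_FP32_TFLOPS / 1000
fp64_pflops = fp32_pflops / 2  # the S9150 runs FP64 at half the FP32 rate
print(f"{gpus} GPUs -> ~{fp32_pflops:.2f} PFLOPS FP32, ~{fp64_pflops:.2f} PFLOPS FP64")
```

300 GPUs at 5.07 TFLOP/s each gives 1.52 PFLOPS single-precision and 0.76 PFLOPS double-precision, matching the announced figures.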

"We're committed to building our HPC leadership position in the industry as a foremost provider of computing applications, tools and technologies," said Sean Burke, corporate vice-president and general manager, AMD Professional Graphics. "This installation reaffirms AMD's leading role in HPC with the implementation of the AMD FirePro S9150 server GPUs in this 1.5 petaFLOPS supercomputer cluster. AMD and Dell are enabling OpenCL applications for critical science research usage for this cluster. AMD is proud to collaborate with Dell and NGSC to support such important life science and computer science research."

IBM, NVIDIA and Mellanox Launch Design Center for Big Data and HPC

IBM, in collaboration with NVIDIA and Mellanox, today announced the establishment of a POWER Acceleration and Design Center in Montpellier, France to advance the development of data-intensive research, industrial, and commercial applications. Born out of the collaborative spirit fostered by the OpenPOWER Foundation - a community co-founded in part by IBM, NVIDIA and Mellanox supporting open development on top of the POWER architecture - the new Center provides commercial and open-source software developers with technical assistance to enable them to develop high performance computing (HPC) applications.

Technical experts from IBM, NVIDIA and Mellanox will help developers take advantage of OpenPOWER systems leveraging IBM's open and licensable POWER architecture with the NVIDIA Tesla Accelerated Computing Platform and Mellanox InfiniBand networking solutions. These are the class of systems developed collaboratively with the U.S. Department of Energy for the next generation Sierra and Summit supercomputers and to be used by the United Kingdom's Science and Technology Facilities Council's Hartree Centre for big data research.

Seagate Announces ClusterStor Hadoop Workflow Accelerator

Seagate Technology plc, a world leader in storage solutions, today announced availability of the ClusterStor Hadoop Workflow Accelerator, a new solution providing tools, services, and support for High Performance Computing (HPC) customers who need the best-performing storage systems for Big Data analytics. The Hadoop Workflow Accelerator is a set of Hadoop optimization tools, services, and support that leverages and enhances the performance of ClusterStor, the market-leading scale-out storage system designed for Big Data analysis.

Computationally intensive High Performance Data Analytics (HPDA) environments will benefit from significant reductions in data transfer time with the Hadoop Workflow Accelerator. This solution also includes the Hadoop on Lustre Connector, which allows both Hadoop and HPC Lustre clusters to use exactly the same data without having to move the data between file systems or storage devices.

Intel Reveals Details for Future HPC System Building Blocks

Intel Corporation today announced several new and enhanced technologies bolstering its leadership in high-performance computing (HPC). These include disclosure of the future-generation Intel Xeon Phi processor, code-named Knights Hill, and new architectural and performance details for the Intel Omni-Path Architecture, a new high-speed interconnect technology optimized for HPC deployments.

Intel also announced new software releases and collaborative efforts designed to make it easier for the HPC community to extract the full performance potential from current and future Intel industry-standard hardware. Together, these new HPC building blocks and industry collaborations will help to address the dual challenges of extreme scalability and mainstream use of HPC while providing the foundation for a cost-effective path to exascale computing.

Cray Launches New High Density Cluster Packed With NVIDIA GPU Accelerators

Global supercomputer leader Cray Inc. today announced the launch of the Cray CS-Storm -- a high-density accelerator compute system based on the Cray CS300 cluster supercomputer. Featuring up to eight NVIDIA Tesla GPU accelerators and a peak performance of more than 11 teraflops per node, the Cray CS-Storm system is one of the most powerful single-node cluster architectures available today.

Designed to support highly scalable applications in areas such as energy, life sciences, financial services, and geospatial intelligence, the Cray CS-Storm provides exceptional performance, energy efficiency and reliability within a small footprint. The system leverages the supercomputing architecture of the air-cooled Cray CS300 system, and includes the Cray Advanced Cluster Engine cluster management software, the complete Cray Programming Environment on CS, and NVIDIA Tesla K40 GPU accelerators. The Cray CS-Storm system includes Intel Xeon E5-2600 v2 processors.

AMD Introduces the FirePro S9150 Server Card

AMD today announced the AMD FirePro S9150 server card -- the most powerful server Graphics Processing Unit (GPU) ever built for High Performance Computing. Based on the AMD Graphics Core Next (GCN) architecture, the first AMD architecture designed specifically with compute workloads in mind, the AMD FirePro S9150 server card is the first server card to support enhanced double precision and break the 2.0 TFLOPS double precision barrier. With 16 GB of GDDR5 memory -- 33 percent more than the competition -- and maximum power consumption of 235 watts, AMD FirePro S9150 server GPUs provide supercomputers with massive compute performance while maximizing available power budgets.
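The "enhanced double precision" claim refers to GCN Hawaii's half-rate FP64 in this card. A hedged sketch of how the S9150 clears the 2.0 TFLOPS barrier, assuming its published 2,816 stream processors and 900 MHz engine clock (figures from AMD's spec sheet, not stated in the announcement):

```python
# AMD FirePro S9150 peak-throughput sketch (spec-sheet figures, assumptions).
STREAM_PROCESSORS = 2816
CLOCK_GHZ = 0.900

fp32_tflops = STREAM_PROCESSORS * 2 * CLOCK_GHZ / 1000  # 2 FLOPs/cycle (FMA)
fp64_tflops = fp32_tflops / 2                           # half-rate FP64 on this card
print(f"FP32 ~{fp32_tflops:.2f} TFLOP/s, FP64 ~{fp64_tflops:.2f} TFLOP/s")
```

That works out to roughly 5.07 TFLOP/s single-precision and 2.53 TFLOP/s double-precision, comfortably past the 2.0 TFLOPS mark the announcement cites.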

"Today's supercomputers feature an increasing mix of GPUs, CPUs and co-processors to achieve great performance, and many of them are being implemented in an environmentally responsible manner to help reduce power and water consumption," said David Cummings, senior director and general manager, professional graphics, AMD. "Designed for large scale multi-GPU support and unmatched compute performance, AMD FirePro S9150 ushers in a new era of supercomputing. Its memory configuration, compute capabilities and performance per watt are unmatched in its class, and can help take supercomputers to the next level of performance and energy efficiency."

Puget Systems Launches New Quad CPU Workstations

Puget Systems has been providing quad socket workstations for years now. Today, we refresh that product with a new duo of quad socket workstations that offer even more capacity, better cooling, and quieter operation. The new Peak Quad CPU workstations come in both Intel and AMD varieties. Our intention with this refresh is to take some of the highest performance workstation configurations available today and make them something you can put in your lab or office. Most workstations and servers of this caliber come with a prohibitive noise level but the Peak line of workstations solves this problem while still providing excellent cooling and long component lifespan.

In addition to supporting four CPUs, these workstations also support other high performance options such as large SSD arrays and accelerator cards including Intel Xeon Phi and NVIDIA Tesla. Most importantly, we have designed these workstations to be flexible. With systems at this level, it is typical for us to plan, design, implement and test a custom solution for each customer and use case. If you are unsure whether Peak is right for you, just ask! We have dedicated staff on hand for HPC, parallel and cluster computing.

Eurotech, AppliedMicro and NVIDIA Develop New HPC System Architecture

Eurotech, a leading provider of embedded and supercomputing technologies, has teamed up with Applied Micro Circuits Corporation and NVIDIA to develop a new, original high performance computing (HPC) system architecture that combines extreme density and best-in-class energy efficiency. The new architecture is based on an innovative highly modular and scalable packaging concept.

Eurotech, which has years of significant experience in designing and manufacturing original HPC systems, has successfully developed an HPC systems architecture that optimizes the benefits of greater density, as well as the energy efficiency of ARM processors and high-performance GPU accelerators.

Micron Collaborates With Intel to Enhance "Knights Landing"

Micron Technology, Inc. (Nasdaq:MU), one of the world's leading providers of advanced semiconductor solutions, today announced an ongoing collaboration with Intel to deliver an on-package memory solution for Intel's next-generation Xeon Phi processor, codenamed Knights Landing. The memory solution is the result of a long-term effort between the two companies to break down the memory wall, leveraging the fundamental DRAM and stacking technologies also found in Micron's Hybrid Memory Cube products.

"The ecosystem is changing and the importance of scalable on-package memory and memory bandwidth is now coming to light," said Chirag Dekate, Research Manager at IDC. "Memory is at the heart of the solution space which will benefit both big compute and big data. This announcement is a clear validation of how Micron is advancing the role and impact of memory on systems and the value that 3D memory can deliver."

Intel Details Its Next-Gen Xeon Phi Processor

Intel Corporation today announced new details for its next-generation Intel Xeon Phi processors, code-named Knights Landing, which promise to extend the benefits of code-modernization investments being made for current-generation products. These include a new high-speed fabric that will be integrated on-package, and high-bandwidth on-package memory which, combined, promise to accelerate the rate of scientific discovery. Currently, memory and fabrics are available as discrete components in servers, limiting the performance and density of supercomputers.

The new interconnect technology, called Intel Omni Scale Fabric, is designed to address the requirements of the next generations of high-performance computing (HPC). Intel Omni Scale Fabric will be integrated in the next generation of Intel Xeon Phi processors as well as future general-purpose Intel Xeon processors. This integration along with the fabric's HPC-optimized architecture is designed to address the performance, scalability, reliability, power and density requirements of future HPC deployments. It is designed to balance price and performance for entry-level through extreme-scale deployments.

NVIDIA GPUs Open the Door to ARM64 Entry Into High Performance Computing

NVIDIA today announced that multiple server vendors are leveraging the performance of NVIDIA GPU accelerators to launch the world's first 64-bit ARM development systems for high performance computing (HPC).

ARM64 server processors were primarily designed for micro-servers and web servers because of their extreme energy efficiency. Now, they can tackle HPC-class workloads when paired with GPU accelerators using the NVIDIA CUDA 6.5 parallel programming platform, which supports 64-bit ARM processors.

ASUS Unveils the ESC4000 G2S Series HPC GPU Servers

ASUS today announced ESC4000 G2S, a new 2U-sized server series based on the dual Intel Xeon E5-2600 v2 processor platform and designed for use in environments that demand high-density GPU/coprocessor servers.

The new ESC4000 G2S series servers feature a highly optimized thermal design, six hot-swappable 2.5-inch SATA drive bays and nine PCI Express 3.0 (PCIe 3.0) x16 expansion slots. The innovative and thoughtful design delivers high-density computing power, easy scalability and exceptional energy efficiency, making ESC4000 G2S series servers the ideal choice for applications in the high-performance computing (HPC) fields of life and medical sciences, engineering science, financial modeling and virtualization.

Intel Brings Supercomputing Horsepower to Big Data Analytics

Intel Corporation unveiled innovations in HPC and announced new software tools that will help propel businesses and researchers to generate greater insights from their data and solve their most vital business and scientific challenges.

"In the last decade, the high-performance computing community has created a vision of a parallel universe where the most vexing problems of society, industry, government and research are solved through modernized applications," said Raj Hazra, Intel vice president and general manager of the Technical Computing Group. "Intel technology has helped HPC evolve from a technology reserved for an elite few to an essential and broadly available tool for discovery. The solutions we enable for ecosystem partners for the second half of this decade will drive the next level of insight from HPC. Innovations will include scale through standards, performance through application modernization, efficiency through integration and innovation through customized solutions."

AMD to Research Interconnect Architectures for High-Performance Computing

AMD today announced that it was selected for an award of $3.1 million for a research project associated with the U.S. Department of Energy (DOE) Extreme-Scale Computing Research and Development Program, known as "DesignForward." The DOE award is an expansion of work started as part of another two-year award AMD received in 2012 called "FastForward." The FastForward award aims to accelerate the research and development of processor and memory technologies needed to support extreme-scale computing. The DesignForward award supports the research of the interconnect architectures and technologies needed to support the data transfer capabilities in extreme-scale computing environments.

DesignForward is a jointly funded collaboration between the DOE Office of Science and the U.S. National Nuclear Security Administration (NNSA) to accelerate the research and development of critical technologies needed for extreme-scale computing, on the path toward Exascale computing. Exascale supercomputers are expected to be capable of performing computation hundreds of times faster than today's fastest computers, with only slightly higher power utilization.

Cray Adds NVIDIA Tesla K40 to Its Complete Line of Supercomputing Systems

Global supercomputer leader Cray Inc. today announced the Cray CS300 line of cluster supercomputers and the Cray XC30 supercomputers are now available with the NVIDIA Tesla K40 GPU accelerators. Designed to solve the most demanding supercomputing challenges, the NVIDIA Tesla K40 provides 40 percent higher peak performance than its predecessor, the Tesla K20X GPU.

"The addition of the NVIDIA K40 GPUs furthers our vision for Adaptive Supercomputing, which provides outstanding performance with a computing architecture that accommodates powerful CPUs and highly-advanced accelerators from leading technology companies like NVIDIA," said Barry Bolding, vice president of marketing at Cray. "We have proven that acceleration can be productive at high scalability with Cray systems such as 'Titan', 'Blue Waters', and most recently with the delivery of a Cray XC30 system at the Swiss National Supercomputing Centre (CSCS). Together with Cray's latest OpenACC 2.0 compiler, the new NVIDIA K40 GPUs can process larger datasets, reach higher levels of acceleration and provide more efficient compute performance, and we are pleased these features are now available to customers across our complete portfolio of supercomputing solutions."