News Posts matching "HPC"

Samsung Increases Production of 8 GB HBM2 Memory

Samsung Electronics Co., Ltd., the world leader in advanced memory technology, today announced that it is increasing the production volume of its 8-gigabyte (GB) High Bandwidth Memory-2 (HBM2) to meet growing market needs across a wide range of applications including artificial intelligence, HPC (high-performance computing), advanced graphics, network systems and enterprise servers.

"By increasing production of the industry's only 8GB HBM2 solution now available, we are aiming to ensure that global IT system manufacturers have sufficient supply for timely development of new and upgraded systems," said Jaesoo Han, executive vice president, Memory Sales & Marketing team at Samsung Electronics. "We will continue to deliver more advanced HBM2 line-ups, while closely cooperating with our global IT customers."

GIGABYTE Releases First Wave Of Products Based On Skylake Purley Architecture

GIGABYTE today announced its latest generation of servers based on Intel's Skylake Purley architecture. This new generation brings a wealth of new options in scalability - across compute, network and storage - to deliver solutions for any application, from the enterprise to the data center to HPC.

This server series adopts Intel's new product family - officially named the 'Intel Xeon Scalable family' - and utilizes its ability to meet the increasingly diverse requirements of the industry, from entry-level HPC to large scale clusters. The major development in this platform is around the improved features and functionality at both the host and fabric levels. These enable performance improvements - both natively on chip and for future extensibility through compute, network and storage peripherals. In practical terms, these new CPUs will offer up to 28 cores, and 48 PCIe lanes per socket.

NVIDIA Announces the Tesla V100 PCI-Express HPC Accelerator

NVIDIA formally announced the PCI-Express add-on card version of its flagship Tesla V100 HPC accelerator, based on its next-generation "Volta" GPU architecture. Based on the advanced 12 nm "GV100" silicon, the GPU is a multi-chip module with a silicon substrate and four HBM2 memory stacks. It features a total of 5,120 CUDA cores, 640 Tensor cores (specialized CUDA cores which accelerate neural-net building), GPU clock speeds of around 1370 MHz, and a 4096-bit wide HBM2 memory interface, with 900 GB/s memory bandwidth. The 815 mm² GPU has a gargantuan transistor-count of 21 billion. NVIDIA is taking institutional orders for the V100 PCIe, and the card will be available a little later this year. HPE will develop three HPC rigs with the cards pre-installed.
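As a rough sanity check on the memory figures quoted above, the sketch below splits the 4096-bit interface and 900 GB/s bandwidth across the four HBM2 stacks and derives the implied per-pin data rate. It assumes the interface is divided evenly between the stacks and that 1 GB/s equals 8 Gbps; the variable names are illustrative only.

```python
# Figures quoted in the article for the Tesla V100 PCIe.
BUS_WIDTH_BITS = 4096
MEMORY_BANDWIDTH_GB_PER_S = 900
HBM2_STACKS = 4

# Assumption: the interface and bandwidth are shared evenly by the stacks.
per_stack_width_bits = BUS_WIDTH_BITS // HBM2_STACKS
per_stack_bandwidth_gb_per_s = MEMORY_BANDWIDTH_GB_PER_S / HBM2_STACKS

# Convert GB/s to Gbps (x8) and divide by the number of data pins.
per_pin_rate_gbps = MEMORY_BANDWIDTH_GB_PER_S * 8 / BUS_WIDTH_BITS

print(f"Interface width per stack: {per_stack_width_bits} bits")
print(f"Bandwidth per stack: {per_stack_bandwidth_gb_per_s:.0f} GB/s")
print(f"Implied data rate per pin: {per_pin_rate_gbps:.2f} Gbps")
```

Under these assumptions each stack contributes 225 GB/s over a 1024-bit slice of the interface, at roughly 1.76 Gbps per pin.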

Could This be the NVIDIA TITAN Volta?

NVIDIA, which unveiled its faster "Volta" GPU architecture at its 2017 GPU Technology Conference (GTC), beginning with the HPC product Tesla V100, is closer to launching the consumer graphics variant, the TITAN Volta. A curious-looking graphics card image with "TITAN" markings surfaced on Reddit. One could discount the pic for being that of a well-made cooler mod, until you take a peek at the PCB. It appears to lack SLI fingers where you'd expect them to be, and instead has NVLink fingers in positions found on the PCIe add-in card variant of the Tesla P100 HPC accelerator.

You might think "alright, it's not a fancy TITAN X Pascal cooler mod, but it could be a P100 with a cooler mod," until you notice the power connectors - it has two power inputs on top of the card (where they're typically found on NVIDIA's consumer graphics cards), and not on the rear portion of the card (where the P100 has them, and where they're typically found on Tesla and Quadro series products). Whoever pulled this off has done an excellent job either way - of scoring a potential TITAN Volta sample, or of modding some other card to look very plausibly like a TITAN Volta.
Sources: Reddit, VideoCardz

NVIDIA's Volta Reportedly Poised for Anticipated, Early Q3 2017 Launch

According to a report from Chinese website MyDrivers, NVIDIA is looking to spruce things up in its line-up with a much earlier than expected Q3 Volta launch. Remember that Volta was expected, according to NVIDIA's own road-maps, to launch around early 2018. The report indicates that NVIDIA's Volta products - apparently to be marketed as the GeForce 20-series - will see an early launch due to market demands, and NVIDIA's intention to further increase pricing of its products through a new-generation launch.

These stand, for now, as only rumors (and not the first time they've surfaced at that), but paint a pretty interesting picture, nonetheless. As with Intel and its Coffee Lake series, pulling a product launch forward has consequences: production, logistics, infrastructure, product roadmaps, and stock of existing previous-generation products must all be taken into account. And with NVIDIA just recently having introduced its performance champions, the GTX 1080 Ti and Titan Xp graphics cards, all of this seems a trigger pull too early - especially when taking into account the competition landscape in high-performance graphics, which is akin to a single green-colored banner poised atop the Himalayas. And NVIDIA must not forget that AMD could pull a black swan out of its engineering department with Vega, like it did with its Ryzen series of CPUs.

NVIDIA, Microsoft Launch Industry-Standard Hyperscale GPU Accelerator

NVIDIA with Microsoft today unveiled blueprints for a new hyperscale GPU accelerator to drive AI cloud computing. Providing hyperscale data centers with a fast, flexible path for AI, the new HGX-1 hyperscale GPU accelerator is an open-source design released in conjunction with Microsoft's Project Olympus.

HGX-1 does for cloud-based AI workloads what ATX -- Advanced Technology eXtended -- did for PC motherboards when it was introduced more than two decades ago. It establishes an industry standard that can be rapidly and efficiently embraced to help meet surging market demand. The new architecture is designed to meet the exploding demand for AI computing in the cloud -- in fields such as autonomous driving, personalized healthcare, superhuman voice recognition, data and video analytics, and molecular simulations.

NVIDIA Announces DGX SaturnV: The World's Most Efficient Supercomputer

This week NVIDIA announced its latest addition to the HPC landscape, the DGX SaturnV. Destined for the likes of universities and companies with a need for deep learning capabilities, the DGX SaturnV sets a new benchmark for energy efficiency in High Performance Computing. While not managing the title of the fastest supercomputer this year, the SaturnV takes a respectable placing of 28th in the TOP500 list, while promising much lower running costs for the performance on tap.

Capable of delivering 9.46 GFLOPS of computational speed per Watt of energy consumed, it bests last year's best effort of 6.67 GFLOPS/W by 42%. The SaturnV comprises 125 DGX-1 deep learning systems, and each DGX-1 contains no fewer than eight Tesla P100 cards. Where a single GTX 1080 can churn out 138 GFLOPS of FP16 calculations, a single Tesla P100 can deliver a massive 21.2 TFLOPS. The singular DGX-1 units are already in the field, including being used by NVIDIA themselves.
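The efficiency gain and the total accelerator count follow directly from the figures quoted above. A minimal sketch of that arithmetic, using only the article's numbers (the helper name `relative_gain_percent` is illustrative):

```python
# Figures quoted in the article.
SATURNV_GFLOPS_PER_WATT = 9.46
PREVIOUS_BEST_GFLOPS_PER_WATT = 6.67
DGX1_SYSTEMS = 125
P100_CARDS_PER_DGX1 = 8


def relative_gain_percent(new: float, old: float) -> float:
    """Return how much larger `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0


efficiency_gain = relative_gain_percent(
    SATURNV_GFLOPS_PER_WATT, PREVIOUS_BEST_GFLOPS_PER_WATT
)
total_p100_cards = DGX1_SYSTEMS * P100_CARDS_PER_DGX1

print(f"Efficiency gain over last year's best: {efficiency_gain:.1f}%")
print(f"Tesla P100 cards in the SaturnV: {total_p100_cards}")
```

The gain works out to just under 42%, which matches the rounded figure in the text, and the 125 systems hold 1,000 Tesla P100 cards between them.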

NVIDIA Tesla P100 Available on Google Cloud Platform

NVIDIA announced that its flagship GPGPU accelerator, the Tesla P100, will be available through Google Cloud Platform. The company's Tesla K80 accelerator will also be offered. The Google Cloud Platform allows customers to perform specific computing tasks at a significantly lower cost than having to rent hardware in-situ or having to buy it, by offloading computing tasks to offsite data-centers. IT professionals can build and deploy servers, HPC farms, or even supercomputers, of all shapes and sizes within hours of placing an order online with Google.

The Tesla P100 is a GPGPU with the most powerful GPU in existence - the NVIDIA GP100 "Pascal," featuring 3,584 CUDA cores, up to 16 GB of HBM2 memory, and NVLink high-bandwidth interconnect support. The other high-end GPU accelerators on offer by Google are the Tesla K80, based on a pair of GK210 "Kepler" GPUs, and the AMD FirePro S9300 X2, based on a pair of "Fiji" GPUs.

AMD Radeon Technology Will Be Available on Google Cloud Platform in 2017

At SC16, AMD announced that Radeon GPU technology will be available to Google Cloud Platform users worldwide. Starting in 2017, Google will use AMD's fastest available single-precision dual GPU compute accelerators, Radeon-based AMD FirePro S9300 x2 Server GPUs, to help accelerate Google Compute Engine and Google Cloud Machine Learning services. AMD FirePro S9300 x2 GPUs can handle highly parallel calculations, including complex medical and financial simulations, seismic and subsurface exploration, machine learning, video rendering and transcoding, and scientific analysis. Google Cloud Platform will make the AMD GPU resources available for all their users around the world.

"Graphics processors represent the best combination of performance and programmability for existing and emerging big data applications," said Raja Koduri, senior vice president and chief architect, Radeon Technologies Group, AMD. "The adoption of AMD GPU technology in Google Cloud Platform is a validation of the progress AMD has made in GPU hardware and our Radeon Open Compute Platform, which is the only fully open source hyperscale GPU compute platform in the world today. We expect that our momentum in GPU computing will continue to accelerate with future hardware and software releases and advances in the ecosystem of middleware and libraries."

TYAN Displays HPC Platforms for Enterprises and Data Centers

TYAN, an industry-leading server platform design manufacturer and subsidiary of MiTAC Computing Technology Corporation, is showcasing a wide range of HPC server platforms optimized for enterprise, storage and data center applications at SC16 this week in Salt Lake City's Salt Palace Convention Center.

TYAN's comprehensive HPC platforms span a wide range of hardware specifications. The Intel Xeon E7-based, 4U quad-socket FT76-B7922 offers a memory capacity of 6TB and supports up to 4x Intel Xeon Phi coprocessors for the most demanding HPC users; the Intel Xeon E5-based, 4U dual-socket FT77C-B7079 supports up to 8x Intel Xeon Phi coprocessors for highly parallelized application deployment; the 2U dual-socket TA80-B7071 supports up to 4x Intel Xeon Phi coprocessors for large-scale production deployment in various high performance computing segments; and the 1U dual-socket GA80-B7081 supports up to 3x Intel Xeon Phi coprocessors for ISVs, universities, and small businesses looking for parallelized application development or proof of concept solution deployment.

AMD Announces ROCm Initiative - High-Performance Computing & Open-Standards

AMD on Monday announced their ROCm initiative. Introduced by AMD's Gregory Stoner, Senior Director for the Radeon Open Compute Initiative, ROCm stands for Radeon Open Compute platforM. This open-standard, high-performance, Hyper Scale computing platform stands on the shoulders of AMD's technological expertise and accomplishments, with cards like the Radeon R9 Nano achieving as much as 46 GFLOPS of peak single-precision performance per Watt.

The natural evolution of AMD's Boltzmann Initiative, ROCm grants developers and coders a platform which allows the leveraging of AMD's GPU solutions through a variety of popular programming languages, such as OpenCL, CUDA, ISO C++ and Python. AMD knows that the hardware is but a single piece in an ecosystem, and that having it without any supporting software is a recipe for failure. As such, ROCm stands as AMD's push towards HPC, leveraging both its hardware and its support for open standards and the conversion of otherwise proprietary code.

NVIDIA Launches Maxed-out GP102 Based Quadro P6000

Late last week, NVIDIA announced the TITAN X Pascal, its fastest consumer graphics offering targeted at gamers and PC enthusiasts. The reign of TITAN X Pascal being the fastest single-GPU graphics card could be short-lived, as NVIDIA announced a Quadro product based on the same "GP102" silicon, which maxes out its on-die resources. The new Quadro P6000, announced at SIGGRAPH alongside the GP104-based Quadro P5000, features all 3,840 CUDA cores physically present on the chip.

Besides 3,840 CUDA cores, the P6000 features a maximum FP32 (single-precision floating point) performance of up to 12 TFLOP/s. The card also features 24 GB of GDDR5X memory, across the chip's 384-bit wide memory interface. The Quadro P5000, on the other hand, features 2,560 CUDA cores, up to 8.9 TFLOP/s FP32 performance, and 16 GB of GDDR5X memory across a 256-bit wide memory interface. It's interesting to note that neither card features full FP64 (double-precision) machinery, which is cleverly relegated to NVIDIA's HPC product line, the Tesla P-series.

NVIDIA Accelerates Volta to May 2017?

Following the surprise TITAN X Pascal launch slated for 2nd August, it looks like NVIDIA's product development cycle is running on steroids, with reports emerging of the company accelerating its next-generation "Volta" architecture debut to May 2017, along the sidelines of next year's GTC. The architecture was originally scheduled to make its debut in 2018.

Much like "Pascal," the "Volta" architecture could first debut with HPC products, before moving on to the consumer graphics segment. NVIDIA could also retain the 16 nm FinFET+ process at TSMC for Volta. Stacked on-package memory such as HBM2 could be more readily available by 2017, and could hit sizable volumes towards the end of the year, making it ripe for implementation in high-volume consumer products.

Source: WCCFTech

NVIDIA Announces a PCI-Express Variant of its Tesla P100 HPC Accelerator

NVIDIA announced a PCI-Express add-on card variant of its Tesla P100 HPC accelerator, at the 2016 International Supercomputing Conference, held in Frankfurt, Germany. The card is about 30 cm long, 2-slot thick, and of standard height, and is designed for PCIe multi-slot servers. The company had introduced the Tesla P100 earlier this year in April, with a dense mezzanine form-factor variant for servers with NVLink.

The PCIe variant of the P100 offers slightly lower performance than the NVLink variant, because of lower clock speeds, although the core-configuration of the GP100 silicon remains unchanged. It offers FP64 (double-precision floating-point) performance of 4.70 TFLOP/s, FP32 (single-precision) performance of 9.30 TFLOP/s, and FP16 performance of 18.7 TFLOP/s, compared to the NVLink variant's 5.3 TFLOP/s, 10.6 TFLOP/s, and 21 TFLOP/s, respectively. The card comes in two sub-variants based on memory: a 16 GB variant with 720 GB/s memory bandwidth and 4 MB L2 cache, and a 12 GB variant with 548 GB/s and 3 MB L2 cache. Both sub-variants feature 3,584 CUDA cores based on the "Pascal" architecture, and a core clock speed of 1300 MHz.
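The quoted throughput figures can be roughly reproduced from the core count and clock speed. The sketch below is an estimate, not NVIDIA's own method: it assumes each CUDA core retires one fused multiply-add (counted as two floating-point operations) per clock, that FP64 runs at half the FP32 rate, and that FP16 runs at twice the FP32 rate on GP100. None of these ratios are stated in the article, and `peak_tflops` is an illustrative helper.

```python
# Figures quoted in the article for the Tesla P100 PCIe.
CUDA_CORES = 3584
CORE_CLOCK_HZ = 1300e6

# Assumption: one fused multiply-add per core per clock = 2 FLOPs.
FLOPS_PER_CORE_PER_CLOCK = 2


def peak_tflops(cores: int, clock_hz: float,
                flops_per_clock: int = FLOPS_PER_CORE_PER_CLOCK) -> float:
    """Estimate peak throughput in TFLOP/s from core count and clock."""
    return cores * clock_hz * flops_per_clock / 1e12


fp32_tflops = peak_tflops(CUDA_CORES, CORE_CLOCK_HZ)
fp64_tflops = fp32_tflops / 2  # assumption: FP64 at half the FP32 rate
fp16_tflops = fp32_tflops * 2  # assumption: FP16 at twice the FP32 rate

print(f"FP32: {fp32_tflops:.2f} TFLOP/s")
print(f"FP64: {fp64_tflops:.2f} TFLOP/s")
print(f"FP16: {fp16_tflops:.2f} TFLOP/s")
```

Under these assumptions the estimates land within about one percent of the quoted 9.30, 4.70, and 18.7 TFLOP/s; the small differences are presumably down to rounding in the published clock speed.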

NVIDIA "Pascal" GP100 Silicon Detailed

The upcoming "Pascal" GPU architecture from NVIDIA is shaping up to be a pixel-crunching monstrosity. It was introduced as more of a number-cruncher in its Tesla P100 unveil at GTC 2016, and we got our hands on the block diagram of the "GP100" silicon which drives it. To begin with, the GP100 is a multi-chip module, much like AMD's "Fiji," consisting of a large GPU die, four memory-stacks, and a silicon wafer (interposer) acting as substrate for the GPU and memory stacks, letting NVIDIA drive microscopic wires between the two. The GP100 features a 4096-bit wide HBM2 memory interface, with typical memory bandwidths of up to 1 TB/s. On the P100, the memory ticks at 720 GB/s.

At its topmost level of hierarchy, the GP100 is structured much like other NVIDIA GPUs, with the exception of two key interfaces - bus and memory. A PCI-Express gen 3.0 x16 host interface connects the GPU to your system; the GigaThread Engine distributes workload between six graphics processing clusters (GPCs). Eight memory controllers make up the 4096-bit wide HBM2 memory interface, and a new "High-speed Hub" component wires out four NVLink ports. At this point it's not known if each port has a throughput of 80 GB/s (per-direction), or all four ports put together.

AMD FirePro S9300 x2 Server GPU Helps Create Largest Map of the Universe

AMD today announced that researchers at the Canadian Hydrogen Intensity Mapping Experiment (CHIME) will harness the AMD FirePro S9300 x2 Server GPU, the world's fastest single-precision GPU accelerator, to analyze extraordinary amounts of data to help create a new, very detailed 3D map of the largest volume of the Universe ever observed. Rather than using traditional dish-shaped telescopes, CHIME consists of four 100-metre-long cylindrical reflectors which cover an area larger than five professional hockey rinks and gathers signals for the critical computational analyses supplied by the AMD FirePro S9300 x2 GPU cluster.

The CHIME project was created to investigate the discovery that the expansion of the Universe is speeding up rather than slowing down. Using consumer technologies similar to those found in common radio receivers, the telescope collects radio waves that have travelled through space for up to 11 billion years and feeds them into a massive supercomputer powered by a series of AMD FirePro S9300 x2 GPUs. The intense number crunching required to map the Universe's expansion in this way was previously cost-prohibitive, but is now being enabled by AMD FirePro GPUs. The anticipated results will help create a highly-detailed map showing the intensity of the hydrogen radiation from billions of galaxies, which will help scientists understand the accelerating expansion of the Universe.

SK Hynix to Ship 4GB HBM2 Stacks by Q3-2016

Korean DRAM and NAND flash giant SK Hynix will be ready to ship its 4 GB stacked second generation high-bandwidth memory (HBM2) chips from Q3, 2016. These packages will be made up of four 1 GB dies, with a bandwidth-per-pin of 1 Gbps, 1.6 Gbps, and 2 Gbps, working out to per-stack bandwidths of 128 GB/s, 204 GB/s, and 256 GB/s, respectively.
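The per-stack bandwidths follow from the per-pin data rates once the interface width is known. The sketch below assumes the 1024-bit per-stack interface of HBM2, which this article does not state; the function name is illustrative.

```python
# Assumption: each HBM2 stack exposes a 1024-bit data interface.
INTERFACE_WIDTH_BITS = 1024

# Per-pin data rates quoted in the article, in Gbps.
PIN_RATES_GBPS = (1.0, 1.6, 2.0)


def stack_bandwidth_gb_per_s(pin_rate_gbps: float,
                             width_bits: int = INTERFACE_WIDTH_BITS) -> float:
    """Per-stack bandwidth in GB/s: pins x rate per pin, divided by 8 bits."""
    return pin_rate_gbps * width_bits / 8


for rate in PIN_RATES_GBPS:
    bandwidth = stack_bandwidth_gb_per_s(rate)
    print(f"{rate} Gbps per pin -> {bandwidth:.1f} GB/s per stack")
```

The middle speed grade comes out at 204.8 GB/s, which the article rounds down to 204 GB/s.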

These chips will target applications such as graphics cards, network infrastructure, HPC, and servers. The company is also designing 8 GB stacks, made up of eight 1 GB dies, targeted at HPC and server applications. It is also offering cost-effective 2 GB, 2-die stacks for graphics cards, which could prove particularly important for the standard's competition against GDDR5X in mid-range and performance-segment graphics cards.

Source: Golem.de

AMD GPUOpen Fuels Seismic Supercomputing Efforts for Geoscience Leader CGG

AMD today announced that CGG, a pioneering global geophysical services and equipment company, has deployed AMD FirePro S9150 server GPUs to accelerate its geoscience oil and gas research efforts, harnessing more than 1 PetaFLOPS of GPU processing power. Employing AMD's HPC GPU Computing software tools available on GPUOpen.com, CGG rapidly converted its in-house NVIDIA CUDA code to OpenCL for seismic data processing running on an AMD FirePro S9150 GPU production cluster, enabling fast, cost-effective GPU-powered research.

"The installation of the large AMD GPU production cluster at CGG is a great example of how AMD's technology prowess in both HPC GPU hardware and in open source software tools combined to deliver incredible results," said Brian Reeves, senior director, product management, AMD Professional Graphics. "Energy research is a demanding and time-intensive task that stands to realize significant benefits from the use of GPU computation. GPUOpen software tools coupled with AMD FirePro S-series hardware enables an efficient and cost-effective solution for rapid data processing that can drive tremendous competitive advantage."

Samsung Begins Mass-Producing 4-Gigabyte HBM2 Memory Stacks

Samsung Electronics Co., Ltd., announced today that it has begun mass producing the industry's first 4-gigabyte (GB) DRAM package based on the second-generation High Bandwidth Memory (HBM2) interface, for use in high performance computing (HPC), advanced graphics and network systems, as well as enterprise servers. Samsung's new HBM solution will offer unprecedented DRAM performance - more than seven times faster than the current DRAM performance limit, allowing faster responsiveness for high-end computing tasks including parallel computing, graphics rendering and machine learning.

"By mass producing next-generation HBM2 DRAM, we can contribute much more to the rapid adoption of next-generation HPC systems by global IT companies," said Sewon Chun, senior vice president, Memory Marketing, Samsung Electronics. "Also, in using our 3D memory technology here, we can more proactively cope with the multifaceted needs of global IT, while at the same time strengthening the foundation for future growth of the DRAM market."

The newly introduced 4GB HBM2 DRAM, which uses Samsung's most efficient 20-nanometer process technology and advanced HBM chip design, satisfies the need for high performance, energy efficiency, reliability and small dimensions making it well suited for next-generation HPC systems and graphics cards.

Seagate Engineers Tiered Archive System for HPC Storage

Seagate Technology plc. today introduced ClusterStor A200, a first-of-its-kind tiered archive storage system for high performance computing (HPC). The A200 helps reduce storage and operational costs by up to 50 percent compared to tier-one storage platforms.

The A200 was designed to complement the rest of Seagate's ClusterStor family of scale-out storage systems. It allows customers to non-disruptively migrate designated data off of the performance-optimized, primary storage tiers while keeping it online for fast retrieval. This avoids a common problem in shared HPC environments where the organization is forced to choose between having all of the data available to make the best analysis versus the time required to retrieve data from tape. Performance of the primary storage is often improved by migrating data and freeing up space for more efficient data layout. The pre-configured ClusterStor A200 solution includes an automatic policy-driven hierarchical storage management (HSM) system and near limitless scale-out capacity.

NVIDIA GPUs to Accelerate Microsoft Azure

NVIDIA today announced that Microsoft will offer NVIDIA GPU-enabled professional graphics applications and accelerated computing capabilities to customers worldwide through its cloud platform, Microsoft Azure. Deploying the latest version of NVIDIA GRID in its new N-Series virtual machine offering, Azure is the first cloud computing platform to provide NVIDIA GRID 2.0 virtualized graphics for enterprise customers.

For the first time, businesses will have the ability to deploy NVIDIA Quadro-grade professional graphics applications and accelerated computing on-premises, in the cloud through Azure, or via a hybrid of the two using both Windows and Linux virtual machines. Azure will also offer customers supercomputing-class performance, with the addition of the NVIDIA Tesla Accelerated Computing Platform's flagship Tesla K80 GPU accelerators, for the most computationally demanding data center and high performance computing (HPC) applications.

AMD and Dell Support Bioinformatics Studies at University of Warsaw in Poland

AMD today unveiled innovation in heterogeneous high performance computing (HPC) by delivering more than 1.5 petaFLOPS of AMD FirePro S9150 server GPU performance for the Next Generation Sequencing Centre (NGSC) at the Centre of New Technologies, University of Warsaw in support of bioinformatics research related to next generation sequencing (NGS) studies. The new ORION cluster features 150 Dell PowerEdge R730 servers with two AMD FirePro S9150 server GPUs, for a total GPU peak of 1.52 petaFLOPS single precision and 0.76 petaFLOPS double precision performance. The energy-efficient cluster enables high speed and efficient calculations for genomic data, applicable to a range of genomics and bioinformatics studies, using a fast and power efficient OpenCL implementation for research applications.
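The cluster totals above imply a per-GPU peak that can be backed out with simple division. This sketch reads the text as two FirePro S9150 GPUs in each of the 150 servers, which is an interpretation rather than something spelled out; the variable names are illustrative.

```python
# Figures quoted in the article for the ORION cluster.
SERVERS = 150
GPUS_PER_SERVER = 2  # assumption: two S9150 GPUs in every server
TOTAL_SP_PFLOPS = 1.52
TOTAL_DP_PFLOPS = 0.76

total_gpus = SERVERS * GPUS_PER_SERVER

# Convert petaFLOPS to teraFLOPS (x1000) and share across all GPUs.
sp_tflops_per_gpu = TOTAL_SP_PFLOPS * 1000 / total_gpus
dp_tflops_per_gpu = TOTAL_DP_PFLOPS * 1000 / total_gpus

print(f"GPUs in cluster: {total_gpus}")
print(f"Implied single-precision peak per GPU: {sp_tflops_per_gpu:.2f} TFLOPS")
print(f"Implied double-precision peak per GPU: {dp_tflops_per_gpu:.2f} TFLOPS")
```

On that reading, the 300 GPUs each contribute roughly 5.07 TFLOPS single precision and 2.53 TFLOPS double precision, a 2:1 ratio between the two.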

"We're committed to building our HPC leadership position in the industry as a foremost provider of computing applications, tools and technologies," said Sean Burke, corporate vice-president and general manager, AMD Professional Graphics. "This installation reaffirms AMD's leading role in HPC with the implementation of the AMD FirePro S9150 server GPUs in this 1.5 petaFLOPS supercomputer cluster. AMD and Dell are enabling OpenCL applications for critical science research usage for this cluster. AMD is proud to collaborate with Dell and NGSC to support such important life science and computer science research."

IBM, NVIDIA and Mellanox Launch Design Center for Big Data and HPC

IBM, in collaboration with NVIDIA and Mellanox, today announced the establishment of a POWER Acceleration and Design Center in Montpellier, France to advance the development of data-intensive research, industrial, and commercial applications. Born out of the collaborative spirit fostered by the OpenPOWER Foundation - a community co-founded in part by IBM, NVIDIA and Mellanox supporting open development on top of the POWER architecture - the new Center provides commercial and open-source software developers with technical assistance to enable them to develop high performance computing (HPC) applications.

Technical experts from IBM, NVIDIA and Mellanox will help developers take advantage of OpenPOWER systems leveraging IBM's open and licensable POWER architecture with the NVIDIA Tesla Accelerated Computing Platform and Mellanox InfiniBand networking solutions. These are the class of systems developed collaboratively with the U.S. Department of Energy for the next generation Sierra and Summit supercomputers and to be used by the United Kingdom's Science and Technology Facilities Council's Hartree Centre for big data research.

Seagate Announces ClusterStor Hadoop Workflow Accelerator

Seagate Technology plc, a world leader in storage solutions, today announced availability of the ClusterStor Hadoop Workflow Accelerator, a new solution providing the tools, services, and support for High Performance Computing (HPC) customers who need the best performing storage systems for Big Data Analytics. The Hadoop Workflow Accelerator is a set of Hadoop optimization tools, services and support that leverages and enhances the performance of ClusterStor, the market leading scale-out storage system, designed for Big Data analysis.

Computationally intensive High Performance Data Analytics (HPDA) environments will benefit from significant reductions in data transfer time with the Hadoop Workflow Accelerator. This solution also includes the Hadoop on Lustre Connector, which allows both Hadoop and HPC Lustre clusters to use exactly the same data without having to move the data between file systems or storage devices.

Intel Reveals Details for Future HPC System Building Blocks

Intel Corporation today announced several new and enhanced technologies bolstering its leadership in high-performance computing (HPC). These include disclosure of the future generation Intel Xeon Phi processor, code-named Knights Hill, and new architectural and performance details for Intel Omni-Path Architecture, a new high-speed interconnect technology optimized for HPC deployments.

Intel also announced new software releases and collaborative efforts designed to make it easier for the HPC community to extract the full performance potential from current and future Intel industry-standard hardware. Together, these new HPC building blocks and industry collaborations will help to address the dual challenges of extreme scalability and mainstream use of HPC while providing the foundation for a cost-effective path to exascale computing.