News Posts matching #HPC


New NetCAT Vulnerability Exploits DDIO on Intel Xeon Processors to Steal Data

DDIO, or Direct Data I/O, is an Intel-exclusive performance enhancement that allows NICs to directly access a processor's L3 cache, completely bypassing a server's RAM, to increase NIC performance and lower latencies. Cybersecurity researchers from the Vrije Universiteit Amsterdam and ETH Zurich, in a research paper published on Tuesday, have discovered a critical vulnerability in DDIO that allows a compromised server on a network to steal data from every other machine on its local network. This includes the ability to obtain keystrokes and other sensitive data flowing through the memory of vulnerable servers. The effect is compounded in data centers that have not just DDIO but also RDMA (remote direct memory access) enabled, in which case a single server can compromise an entire network. RDMA is a key ingredient in shoring up performance in HPC and supercomputing environments. Intel, in its initial response, asked customers to disable DDIO and RDMA on machines with access to untrusted networks while it works on patches.

The NetCAT vulnerability spells big trouble for web hosting providers. If a hacker leases a server in a data center with RDMA and DDIO enabled, they can compromise other customers' servers and steal their data. "While NetCAT is powerful even with only minimal assumptions, we believe that we have merely scratched the surface of possibilities for network-based cache attacks, and we expect similar attacks based on NetCAT in the future," the paper reads. "We hope that our efforts caution processor vendors against exposing microarchitectural elements to peripherals without a thorough security design to prevent abuse." The team also published a video explaining the nature of NetCAT. AMD EPYC processors don't support DDIO.
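
For illustration only, here is a minimal sketch of why packet timing alone is enough to leak keystrokes: in an interactive SSH session each keystroke produces one network packet, so an attacker who can observe packet arrivals (which NetCAT does indirectly, through DDIO cache activity) recovers the victim's typing rhythm. The timestamps and helper function below are hypothetical, not taken from the researchers' code.

```python
# Illustrative sketch only - not the NetCAT exploit itself. NetCAT observes
# packet arrivals via DDIO cache side effects; this just shows why the
# resulting inter-arrival times are sensitive. Timestamps are hypothetical.

def inter_arrival_ms(timestamps_ms):
    """Gaps between consecutive packet observations, in milliseconds."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]

# In interactive SSH, one keystroke = one packet, so these gaps mirror the
# victim's typing rhythm, which keystroke-timing models can map back to
# likely keys and words.
observed = [0.0, 105.3, 198.7, 331.2, 402.9]  # hypothetical arrival times
print([round(g, 1) for g in inter_arrival_ms(observed)])
# -> [105.3, 93.4, 132.5, 71.7]
```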

GIGABYTE Smashes 11 World Records with New AMD EPYC 7002 Processors

GIGABYTE, a leading server systems builder that recently released a total of 17 new AMD EPYC 7002 Series "Rome" server platforms simultaneously with AMD's official launch of its next-generation CPU, is proud to announce that our new systems have already broken 11 different SPEC benchmark world records. These new world records have been achieved not only against results from systems based on all alternative processors, but also against competing vendor solutions using the same 2nd Generation AMD EPYC 7002 Series "Rome" processor platform, illustrating that GIGABYTE's system design and engineering is perfectly optimized to deliver the maximum performance possible from the 2nd Generation AMD EPYC.

2nd Gen AMD EPYC Processors Set New Standard for the Modern Datacenter

At a launch event today, AMD was joined by an expansive ecosystem of datacenter partners and customers to introduce the 2nd Generation AMD EPYC family of processors, which delivers performance leadership across a broad range of enterprise, cloud and high-performance computing (HPC) workloads. 2nd Gen AMD EPYC processors feature up to 64 "Zen 2" cores on leading-edge 7 nm process technology to deliver record-setting performance while helping reduce total cost of ownership (TCO) by up to 50% across numerous workloads. At the event, Google and Twitter announced new 2nd Gen AMD EPYC processor deployments, and HPE and Lenovo announced immediate availability of new platforms.

"Today, we set a new standard for the modern datacenter with the launch of our 2nd Gen AMD EPYC processors that deliver record-setting performance and significantly lower total cost of ownership across a broad set of workloads," said Dr. Lisa Su, president and CEO, AMD. "Adoption of our new leadership server processors is accelerating with multiple new enterprise, cloud and HPC customers choosing EPYC processors to meet their most demanding server computing needs."

NVIDIA Brings CUDA to ARM, Enabling New Path to Exascale Supercomputing

NVIDIA today announced its support for Arm CPUs, providing the high performance computing industry a new path to build extremely energy-efficient, AI-enabled exascale supercomputers. NVIDIA is making available to the Arm ecosystem its full stack of AI and HPC software - which accelerates more than 600 HPC applications and all AI frameworks - by year's end. The stack includes all NVIDIA CUDA-X AI and HPC libraries, GPU-accelerated AI frameworks and software development tools such as PGI compilers with OpenACC support and profilers. Once stack optimization is complete, NVIDIA will accelerate all major CPU architectures, including x86, POWER and Arm.

"Supercomputers are the essential instruments of scientific discovery, and achieving exascale supercomputing will dramatically expand the frontier of human knowledge," said Jensen Huang, founder and CEO of NVIDIA. "As traditional compute scaling ends, power will limit all supercomputers. The combination of NVIDIA's CUDA-accelerated computing and Arm's energy-efficient CPU architecture will give the HPC community a boost to exascale."

The EPI Announces Successful First Steps Towards a Made-in-Europe High-performance Microprocessor

The European Processor Initiative (EPI), a crucial element of the European exascale strategy, has delivered its first architectural designs to the European Commission and welcomed new partners. Almost six months in, the project that kicked off last December has thus successfully executed its initial milestones. The project, which will be the cornerstone of the EU's strategic plans in HPC, initially brought together 23 partners from 10 European countries, but has now welcomed three more strong additions to the EPI family. The EPI consortium aims to bring a low-power microprocessor to the market and ensure that the key competences for high-end chip design remain in Europe. The European Union's Horizon 2020 program funds the project under a special Framework Partnership Agreement. The initial stage is a three-year Specific Grant Agreement, which lasts until November 2021.

Samsung Successfully Completes 5nm EUV Development

Samsung Electronics Co., Ltd., a world leader in advanced semiconductor technology, today announced that development of its 5-nanometer (nm) FinFET process technology is complete and the node is now ready for customer samples. By adding another cutting-edge node to its extreme ultraviolet (EUV)-based process offerings, Samsung is proving once again its leadership in the advanced foundry market.

Compared to 7 nm, Samsung's 5 nm FinFET process technology provides up to a 25 percent increase in logic area efficiency with 20 percent lower power consumption or 10 percent higher performance, the result of process improvements that enable a more innovative standard cell architecture. In addition to the power, performance, and area (PPA) improvements from 7 nm to 5 nm, customers can fully leverage Samsung's highly sophisticated EUV technology. Like its predecessor, 5 nm uses EUV lithography in metal layer patterning, reducing mask layers while providing better fidelity.

Intel Announces Broadest Product Portfolio for Moving, Storing, and Processing Data

Intel Tuesday unveiled a new portfolio of data-centric solutions consisting of 2nd-Generation Intel Xeon Scalable processors, Intel Optane DC memory and storage solutions, and software and platform technologies optimized to help its customers extract more value from their data. Intel's latest data center solutions target a wide range of use cases within cloud computing, network infrastructure and intelligent edge applications, and support high-growth workloads, including AI and 5G.

Building on more than 20 years of world-class data center platforms and deep customer collaboration, Intel's data center solutions target server, network, storage, internet of things (IoT) applications and workstations. The portfolio of products advances Intel's data-centric strategy to pursue a massive $300 billion data-driven market opportunity.

Without Silicon, Intel Scores First Exascale Computer Design Win for Xe Graphics - AURORA Supercomputer

This is an interesting piece of tech news for sure, in that Intel has already scored a pretty massive design win for not one, but two upcoming products. Intel's "Future Xeon Scalable Processors" and the company's "Xe Compute Architecture" have been tapped by the U.S. Department of Energy for incorporation into the new AURORA Supercomputer - one that will deliver exascale performance. AURORA is to be developed in a partnership between Intel and Cray, using the latter's Shasta systems and its "Slingshot" networking fabric. But these are not the only Intel elements in the supercomputer design: Intel's Optane DC persistent memory will also be employed (in an as-yet-unavailable version, no less), making this a full win across the prow for Intel.

AMD Says Not to Count on Exotic Materials for CPUs in the Next Ten Years, Silicon Is Still Computing's Best Friend

AMD senior vice president of the datacenter group Forrest Norrod, speaking at the Rice Oil and Gas HPC conference, said that while graphene does have incredible promise for the world of computing, it will likely take some ten years before such exotic materials are actually taken advantage of. As Norrod puts it, silicon still has a pretty straightforward - if increasingly complex - path down to 3-nanometer densities. And according to him, at the rate at which manufacturers are able to scale down their production nodes, the average time between node transitions stands at some four or five years - which places the jumps to 5 nm and then 3 nm roughly ten years out, with Norrod expecting the manufacturing process to go through those two additional node shrinks in that span.

Of course, graphene is being hailed as the leading candidate for taking over silicon's place at the heart of our more complex, high-performance electronics, due in part to its high conductivity independent of temperature variation and its incredible switching speed - it has been found capable of operating at terahertz switching frequencies. It's a 2D material, however, which means that practical implementations will have to deposit sheets of graphene onto some other supporting material.

Advantech Unveils New Lineup of SQRAM DDR4 32GB Unbuffered Memory for HPC

Advantech, a leading global flash storage and memory solutions provider in the embedded market, announces the industry's most comprehensive lineup of 32 GB DDR4 unbuffered DIMM memory. Advantech SQRAM offers single 32 GB DRAM modules in various DIMM types including SODIMM, UDIMM, ECC DIMM, and extremely robust Rugged DIMM with guaranteed wide-temperature operation for high-performance computing in applications such as networking and military systems.

As the global IoT market gradually embraces big data and edge computing, demand for high-volume, high-performance data processing is increasing. SQRAM 32 GB unbuffered DIMM memory uses Samsung's 16 Gb 2666 MT/s IC chips to meet high reliability requirements in mission-critical applications. The SQRAM 32 GB wide-temperature (-40~85 °C) Rugged DIMM offers extreme vibration resistance, plus ECC checking to ensure data accuracy.
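
As a quick sanity check of the module math (the chip layout is assumed for illustration, not taken from Advantech's announcement), a 32 GB module built from 16 Gb DRAM chips needs 16 of them:

```python
# Rough sanity check of the module composition - layout assumed for
# illustration, not an Advantech-published figure.
chip_gbit = 16                 # Samsung IC density, in gigabits
chip_gbyte = chip_gbit / 8     # = 2 GB per chip
chips_needed = 32 / chip_gbyte
print(int(chips_needed))       # 16 chips, e.g. 8 per side, double-sided
```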

Stuttgart-based HLRS to Build a Supercomputer with 10,000 64-core Zen 2 Processors

Höchstleistungsrechenzentrum (HLRS, German for High-Performance Computing Center), based in Stuttgart, Germany, is building a new cluster supercomputer powered by 10,000 AMD Zen 2 "Rome" 64-core processors, making up 640,000 cores in total. Called "Hawk," the supercomputer will be HLRS' flagship system, and will open its doors to business in 2019. The slide-deck for Hawk makes a fascinating disclosure about the processors it's based on.

Apparently, each of the 64-core "Rome" EPYC processors has a guaranteed clock speed of 2.35 GHz, meaning that even at maximum load (with all cores 100% utilized), the processor sustains 2.35 GHz. This is important because the supercomputer's advertised throughput is calculated on this basis, and clients draw up SLAs based on throughput. The advertised peak throughput for the whole system is 24.06 petaFLOP/s, although HLRS is yet to put out nominal/guaranteed performance numbers (which it will only do after first-hand testing). The system features 665 TB of RAM, and 26,000 TB of storage.
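
The advertised figure checks out against the guaranteed clock if each Zen 2 core retires 16 double-precision FLOP per cycle (two 256-bit FMA units) - our assumption for this back-of-the-envelope check, not an HLRS-published number:

```python
# Back-of-the-envelope check of Hawk's 24.06 petaFLOP/s advertised peak.
# The 16 DP FLOP/cycle per Zen 2 core (2x 256-bit FMA) is our assumption.
cores = 10_000 * 64            # 640,000 cores in total
guaranteed_clock_hz = 2.35e9   # all-core guaranteed clock
flop_per_cycle = 16            # assumed DP FLOP/cycle per core

peak = cores * guaranteed_clock_hz * flop_per_cycle
print(f"{peak / 1e15:.2f} petaFLOP/s")  # -> 24.06 petaFLOP/s
```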

Intel Puts Out Additional "Cascade Lake" Performance Numbers

Intel late last week put out additional real-world HPC and AI compute performance numbers for its upcoming "Cascade Lake" 2x 48-core (96 cores in total) machine, compared to an AMD EPYC 7601 2x 32-core (64 cores in total) machine. You'll recall that on November 5th, the company put out Linpack, Stream Triad, and Deep Learning Inference numbers, which are all synthetic benchmarks. In a new set of slides, the company revealed a few real-world HPC/AI application performance numbers, including MIMD Lattice Computation (MILC), Weather Research and Forecasting (WRF), OpenFOAM, NAMD scalable molecular dynamics, and YASK.

The Intel 96-core setup with a 12-channel memory interface belts out up to 1.5X performance in MILC, up to 1.6X in WRF and OpenFOAM, up to 2.1X in NAMD, and up to 3.1X in YASK, compared to an AMD EPYC 7601 2P machine. The company also put out system configuration and disclaimer slides with the usual forward-looking CYA. "Cascade Lake" will be Intel's main competitor to AMD's EPYC "Rome" 64-core processor that comes out by the end of 2018. Intel's product is a multi-chip module of two 24~28 core dies, with a 2x 6-channel DDR4 memory interface.

AMD and Xilinx Announce a New World Record for AI Inference

At today's Xilinx Developer Forum in San Jose, Calif., our CEO, Victor Peng, was joined by AMD CTO Mark Papermaster for a Guinness. But not the kind that comes in a pint - the kind that comes in a record book. The companies revealed that AMD and Xilinx have been jointly working to connect AMD EPYC CPUs and the new Xilinx Alveo line of acceleration cards for high-performance, real-time AI inference processing. To back it up, they revealed a world-record 30,000 images-per-second inference throughput!

The impressive system, which will be featured in the Alveo ecosystem zone at XDF today, leverages two AMD EPYC 7551 server CPUs with their industry-leading PCIe connectivity, along with eight of the freshly-announced Xilinx Alveo U250 acceleration cards. The inference performance is powered by Xilinx ML Suite, which allows developers to optimize and deploy accelerated inference, and which supports numerous machine learning frameworks such as TensorFlow. The benchmark was performed on GoogLeNet, a widely used convolutional neural network.
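
Dividing the aggregate record evenly across the accelerators (and ignoring any CPU-side contribution - an assumption on our part) gives a rough per-card figure:

```python
# Naive per-card share of the record, assuming the work splits evenly
# across the eight Alveo U250 cards and the CPUs contribute no inference.
total_images_per_second = 30_000
alveo_cards = 8
print(total_images_per_second / alveo_cards)  # -> 3750.0 images/s per card
```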

AMD Radeon Vega 12 and Vega 20 Listed in Ashes Of The Singularity Database

Back at Computex, AMD showed a demo of their Vega 20 graphics processor, which is produced on a refined 7-nanometer process. We also reported that the chip has a twice-as-wide memory interface, effectively doubling memory bandwidth and also maximum memory capacity. The smaller process promises improvements to power efficiency, which could let AMD run the chip at higher frequencies for more performance compared to the 14-nanometer process of existing Vega.

As indicated by AMD during Computex, the 7-nanometer Vega is a product targeted at high-performance computing (HPC) applications, with no plans to release it for gaming. As the company clarified later, the promise of "7 nanometer for gamers" rests with Navi, which follows the Vega architecture. That makes it all the more surprising to see AOTS results for a non-gaming card - my guess is that someone was curious how well it would do in gaming.

Samsung Miniaturizes the Z-SSD to the M.2 Form-factor

Samsung unveiled an M.2 variant of its flagship high-performance Z-SSD. Targeted at workstations, HPC, and AI servers, the Z-SSD lineup is built around Samsung's proprietary Z-NAND flash memory, which offers "up to 10 times" higher cell read performance than conventional 3D V-NAND (found on drives such as the 960 Pro). This performance is traded off for the lowest possible latencies and response times, which can help certain AI applications. The Z-SSD M.2 is built in the M.2-22110 (110 mm long) form-factor, features a PCI-Express gen 3.0 x4 interface, and takes advantage of the NVMe protocol.

The drive appears to feature an 8-channel controller similar to the one that drives the company's PM983 SSD, rather than the 16-channel controller found on the larger add-in-card variant of this drive. Available in capacities of 240 GB and 480 GB, the drive offers sequential transfer rates of up to 3200 MB/s reads and up to 2800 MB/s writes, with an endurance of 30 DWPD. Like its larger siblings, the Z-SSD M.2 comes with a bank of capacitors to offer power-loss protection. The company didn't reveal availability or pricing information.
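
To put 30 DWPD in perspective, here is a rough conversion to total bytes written for the 480 GB model, assuming a typical five-year warranty window (the announcement doesn't state the actual warranty term):

```python
# Rough DWPD-to-TBW conversion. The 5-year warranty window is an assumed,
# typical figure; Samsung's actual term isn't stated in the announcement.
capacity_gb = 480
dwpd = 30                      # drive writes per day
years = 5
tbw = capacity_gb * dwpd * 365 * years / 1000  # terabytes written
print(f"{tbw:,.0f} TBW")       # -> 26,280 TBW, i.e. roughly 26 PB
```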

Italian Multinational Gas, Oil Company Fires Off HPC4 Supercomputer

Eni has launched its new HPC4 supercomputer, at its Green Data Center in Ferrera Erbognone, 60 km away from Milan. HPC4 quadruples the Company's computing power and makes it the world's most powerful industrial system. HPC4 has a peak performance of 18.6 Petaflops which, combined with the supercomputing system already in operation (HPC3), increases Eni's computational peak capacity to 22.4 Petaflops.
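
From the figures quoted, the peak capacity of the existing HPC3 system can be inferred by simple subtraction:

```python
# Implied peak capacity of the older HPC3 system, from the quoted totals.
combined_pf = 22.4   # HPC3 + HPC4 combined peak, in petaFLOPS
hpc4_pf = 18.6       # HPC4 alone
print(f"HPC3 ~ {combined_pf - hpc4_pf:.1f} petaFLOPS")  # -> ~3.8 petaFLOPS
```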

According to the latest official Top 500 supercomputers list, published last November (the next list is due in June 2018), Eni's HPC4 is the only non-governmental and non-institutional system ranking among the ten most powerful systems in the world. Eni's Green Data Center has been designed as a single IT infrastructure to host the company's entire HPC architecture alongside all of its other business applications.

Samsung Now Mass Producing Industry's First 2nd-Generation 10nm Class DRAM

Samsung Electronics Co., Ltd., the world leader in advanced memory technology, announced today that it has begun mass producing the industry's first 2nd-generation 10-nanometer-class (1y-nm) 8-gigabit (Gb) DDR4 DRAM. Intended for a wide range of next-generation computing systems, the new 8 Gb DDR4 features the highest performance and energy efficiency of any 8 Gb DRAM chip, as well as the smallest dimensions.

"By developing innovative technologies in DRAM circuit design and process, we have broken through what has been a major barrier for DRAM scalability," said Gyoyoung Jin, president of Memory Business at Samsung Electronics. "Through a rapid ramp-up of the 2nd-generation 10 nm-class DRAM, we will expand our overall 10 nm-class DRAM production more aggressively, in order to accommodate strong market demand and continue to strengthen our business competitiveness."

PCI SIG Releases PCI-Express Gen 4.0 Specifications

The Peripheral Component Interconnect (PCI) special interest group (SIG) published the first official specification (version 1.0) of the PCI-Express gen 4.0 bus. The specification's previous draft, 0.9, was under technical review by members of the SIG. The new-generation PCIe comes with double the bandwidth of PCI-Express gen 3.0, reduced latency, lane margining, and I/O virtualization capabilities. With the specification published, one can expect end-user products implementing it to follow. PCI SIG has now turned its attention to the even newer PCI-Express gen 5.0 specification, which will be close to ready by mid-2019.

PCI-Express gen 4.0 comes with 16 GT/s bandwidth per lane, per direction, which is double that of gen 3.0. An M.2 NVMe drive implementing it, for example, will have 64 Gbps of interface bandwidth at its disposal. The SIG has also steered toward lowering the interconnect's latency, as HPC hardware designers are turning toward alternatives such as NVLink and InfinityFabric not primarily for the bandwidth, but for the lower latency. Lane margining is a new feature that allows hardware to maintain uniform physical-layer signal clarity across multiple PCIe devices connected to a common root complex. This is particularly important when you have multiple pieces of mission-critical hardware (such as RAID HBAs or HPC accelerators) and require uniform performance across them. The new specification also adds new I/O virtualization features that should prove useful in HPC and cloud computing.
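
For a concrete sense of the headline numbers, here is the raw-to-effective bandwidth arithmetic for that x4 NVMe example, accounting for the 128b/130b line encoding that gen 4.0 carries over from gen 3.0:

```python
# Effective bandwidth of a PCIe gen 4.0 x4 link: 16 GT/s per lane, less
# the 128b/130b line-encoding overhead carried over from gen 3.0.
gt_per_lane = 16
lanes = 4
encoding_efficiency = 128 / 130

gbps = gt_per_lane * lanes * encoding_efficiency
print(f"{gbps:.1f} Gb/s = {gbps / 8:.2f} GB/s")  # -> 63.0 Gb/s = 7.88 GB/s
```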

NEC Launches Their SX-Aurora TSUBASA Vector Engine

NEC Corporation (TSE: 6701) today announced the launch of a new high-end HPC product line, the SX-Aurora TSUBASA. This new platform drastically increases processing performance and scalability on real world applications, aiming for the traditional application areas, such as science and engineering, but also targeting the new fields of Machine Learning, Artificial Intelligence and Big Data analytics. With this new technology, NEC opens supercomputing to a wide range of new markets, in addition to the traditional HPC arena.

Utilizing cutting-edge chip integration technology, the new product features a complete multi-core vector processor in the form of a card-type Vector Engine (VE), which is developed based on NEC's high-density interface technology and efficient cooling technology. Kimihiko Fukuda, Executive Vice President, NEC Corporation, said, "The new product addresses the needs of scalar computational capability while still providing the efficiency of a vector architecture. This is accomplished through a tightly integrated complete vector system in the form of a Vector Engine Card."

Samsung Increases Production of 8 GB HBM2 Memory

Samsung Electronics Co., Ltd., the world leader in advanced memory technology, today announced that it is increasing the production volume of its 8-gigabyte (GB) High Bandwidth Memory-2 (HBM2) to meet growing market needs across a wide range of applications including artificial intelligence, HPC (high-performance computing), advanced graphics, network systems and enterprise servers.

"By increasing production of the industry's only 8GB HBM2 solution now available, we are aiming to ensure that global IT system manufacturers have sufficient supply for timely development of new and upgraded systems," said Jaesoo Han, executive vice president, Memory Sales & Marketing team at Samsung Electronics. "We will continue to deliver more advanced HBM2 line-ups, while closely cooperating with our global IT customers."

GIGABYTE Releases First Wave Of Products Based On Skylake Purley Architecture

GIGABYTE today announced its latest generation of servers based on Intel's Skylake Purley architecture. This new generation brings a wealth of new options in scalability - across compute, network and storage - to deliver solutions for any application, from the enterprise to the data center to HPC.

This server series adopts Intel's new product family - officially named the 'Intel Xeon Scalable family' - and utilizes its ability to meet the increasingly diverse requirements of the industry, from entry-level HPC to large-scale clusters. The major development in this platform is around improved features and functionality at both the host and fabric levels. These enable performance improvements - both natively on-chip and for future extensibility through compute, network and storage peripherals. In practical terms, these new CPUs will offer up to 28 cores and 48 PCIe lanes per socket.

NVIDIA Announces the Tesla V100 PCI-Express HPC Accelerator

NVIDIA formally announced the PCI-Express add-on card version of its flagship Tesla V100 HPC accelerator, based on its next-generation "Volta" GPU architecture. Based on the advanced 12 nm "GV100" silicon, the GPU is a multi-chip module with a silicon substrate and four HBM2 memory stacks. It features a total of 5,120 CUDA cores, 640 Tensor cores (specialized cores that accelerate neural-net training and inference), GPU clock speeds of around 1370 MHz, and a 4096-bit wide HBM2 memory interface with 900 GB/s of memory bandwidth. The 815 mm² GPU has a gargantuan transistor count of 21 billion. NVIDIA is taking institutional orders for the V100 PCIe, and the card will be available a little later this year. HPE will develop three HPC rigs with the cards pre-installed.
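
Those specifications imply a peak single-precision throughput of roughly 14 TFLOP/s, counting one fused multiply-add (two FLOP) per CUDA core per cycle:

```python
# Peak FP32 throughput implied by the announced V100 PCIe specs,
# counting one FMA (2 FLOP) per CUDA core per clock.
cuda_cores = 5_120
clock_hz = 1.370e9   # "around 1370 MHz"
flop_per_core_per_cycle = 2

peak = cuda_cores * clock_hz * flop_per_core_per_cycle
print(f"{peak / 1e12:.1f} TFLOP/s")  # -> 14.0 TFLOP/s
```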

Could This be the NVIDIA TITAN Volta?

NVIDIA, which unveiled its faster "Volta" GPU architecture at its 2017 GPU Technology Conference (GTC), beginning with the HPC product Tesla V100, is closer to launching the consumer graphics variant, the TITAN Volta. A curious-looking graphics card image with "TITAN" markings surfaced on Reddit. One could discount the pic as being of a well-made cooler mod, until you take a peek at the PCB. It appears to lack SLI fingers where you'd expect them to be, and instead has NVLink fingers in the positions found on the PCIe add-in card variant of the Tesla P100 HPC accelerator.

You might think "alright, it's not a fancy TITAN X Pascal cooler mod, but it could be a P100 with a cooler mod," until you notice the power connectors - it has two power inputs on top of the card (where they're typically found on NVIDIA's consumer graphics cards), and not at the rear portion of the card (where the P100 has them, and where they're typically found on Tesla and Quadro series products). Whoever pulled this off has done an excellent job either way - of scoring a potential TITAN Volta sample, or of modding some other card to look very plausibly like a TITAN Volta.

NVIDIA's Volta Reportedly Poised for Anticipated, Early Q3 2017 Launch

According to a report from Chinese website MyDrivers, NVIDIA is looking to spruce up its line-up with a much-earlier-than-expected Q3 Volta launch. Remember that Volta was expected, according to NVIDIA's own road-maps, to launch around early 2018. The report indicates that NVIDIA's Volta products - apparently to be marketed as the GeForce 20-series - will see an early launch due to market demand, and NVIDIA's intention to further increase pricing of its products through a new-generation launch.

These stand, for now, as only rumors (and not the first time they've surfaced, at that), but they paint a pretty interesting picture nonetheless. Like Intel with its Coffee Lake series, pushing a product launch earlier than expected has consequences: production, logistics, infrastructure, product roadmaps, and stock of existing previous-generation products must all be taken into account. And with NVIDIA having just recently introduced its performance-champion GTX 1080 Ti and Titan Xp graphics cards, all of this seems a trigger pull too early - especially when taking into account the competitive landscape in high-performance graphics, which is akin to a single green-colored banner poised atop the Himalayas. And NVIDIA must not forget that AMD could be pulling a black swan out of its engineering department with Vega, as it did with its Ryzen series of CPUs.

NVIDIA, Microsoft Launch Industry-Standard Hyperscale GPU Accelerator

NVIDIA with Microsoft today unveiled blueprints for a new hyperscale GPU accelerator to drive AI cloud computing. Providing hyperscale data centers with a fast, flexible path for AI, the new HGX-1 hyperscale GPU accelerator is an open-source design released in conjunction with Microsoft's Project Olympus.

HGX-1 does for cloud-based AI workloads what ATX -- Advanced Technology eXtended -- did for PC motherboards when it was introduced more than two decades ago. It establishes an industry standard that can be rapidly and efficiently embraced to help meet surging market demand. The new architecture is designed to meet the exploding demand for AI computing in the cloud -- in fields such as autonomous driving, personalized healthcare, superhuman voice recognition, data and video analytics, and molecular simulations.