News Posts matching #Triton

Meta Announces New MTIA AI Accelerator with Improved Performance to Ease NVIDIA's Grip

Meta has announced the next generation of its Meta Training and Inference Accelerator (MTIA) chip, which is designed to run AI training and inference at scale. The newest MTIA chip is the second generation of Meta's custom AI silicon, and it is built on TSMC's 5 nm process. Running at 1.35 GHz, the new chip has a TDP of 90 Watts per package, up from just 25 Watts for the first-generation design. The chip shines at Basic Linear Algebra Subprograms (BLAS) workloads, which include matrix multiplication and vector/SIMD processing. For GEMM matrix processing, each chip delivers 708 TeraFLOPS at INT8 (the spec presumably means FP8) with sparsity and 354 TeraFLOPS without, and 354 TeraFLOPS at FP16/BF16 with sparsity and 177 TeraFLOPS without.

Classical vector processing is much slower, at 11.06 TeraFLOPS at INT8 (FP8), 5.53 TeraFLOPS at FP16/BF16, and 2.76 TeraFLOPS at single-precision FP32. The MTIA chip is specifically designed to run AI training and inference on Meta's PyTorch AI framework, with an open-source Triton backend that generates compiled code for optimal performance. Meta uses this for all its Llama models, and with Llama3 just around the corner, it could be trained on these chips. To package it into a system, Meta puts two of these chips onto a board and pairs them with 128 GB of LPDDR5 memory. The board is connected via PCIe Gen 5 to a system where 12 boards are stacked densely. This arrangement is repeated six times in a single rack, for 72 boards and 144 chips and a total of 101.95 PetaFLOPS, assuming linear scaling at INT8 (FP8) precision. Of course, perfectly linear scaling is rarely achievable in scale-out systems, which could bring the real figure to under 100 PetaFLOPS per rack.
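
The rack-level figure follows directly from the per-chip numbers quoted above, and the arithmetic can be sketched in a few lines of Python. This is only an illustration of the calculation: the function names and the scaling_efficiency parameter are hypothetical and are not taken from Meta's materials, and the 0.98 efficiency in the usage example is an arbitrary value chosen to show how a small scaling loss pushes the total below 100 PetaFLOPS.

```python
# Per-chip and packaging figures as quoted in the article.
GEMM_TFLOPS_8BIT_SPARSE = 708.0  # INT8 (FP8) GEMM with sparsity, per chip
CHIPS_PER_BOARD = 2
BOARDS_PER_SYSTEM = 12
SYSTEMS_PER_RACK = 6


def chips_per_rack() -> int:
    """Number of MTIA chips in one rack: chips x boards x systems."""
    return CHIPS_PER_BOARD * BOARDS_PER_SYSTEM * SYSTEMS_PER_RACK


def rack_peak_pflops(per_chip_tflops: float, scaling_efficiency: float = 1.0) -> float:
    """Peak rack throughput in PetaFLOPS.

    scaling_efficiency is a hypothetical knob (1.0 = perfectly linear
    scaling); it is an assumption for illustration, not a measured value.
    """
    total_tflops = per_chip_tflops * chips_per_rack() * scaling_efficiency
    return total_tflops / 1000.0  # 1 PetaFLOPS = 1000 TeraFLOPS


if __name__ == "__main__":
    print(f"Chips per rack: {chips_per_rack()}")
    print(f"Linear scaling: {rack_peak_pflops(GEMM_TFLOPS_8BIT_SPARSE):.2f} PFLOPS")
    # Assumed 98% efficiency, purely to illustrate a sub-linear case.
    print(f"At 98% efficiency: {rack_peak_pflops(GEMM_TFLOPS_8BIT_SPARSE, 0.98):.2f} PFLOPS")
```

With linear scaling this reproduces the 144 chips and roughly 101.95 PetaFLOPS stated in the article. Any efficiency below about 98% brings the total under 100 PetaFLOPS.
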
Below, you can see images of the chip floorplan, specifications compared to the prior version, as well as the system.

Acer Unleashes New Predator Triton Neo 16 with Intel Core Ultra Processors

Acer today announced the new Predator Triton Neo 16 (PTN16-51) gaming laptop, designed with the new Intel Core Ultra processors with dedicated AI acceleration capabilities and NVIDIA GeForce RTX 40 Series GPUs that support demanding games and creative applications. Players and content creators can marvel at enhanced video game scenes and designs on the laptop's Calman-Verified 16-inch display, which offers up to a stunning 3.2K resolution and a 165 Hz refresh rate and produces accurate colors right out of the box.

The state-of-the-art cooling system combines a 5th Gen AeroBlade fan and liquid metal thermal grease on the CPU to keep the laptop running at full steam, while users stay on top of communications and device management thanks to the AI-enhanced Acer PurifiedVoice 2.0 software and the PredatorSense utility app. This Windows 11 gaming PC also provides players with amazing performance experiences and one month of Xbox Game Pass for access to hundreds of high-quality PC games.

NVIDIA Triton Inference Server Running on A100 Tensor Core GPUs Boosts Bing Advert Delivery

Inference software enables shift to NVIDIA A100 Tensor Core GPUs, delivering 7x throughput for the search giant. Jiusheng Chen's team just got accelerated. They're delivering personalized ads to users of Microsoft Bing with 7x throughput at reduced cost, thanks to NVIDIA Triton Inference Server running on NVIDIA A100 Tensor Core GPUs. It's an amazing achievement for the principal software engineering manager and his crew.

Tuning a Complex System
Bing's ad service uses hundreds of models that are constantly evolving. Each must respond to a request within as little as 10 milliseconds, about 10x faster than the blink of an eye. The latest speedup got its start with two innovations the team delivered to make AI models run faster: Bang and EL-Attention. Together, they apply sophisticated techniques to do more work in less time with less computer memory. Model training was based on Azure Machine Learning for efficiency.

Acer Introduces New Predator Triton 17 X and Predator Helios Neo 16 Gaming Laptops

Acer today announced major updates to its gaming laptop line-up, with new designs and support for industry-leading technology: 13th Gen Intel Core processors and NVIDIA GeForce RTX 40 Series GPUs. The new Predator Triton 17 X and Predator Helios Neo 16 are made for gamers and creators alike, due to their powerful components and premium display options. The thin and powerful Predator Triton 14 is designed for work or play, while its sleek, understated design makes it ideal for any environment. Lastly, the Predator Helios 3D 15 SpatialLabs Edition brings glasses-free, stereoscopic 3D to the world of gaming, infusing new life into more than 70 modern and classic titles, with more being continually added.

Acer also announced new Predator Triton 14 and Predator Helios 3D 15 SpatialLabs Edition laptops with 13th Gen Intel Core processors and NVIDIA GeForce RTX 40 Series GPUs.
May 13th, 2024 18:45 EDT
