
Intel and Weizmann Institute Speed AI with Speculative Decoding Advance
At the International Conference on Machine Learning (ICML) in Vancouver, Canada, researchers from Intel Labs and the Weizmann Institute of Science introduced a major advance in speculative decoding. The new technique enables any small "draft" model to accelerate any large language model (LLM), regardless of vocabulary differences. "We have solved a core inefficiency in generative AI. Our research shows how to turn speculative acceleration into a universal tool. This isn't just a theoretical improvement; these are practical tools that are already helping developers build faster and smarter applications today," said Oren Pereg, senior researcher, Natural Language Processing Group, Intel Labs.
Speculative decoding is an inference optimization technique designed to make LLMs faster and more efficient without compromising accuracy. It works by pairing a small, fast model with a larger, more accurate one, creating a "team effort" between models. Consider a prompt to an AI model: "What is the capital of France…" A traditional LLM generates each word step by step: it fully computes "Paris," then "a," then "famous," then "city," and so on, consuming significant resources at every step. With speculative decoding, the small assistant model quickly drafts the full phrase "Paris, a famous city…" and the large model then verifies the sequence, dramatically reducing the compute cycles per output token.
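To make the draft-and-verify flow concrete, here is a minimal Python sketch of the basic speculative decoding loop. It is an illustration only, not Intel's or the Weizmann Institute's implementation: the names speculative_decode, draft_model, target_model, and toy_model are hypothetical, the "models" are simple stand-in functions, and verification is done by exact token match rather than the probabilistic acceptance used in real systems.

def speculative_decode(prompt, draft_model, target_model, num_tokens=8, draft_len=4):
    """Generate num_tokens continuation tokens using a draft/verify loop.

    draft_model and target_model are assumed to be callables that take a
    token sequence and return the next token (a toy interface, not a real API).
    """
    tokens = list(prompt)
    while len(tokens) - len(prompt) < num_tokens:
        # 1. The small draft model cheaply proposes a block of candidate tokens.
        draft = []
        for _ in range(draft_len):
            draft.append(draft_model(tokens + draft))

        # 2. The large target model verifies the drafted block. A real system
        #    scores all drafted positions in a single forward pass; this toy
        #    version checks them one at a time for clarity.
        accepted = []
        for proposed in draft:
            expected = target_model(tokens + accepted)
            if proposed == expected:
                accepted.append(proposed)   # draft token matches: accept it
            else:
                accepted.append(expected)   # first mismatch: take the target's
                break                       # token and discard the rest

        tokens.extend(accepted)
    return tokens[len(prompt):][:num_tokens]


# Toy usage: both "models" deterministically continue the same sentence, so
# every drafted token is accepted and the large model does far fewer steps.
SENTENCE = "Paris , a famous city and the capital of France .".split()

def toy_model(tokens):
    return SENTENCE[len(tokens) % len(SENTENCE)]

print(speculative_decode([], toy_model, toy_model, num_tokens=6))
# ['Paris', ',', 'a', 'famous', 'city', 'and']

The speedup comes from step 2: when the draft model's guesses are accepted, the large model effectively validates several tokens for roughly the cost of one of its own generation steps.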