
NVIDIA Triton Inference Server Running A100 Tensor Core GPUs Boosts Bing Advert Delivery

T0@st

News Editor
Inference software enables shift to NVIDIA A100 Tensor Core GPUs, delivering 7x throughput for the search giant. Jiusheng Chen's team just got accelerated. They're delivering personalized ads to users of Microsoft Bing with 7x throughput at reduced cost, thanks to NVIDIA Triton Inference Server running on NVIDIA A100 Tensor Core GPUs. It's an amazing achievement for the principal software engineering manager and his crew.

Tuning a Complex System
Bing's ad service uses hundreds of models that are constantly evolving. Each must respond to a request within as little as 10 milliseconds, about 10x faster than the blink of an eye. The latest speedup got its start with two innovations the team delivered to make AI models run faster: Bang and EL-Attention. Together, they apply sophisticated techniques to do more work in less time with less computer memory. Model training was based on Azure Machine Learning for efficiency.




Flying With NVIDIA A100 MIG
Next, the team upgraded the ad service from NVIDIA T4 to A100 GPUs. The latter's Multi-Instance GPU (MIG) feature lets users split one GPU into several instances. Chen's team maxed out the MIG feature, partitioning one physical A100 into seven independent instances. That let the team reap 7x throughput per GPU with inference responses in 10 ms.
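As a rough illustration of the arithmetic behind that figure, here is a minimal Python sketch. It assumes each MIG instance serves one request at a time at the quoted 10 ms latency, with no batching or queuing; the function name and the serving model are hypothetical and not taken from the source.

```python
def aggregate_throughput(instances: int, latency_ms: float) -> float:
    """Requests per second across all instances.

    Assumes (hypothetically) that each instance handles one request at a
    time and every request takes exactly latency_ms to complete.
    """
    if instances < 1 or latency_ms <= 0:
        raise ValueError("instances must be >= 1 and latency_ms must be > 0")
    per_instance_rps = 1000.0 / latency_ms  # 1000 ms in one second
    return instances * per_instance_rps


# One unpartitioned GPU versus one A100 split into seven MIG instances,
# both held to the 10 ms response time quoted in the article.
single = aggregate_throughput(instances=1, latency_ms=10.0)
mig = aggregate_throughput(instances=7, latency_ms=10.0)
speedup = mig / single

print(f"single: {single:.0f} req/s, seven instances: {mig:.0f} req/s, speedup: {speedup:.0f}x")
```

Under these simplified assumptions the gain scales linearly with the number of instances. Real deployments will differ depending on model size, batching and how much memory each instance receives.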

Flexible, Easy, Open Software
Triton enabled the shift, in part, because it lets users simultaneously run different runtime software, frameworks and AI models on isolated instances of a single GPU. The inference software comes in a software container, so it's easy to deploy. And open-source Triton - also available with enterprise-grade security and support through NVIDIA AI Enterprise - is backed by a community that makes the software better over time.

Accelerating Bing's ad system with Triton on A100 GPUs is one example of what Chen likes about his job. He gets to witness breakthroughs with AI.

While the scenarios often change, the team's goal remains the same - creating a win for its users and advertisers.

 
Oh yay and just when I was getting worried about the viability of advertising, in swoops our savior.
 
Try out the new & improved Bing!
Now serving you even more customized ads, seven times more efficiently :clap:
 
The only way AI can make Bing advertising better is a Skynet situation where Bing is the target.
 
Mankind creating the most advanced chips to date, only to use them to serve ads better.

This world is doomed.

That is what it's all about. I look at the new Apple AR ski goggles and all I see is a more efficient ad delivery system. All silicon things are quickly becoming that: ad delivery platforms.
 
If you have a fully functional AI system you can replace all the human workers. Profit because that's where less salary overall is paid out.

AI is going to kill a lot of jobs. All the human interaction will be gone in the future. All led by AI models.
 
Like steam power. Like electrification. Like air conditioning. Like computers. And so on...
 