
NVIDIA Discusses the Revenue-Generating Potential of AI Factories

T0@st

News Editor
AI is creating value for everyone—from researchers in drug discovery to quantitative analysts navigating financial market changes. The faster an AI system can produce tokens, a unit of data used to string together outputs, the greater its impact. That's why AI factories are key, providing the most efficient path from "time to first token" to "time to first value." AI factories are redefining the economics of modern infrastructure. They produce intelligence by transforming data into valuable outputs—whether tokens, predictions, images, proteins or other forms—at massive scale.

They help enhance three key aspects of the AI journey—data ingestion, model training and high-volume inference. AI factories are being built to generate tokens faster and more accurately, using three critical technology stacks: AI models, accelerated computing infrastructure and enterprise-grade software. Read on to learn how AI factories are helping enterprises and organizations around the world convert the most valuable digital commodity—data—into revenue potential.




From Inference Economics to Value Creation
Before building an AI factory, it's important to understand the economics of inference: how to balance costs, energy efficiency and an increasing demand for AI. Throughput refers to the volume of tokens that a model can produce in a given amount of time. Latency is how long the model takes to respond, which is often measured in time to first token (how long it takes before the first output appears) and time per output token (how fast each additional token comes out). Goodput is a newer metric, measuring how much useful output a system can deliver while hitting key latency targets.
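To make these metrics concrete, here is a minimal Python sketch that computes them from per-request timing records. The `RequestTrace` layout, the function names and the example latency targets are assumptions made for illustration; they do not come from NVIDIA or from any particular benchmarking tool.

```python
from dataclasses import dataclass


@dataclass
class RequestTrace:
    """Timing of one inference request (hypothetical record layout)."""
    start_s: float        # when the request arrived
    first_token_s: float  # when the first output token appeared
    end_s: float          # when the last output token appeared
    output_tokens: int    # number of tokens generated


def ttft(r: RequestTrace) -> float:
    """Time to first token, in seconds."""
    return r.first_token_s - r.start_s


def tpot(r: RequestTrace) -> float:
    """Average time per output token after the first one, in seconds."""
    if r.output_tokens <= 1:
        return 0.0
    return (r.end_s - r.first_token_s) / (r.output_tokens - 1)


def window_s(traces: list[RequestTrace]) -> float:
    """Length of the measurement window covered by the traces."""
    return max(r.end_s for r in traces) - min(r.start_s for r in traces)


def throughput(traces: list[RequestTrace]) -> float:
    """All output tokens per second over the measurement window."""
    return sum(r.output_tokens for r in traces) / window_s(traces)


def goodput(traces: list[RequestTrace], max_ttft_s: float, max_tpot_s: float) -> float:
    """Tokens per second, counting only requests that met both latency targets."""
    good = sum(
        r.output_tokens
        for r in traces
        if ttft(r) <= max_ttft_s and tpot(r) <= max_tpot_s
    )
    return good / window_s(traces)


# Two made-up requests: one answers in half a second, one in five seconds.
traces = [
    RequestTrace(start_s=0.0, first_token_s=0.5, end_s=2.5, output_tokens=101),
    RequestTrace(start_s=0.0, first_token_s=5.0, end_s=10.0, output_tokens=101),
]
total_tps = throughput(traces)
good_tps = goodput(traces, max_ttft_s=1.0, max_tpot_s=0.05)
```

In this made-up example both requests count toward throughput, but only the fast one counts toward goodput, because the slow one misses the one-second time-to-first-token target.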


User experience is key for any software application, and the same goes for AI factories. High throughput means smarter AI, and lower latency ensures timely responses. When both of these measures are balanced properly, AI factories can provide engaging user experiences by quickly delivering helpful outputs. For example, an AI-powered customer service agent that responds in half a second is far more engaging and valuable than one that responds in five seconds, even if both ultimately generate the same number of tokens in the answer. Companies can take the opportunity to place competitive prices on their inference output, resulting in more revenue potential per token. Measuring and visualizing this balance can be difficult—which is where the concept of a Pareto frontier comes in.

AI Factory Output: The Value of Efficient Tokens
The Pareto frontier, represented in the figure below, helps visualize the most optimal ways to balance trade-offs between competing goals—like faster responses vs. serving more users simultaneously—when deploying AI at scale.



The vertical axis represents throughput efficiency, measured in tokens per second (TPS), for a given amount of energy used. The higher this number, the more requests an AI factory can handle concurrently. The horizontal axis represents the TPS for a single user, indicating how quickly the model delivers its answer to that user's prompt. The higher the value, the better the expected user experience. Lower latency and faster response times are generally desirable for interactive applications like chatbots and real-time analysis tools.
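As a rough illustration of how such a frontier can be extracted from measurements, the sketch below keeps only the operating configurations that no other configuration beats on both axes. The data points are invented, and treating the vertical axis as TPS per megawatt is an assumption; the article only says "for a given amount of energy used."

```python
def pareto_frontier(points: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Return the points not dominated by any other point.

    Each point is (tps_per_user, tps_per_mw); higher is better on both axes.
    """
    frontier = []
    best_y = float("-inf")
    # Visit points from highest to lowest tps_per_user. A point belongs on the
    # frontier only if its tps_per_mw beats every point already visited.
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            frontier.append((x, y))
            best_y = y
    return frontier[::-1]  # ascending tps_per_user


# Hypothetical configurations: (TPS per user, TPS per megawatt).
configs = [
    (10, 900), (20, 800), (20, 600), (50, 500),
    (40, 300), (100, 200), (80, 100),
]
frontier = pareto_frontier(configs)
```

Here `(40, 300)` drops out because `(50, 500)` is better on both axes, while `(10, 900)` and `(100, 200)` both stay: each is the best available choice at one end of the trade-off.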

The Pareto frontier's maximum value (shown as the top value of the curve) represents the best output for given sets of operating configurations. The goal is to find the optimal balance between throughput and user experience for different AI workloads and applications. The best AI factories use accelerated computing to increase tokens per watt, optimizing AI performance while dramatically increasing energy efficiency across AI factories and applications.

We have tracked user experiences when running on NVIDIA H100 GPUs configured to run at 32 tokens per second per user, versus NVIDIA B300 (Blackwell Ultra) GPUs running at 344 tokens per second per user. At the configured user experience, Blackwell Ultra delivers over a 10x better experience and almost 5x higher throughput, enabling up to 50x higher revenue potential.
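One plausible reading of the 50x figure is that it multiplies the two gains together. The article does not spell out the methodology, so the multiplication below is an assumption; only the 32 and 344 tokens-per-second values and the "almost 5x" throughput claim come from the text.

```python
h100_tps_per_user = 32   # from the article
b300_tps_per_user = 344  # from the article

# Per-user speedup: 344 / 32 = 10.75, i.e. "over a 10x better experience".
user_experience_gain = b300_tps_per_user / h100_tps_per_user

# "Almost 5x higher throughput" per the article; 5.0 is a round upper bound.
throughput_gain = 5.0

# Assumption: revenue potential scales with both factors multiplied together.
revenue_gain_rounded = 10 * 5                                   # 50
revenue_gain_unrounded = user_experience_gain * throughput_gain  # 53.75
```

With the rounded factors the product is 50, which matches the headline claim; with the unrounded per-user ratio it comes out slightly higher.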

How an AI Factory Works in Practice
An AI factory is a system of components that come together to turn data into intelligence. It doesn't necessarily take the form of a high-end, on-premises data center, but could be an AI-dedicated cloud or hybrid model running on accelerated compute infrastructure. Or it could be a telecom infrastructure that can both optimize the network and perform inference at the edge. Any dedicated accelerated computing infrastructure paired with software turning data into intelligence through AI is, in practice, an AI factory.

The components include accelerated computing, networking, software, storage, systems, and tools and services. When a person prompts an AI system, the full stack of the AI factory goes to work. The factory tokenizes the prompt, turning data into small units of meaning, like fragments of images, sounds and words. Each token is put through a GPU-powered AI model, which performs compute-intensive reasoning to generate the best response. Each GPU performs parallel processing, enabled by high-speed networking and interconnects, to crunch data simultaneously. An AI factory will run this process for different prompts from users across the globe. This is real-time inference, producing intelligence at industrial scale.
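The prompt-to-response flow described above can be sketched with a toy example. Everything here is a stand-in: production systems use subword tokenizers and large neural networks running on GPUs, whereas this sketch uses a whitespace tokenizer and a hypothetical `toy_model` that picks the next token with simple arithmetic. It only shows the shape of the loop (tokenize, then generate one token at a time).

```python
def tokenize(prompt: str, vocab: dict[str, int]) -> list[int]:
    """Toy whitespace tokenizer: map each word to an integer ID,
    adding unseen words to the vocabulary as it goes."""
    ids = []
    for word in prompt.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
        ids.append(vocab[word])
    return ids


def toy_model(context: list[int], vocab_size: int) -> int:
    """Stand-in for a real model: deterministically pick a next-token ID."""
    return (sum(context) + len(context)) % vocab_size


def generate(prompt: str, vocab: dict[str, int], max_new_tokens: int = 3) -> list[int]:
    """Tokenize the prompt, then produce output tokens one at a time,
    feeding each new token back in as context for the next step."""
    context = tokenize(prompt, vocab)
    output: list[int] = []
    for _ in range(max_new_tokens):
        output.append(toy_model(context + output, len(vocab)))
    return output


vocab: dict[str, int] = {}
prompt_ids = tokenize("Turn data into data", vocab)
response_ids = generate("Turn data into data", vocab)
```

The one-token-at-a-time loop is why time to first token and time per output token are measured separately: the first depends on processing the whole prompt, the second on each later step.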



Because AI factories unify the full AI lifecycle, this system is continuously improving: inference is logged, edge cases are flagged for retraining and optimization loops tighten over time—all without manual intervention, an example of goodput in action. Leading global security technology company Lockheed Martin has built its own AI factory to support diverse uses across its business. Through its Lockheed Martin AI Center, the company centralized its generative AI workloads on the NVIDIA DGX SuperPOD to train and customize AI models, use the full power of specialized infrastructure and reduce the overhead costs of cloud environments.

"With our on-premises AI factory, we handle tokenization, training and deployment in house," said Greg Forrest, director of AI foundations at Lockheed Martin. "Our DGX SuperPOD helps us process over 1 billion tokens per week, enabling fine-tuning, retrieval-augmented generation or inference on our large language models. This solution avoids the escalating costs and significant limitations of fees based on token usage."
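For a sense of scale, the quoted figure can be converted into an average rate. The per-token price below is purely hypothetical and is included only to show how usage-based fees grow with volume; it is not a quote from any provider or from the article.

```python
TOKENS_PER_WEEK = 1_000_000_000      # "over 1 billion tokens per week" (from the quote)
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800

# Average sustained rate implied by the quote: roughly 1,653 tokens per second.
avg_tokens_per_second = TOKENS_PER_WEEK / SECONDS_PER_WEEK

# Hypothetical metered price, NOT from the article.
HYPOTHETICAL_USD_PER_MILLION_TOKENS = 10.0
weekly_metered_cost_usd = (
    TOKENS_PER_WEEK / 1_000_000 * HYPOTHETICAL_USD_PER_MILLION_TOKENS
)
```

Because the quote says "over" 1 billion, the rate is a lower bound, and real traffic is unlikely to be spread evenly across the week.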

NVIDIA Full-Stack Technologies for AI Factory
An AI factory transforms AI from a series of isolated experiments into a scalable, repeatable and reliable engine for innovation and business value. NVIDIA provides all the components needed to build AI factories, including accelerated computing, high-performance GPUs, high-bandwidth networking and optimized software.

NVIDIA Blackwell GPUs, for example, can be connected via networking, liquid-cooled for energy efficiency and orchestrated with AI software.

The NVIDIA Dynamo open-source inference platform offers an operating system for AI factories. It's built to accelerate and scale AI with maximum efficiency and minimum cost. By intelligently routing, scheduling and optimizing inference requests, Dynamo aims to keep every GPU cycle fully utilized, driving token production with peak performance.

NVIDIA Blackwell GB200 NVL72 systems and NVIDIA InfiniBand networking are tailored to maximize token throughput per watt, making the AI factory highly efficient from both total throughput and low latency perspectives.

By validating optimized, full-stack solutions, organizations can build and maintain cutting-edge AI systems efficiently. A full-stack AI factory supports enterprises in achieving operational excellence, enabling them to harness AI's potential faster and with greater confidence.

 
AI will create even more value if and when competition from AMD, Intel and Chinese GPU divisions starts existing and pushes down GPU prices.
 
I think NVidia is trying to sell a solution for a problem that doesn't exist. Of course, there are some very legit uses for AI...
Well, if there is money involved, any money, of any kind whatsoever, then nGreediya is on it, in a friggin heartbeat (and will NEVER go away)
 
What's the corporate end game though? Replace all workers with Ai and robots. No human worker earns any money anymore, who will be buying all the iPhones and GeForce graphic cards then if no one can even earn any money? Everyone can't be "irreplaceable" execs. It's like they didn't think this "revolution" through. It's already bad as it is and we're not even where corporations want it...
 
I think NVidia is trying to sell a solution for a problem that doesn't exist. Of course, there are some very legit uses for AI...
Unfortunately, it's going to cause more problems than it solves. It's already being used to manufacture deepfakes & other nonsensical material. It's going to turn into AI chasing AI. If there ever was a business to fuel its own demand, AI is probably going to be the king.

What's the corporate end game though? Replace all workers with Ai and robots. No human worker earns any money anymore, who will be buying all the iPhones and GeForce graphic cards then if no one can even earn any money? Everyone can't be "irreplaceable" execs. It's like they didn't think this "revolution" through. It's already bad as it is and we're not even where corporations want it...
It's no different than thinking it was a good idea to migrate just about everything from paper to data drives & such. It's the age where convenience outweighs prudence. We still haven't hardened our electric grid against CMEs & EMPs, probably one of the most important infrastructures of every modern country. If there's ever to be a time period that ends up completely disappearing from history, it will be the computer age.
AI won't ever replace humans. Instead, I think humans will come close to destroying themselves with AI.
 
What's the corporate end game though? Replace all workers with Ai and robots. No human worker earns any money anymore, who will be buying all the iPhones and GeForce graphic cards then if no one can even earn any money? Everyone can't be "irreplaceable" execs. It's like they didn't think this "revolution" through. It's already bad as it is and we're not even where corporations want it...

The train has departed FA station and is accelerating toward FO. The chances of the whole world unanimously agreeing to stop (and abiding by that decision) are close to zero. The end game will likely land somewhere between dystopia and extinction. Sorry if that sounds pessimistic, but if anyone thinks we'll get the Utopia ending they need to consider how we could possibly arrive there without first moving toward it.
 
What's the corporate end game though? Replace all workers with Ai and robots. No human worker earns any money anymore, who will be buying all the iPhones and GeForce graphic cards then if no one can even earn any money? Everyone can't be "irreplaceable" execs. It's like they didn't think this "revolution" through. It's already bad as it is and we're not even where corporations want it...
If it gets to that, which current AI has no chance of doing, then money just becomes irrelevant. We won't need to pay for anything because there's no humans in production that need paying. It's circular: Money is a tool to get stuff made. If it all gets made without human input then there is no need to pay anyone and therefore no need for money.

But, given that scenario, the bigger question becomes what's there to stop megalomaniacs from repeatedly pressing the kill-everyone-I-dislike button. If there is no need for labor to build everything then why would everybody be kept around at all?
 
What god wants, god gets.
 
What's the corporate end game though? Replace all workers with Ai and robots. No human worker earns any money anymore, who will be buying all the iPhones and GeForce graphic cards then if no one can even earn any money? Everyone can't be "irreplaceable" execs. It's like they didn't think this "revolution" through. It's already bad as it is and we're not even where corporations want it...
If AI earns all the money for me not having to work and to have 24 hours a day for my hobbies, then I'm all in. Time to do away with the society where everything's value is measured in either dollars or working hours.

Not that I think Nvidia's way is necessarily the way (AI is still just a fancy buzzword for LLM), but I'll leave that for the experts to decide.
 
If AI earns all the money for me not having to work and to have 24 hours a day for my hobbies, then I'm all in. Time to do away with the society where everything's value is measured in either dollars or working hours.

Not that I think Nvidia's way is necessarily the way (AI is still just a fancy buzzword for LLM), but I'll leave that for the experts to decide.

I think the philosophical argument is that aren't we, me, you, and him, and her, just LLM's too? Every thought and interaction I have with the world is due to language. Passionate kiss? Thoughts of words still are why that exists and happens. So, who is to say an LLM is not sentient?
 
I think the philosophical argument is that aren't we, me, you, and him, and her, just LLM's too? Every thought and interaction I have with the world is due to language. Passionate kiss? Thoughts of words still are why that exists and happens. So, who is to say an LLM is not sentient?
Mixing words in a special way to make them (appear like they) convey meaning and actually understanding the object of meaning are entirely different things. If this wasn't the case, then animals and toddlers would have no means to learn and understand the world around them, as they do not possess any language skills.

You have feelings, sensations, and memories of them. Sure, you associate them with language elements in your mind, but you wouldn't be able to do that if you didn't have those memories in the first place.

I think we're straying from the topic, though. :)
 
What's the corporate end game though? Replace all workers with Ai and robots. No human worker earns any money anymore, who will be buying all the iPhones and GeForce graphic cards then if no one can even earn any money? Everyone can't be "irreplaceable" execs. It's like they didn't think this "revolution" through. It's already bad as it is and we're not even where corporations want it...
IMG_0180.jpeg


I don’t think anyone has even contemplated what wide AI adoption, replacing vast numbers of workers will do to the society. And by the fact that tech company leaders are now paraded as the pillars of our society, I bet all the downsides will be swept under the rug for as long as they will be able to.
 
Nvidia discusses revenue generation for Nvidia.
 
If AI earns all the money for me not having to work and to have 24 hours a day for my hobbies, then I'm all in.
The funny thing with technological advancements is that the original plot was to make our lives easier, to the point where we indeed didn't have to work.

The only problem with that is that we also created inflation to keep us working to get more, since the money keeps getting devalued.

I hope that we reach the Star Trek/The Orville timeline (before destroying ourselves) where material possessions are no longer our priority.
 
The king of AI trying to sell you AI, nothing dubious about that.
 
I think NVidia is trying to sell a solution for a problem that doesn't exist. Of course, there are some very legit uses for AI...

AI in general is that. The good use cases are niche. Someone had a theory that basically said Big Tech doesn't really have a Next Big Thing, and endless growth demands constant new things. That is why AI is being forced everywhere. Plus at this point there are massive investments made into hardware but no obvious way to profit other than "everyone on earth has to use it" but why people would want to ... is not clear. Hence the forcing.

If AI earns all the money for me not having to work and to have 24 hours a day for my hobbies, then I'm all in. Time to do away with the society where everything's value is measured in either dollars or working hours.

You don't get it. You would be out of a job and hence have no money. Remember Sam Altman saying "we may have to rewrite the social contract"? The money goes to the execs, not people like you.
I think the philosophical argument is that aren't we, me, you, and him, and her, just LLM's too? Every thought and interaction I have with the world is due to language. Passionate kiss? Thoughts of words still are why that exists and happens. So, who is to say an LLM is not sentient?

You think in words? Weirdo. Also, no. Absolutely not.
 
The only reason for AI is to sell more Nvidia video cards… works out really well for Nvidia…
(ask Google “should I use vibe programming in a job interview?”…) (nothing to do with this topic but IMO, vibe programming is a cheat…)
 
When is sexy android coming out guys :laugh:
 