
Qualcomm's Success with Windows AI PC Drawing NVIDIA Back to the Client SoC Business

Joined
Sep 17, 2014
Messages
21,360 (6.00/day)
Location
The Washing Machine
Processor i7 8700k 4.6Ghz @ 1.24V
Motherboard AsRock Fatal1ty K6 Z370
Cooling beQuiet! Dark Rock Pro 3
Memory 16GB Corsair Vengeance LPX 3200/C16
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Samsung 850 EVO 1TB + Samsung 830 256GB + Crucial BX100 250GB + Toshiba 1TB HDD
Display(s) Gigabyte G34QWC (3440x1440)
Case Fractal Design Define R5
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse XTRFY M42
Keyboard Lenovo Thinkpad Trackpoint II
Software W10 x64
There is a big assumption there that Qualcomm's Windows AI PCs will be a success. For now it is just big hype: no one knows how many units will be sold, whether this generation will be a success, or whether any success will carry over to future generations.
Well, if they start peddling $450 keyboards it won't go places fast lol

CPU performance isn’t everything, you know. The SoC also has to be affordable and practical. Just because it doesn’t have 16 Zen 4 P-cores and a 500 W GPU doesn’t automatically make it crap…
For?

We already have a whole landscape of affordable and practical CPUs in a wide variety of ways. If Nvidia uses off-the-shelf cores, how will they differentiate? Probably with a dose of Nvidia sauce, so they can present some USP around their AI capability. It's going to be a massive bunch of bullshit. Much like RT, there's going to be a black-box 'NPU performance' number to compare between constantly moving models and locally updated tools. It's going to fail spectacularly.
 
Joined
Apr 26, 2023
Messages
79 (0.19/day)
Anyway. ChatGPT is now dead, and so are Copilot and many services using the Bing/ChatGPT API.
 
Joined
Nov 10, 2008
Messages
1,986 (0.35/day)
Processor Intel Core i9 9900k @ 5.1GHZ all core load (8c 16t)
Motherboard MSI MEG Z390 ACE
Cooling Corsair H100i v2 240mm
Memory 32GB Corsair 3200mhz C16 (2x16GB)
Video Card(s) Powercolor RX 6900 XT Red Devil Ultimate (XTXH) @ 2.6ghz core, 2.1ghz mem
Storage 256GB WD Black NVME drive, 4TB across various SSDs/NVMEs, 4TB HDD
Display(s) Asus 32" PG32QUX (4k 144hz mini-LED backlit IPS with freesync & gsync & 1400 nit HDR)
Case Corsair 760T
Power Supply Corsair HX850i
Mouse Logitech G502 Lightspeed on powerplay mousemat
Keyboard Logitech G910
VR HMD Wireless Vive Pro & Valve knuckles
Software Windows 10 Pro
Nvidia could always make an NPU PCIe add-in card/chip that Dell or whoever else could add to their products. It would be decoupled from the GPU, so it could work on an Intel, AMD, or ARM-based system, and be substantially faster than the NPUs on the CPU package (it could even let Dell use non-NPU CPU packages).
 
Joined
May 19, 2011
Messages
69 (0.01/day)
Nvidia could always make an NPU PCIe add-in card/chip that Dell or whoever else could add to their products. It would be decoupled from the GPU, so it could work on an Intel, AMD, or ARM-based system, and be substantially faster than the NPUs on the CPU package (it could even let Dell use non-NPU CPU packages).

If Nvidia did this it would 1000% be a datacenter only product.
 
Joined
Nov 10, 2008
Messages
1,986 (0.35/day)
If Nvidia did this it would 1000% be a datacenter only product.
No need for basic NPUs in data centers - that's where they'll sell you a GPU farm. This could be a low-power, low-cost, high-volume part for laptops/handhelds. If they make the NPU API accessible and match it to their GPU APIs, it could be a way of building a new ecosystem for them, similar to CUDA.
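To illustrate the ecosystem idea: none of these class or function names are real NVIDIA APIs - this is a purely hypothetical Python sketch of what "one API, NPU or GPU underneath" could look like, the way CUDA abstracts different GPU generations today.

```python
# Hypothetical sketch only: Accelerator, NpuBackend, GpuBackend, and
# pick_backend are invented names, not any real NVIDIA API. The point is
# the ecosystem play: apps code against one interface, and the runtime
# dispatches to whichever accelerator the machine actually has.

from abc import ABC, abstractmethod

class Accelerator(ABC):
    """Common interface an application would code against."""
    @abstractmethod
    def infer(self, weights, inputs):
        ...

class NpuBackend(Accelerator):
    """Low-power path, e.g. a discrete NPU add-in card."""
    def infer(self, weights, inputs):
        # Toy stand-in for real inference: a dot product per input row.
        return [sum(w * x for w, x in zip(weights, row)) for row in inputs]

class GpuBackend(Accelerator):
    """Same math; in this toy model it just 'runs on the GPU' instead."""
    def infer(self, weights, inputs):
        return [sum(w * x for w, x in zip(weights, row)) for row in inputs]

def pick_backend(has_npu: bool) -> Accelerator:
    # The app never cares which silicon it gets - that's the lock-in.
    return NpuBackend() if has_npu else GpuBackend()

backend = pick_backend(has_npu=True)
print(backend.infer([1.0, 2.0], [[3.0, 4.0], [5.0, 6.0]]))  # [11.0, 17.0]
```

The value wouldn't be in any one backend but in developers targeting the shared interface, so NVIDIA silicon becomes the path of least resistance - exactly the dynamic that made CUDA sticky.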
 
Joined
May 19, 2011
Messages
69 (0.01/day)
No need for basic NPUs in data centers - that's where they'll sell you a GPU farm. This could be a low-power, low-cost, high-volume part for laptops/handhelds. If they make the NPU API accessible and match it to their GPU APIs, it could be a way of building a new ecosystem for them, similar to CUDA.

Nah, the consumer space already knows Nvidia as the gaming graphics company (GeForce/Now), whereas the datacenter knows them for compute/CUDA.

Besides, NPUs are only breaking into the client space as an SoC/GPU integration (keyword). Given that 99.9999% of today’s client devices are primarily an SoC with an optional dGPU, selling yet another discrete, dis-integrated component to a consumer is just not a bet I’d make as Nvidia.

In short, I'm not saying Nvidia would do a discrete NPU when they have GPUs, but if they did, it would almost certainly NOT be a consumer product.
 
Joined
Nov 24, 2007
Messages
46 (0.01/day)
Won't things like GPUs need ARM-specific drivers? We have one ARM-based desktop with standard PCIe expansion slots that I know of, the Mac Pro. Unlike the x86 Mac Pro it replaced, it doesn't support standard GPUs, and many other kinds of PCIe cards that work on the x86 Mac are not compatible with the ARM Mac. I don't know the ins and outs of hardware-level drivers, but wouldn't WoA desktops have a similar problem?

And yeah, I don't know that NVIDIA needs to go full-custom. They could pull the architecture off the shelf and probably get more out of it by using advanced nodes like Apple does. It sure seems like they could easily answer Snapdragon if they wanted to, and now there's a window of opportunity for such devices. It makes me wonder if MS hasn't already asked NVIDIA, and NVIDIA wasn't interested. Or maybe MS didn't want to deal with NVIDIA, I dunno.

Exactly: NVIDIA will need to produce / develop / test / ship ARM64 WoA drivers for their GPUs, which they have never done. Presumably, if they are making "AI PCs" as Jensen alludes to, they'll need to port their drivers to WoA.

Apps can be emulated, but drivers really do need to be native. NVIDIA has GPU drivers for Linux on Arm, but not Windows on Arm.

Many Arm-based systems have PCIe (e.g., datacenter), so it's not a hardware limitation (PCIe is much more abstracted from the CPU ISA). The Ampere Altra desktop is also Arm-based with PCIe expansion. Interestingly, this may be the system Linus Torvalds now uses.

//


First of all, that's just an estimate; it's also missing FP numbers, so it's barely half the story.

Meanwhile in the real world we have ~
View attachment 348510


It's easy to forget how bandwidth-starved regular Zen 4 chips are; I think I saw that analysis on Chips and Cheese. With more memory channels and/or higher-speed memory they easily pull way past Grace Hopper and Emerald (Sapphire?) Rapids as well. This is why Strix Point and Halo will be interesting to watch, and whether AMD can at least feed Zen 5 better on desktop/mobile platforms!

It's easy to forget that most SPEC testing is an "estimate". ;) We shouldn't worry: plenty of non-SPEC benchmarks are far less reliable than a well-done SPEC estimate. Very few people submit their benchmark and methodology for independent validation to get a validated SPEC score.

You seem not to understand the actual parameters of "the real world". First, Grace uses Cortex-X3-based (Neoverse V2) cores, so this comparison is moot: NVIDIA is rumored to use the Cortex-X5. Second, much of Phoronix's testing is heavily nT, so the significantly-higher-core-count 7995WX (96 cores) is also rather irrelevant, especially given the next point. Third, the 7980X and 7995WX have 350 W TDPs (and consume about that); without actual and comparable data on the GH200, this is not an interesting comparison when power draw is a key limiting factor in consumer SoCs. Fourth, Phoronix notes many times that some of the Linux benchmarks in these tests weren't optimized for AArch64 yet, so it is not much to stand on.

In the end, it's a nonsense comparison: this rumor isn't about NVIDIA trying to replace Zen 4 workstations with the GH200. NVIDIA is claimed to be making consumer APUs for Windows on Arm. Linux perf, enterprise workloads, developer workloads, scientific workloads, 300 W+ TDP perf, nT performance beyond 8-12 cores: all irrelevant here. SPEC was a much better estimate, even with only int, IMO.

But if we want to measure current Arm uArches vs Zen 4 on 100% native code, fp and int, phones vs desktops, etc., Geekbench is the last man standing. The Cortex-X4 does fine, and it's more than enough for Windows on Arm and consumer workloads, even if it's a generation behind what NVIDIA will ship; since it's only available in phones, you won't get much reliable cross-platform data otherwise.

1T Cortex-X4 smartphone: 2,287 pts - 100%
1T 7995WX workstation: 2,720 pts (or 2,702 pts) - 119%
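Sanity-checking the relative figures above (using the two quoted scores; the variable names are just for illustration):

```python
# Relative 1T Geekbench positions quoted above.
x4_phone = 2287    # 1T Cortex-X4 smartphone score
tr_7995wx = 2720   # 1T Threadripper 7995WX score (alternate run: 2702)

print(f"{x4_phone / x4_phone:.0%}")   # -> 100%
print(f"{tr_7995wx / x4_phone:.0%}")  # -> 119%
```

So even against a 350 W workstation flagship, the phone-class core is within ~20% on 1T.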

It's a good thing AMD also uses Geekbench for CPU perf estimates on consumer workloads, so I can happily skip all the usual disclaimers. We'll have to see how the Cortex-X5 lands, but I don't think NVIDIA's value proposition depends on the "fastest 1T CPU ever for WoA": it just needs to be good enough versus Intel and AMD in 2025.

//

TL;DR: We were discussing uArches for a future NVIDIA SoC on Windows for consumers, which Phoronix is miles away from capturing.
 
Joined
Apr 12, 2013
Messages
6,874 (1.68/day)
Grace uses Cortex-X3-based (Neoverse V2) cores, so this comparison is moot: NVIDIA is rumored to use the Cortex-X5.
Right, but as it stands now Zen 4 is way ahead in those workloads. I wouldn't put it past the realms of possibility that Nvidia can catch up, but they'll also be competing against Zen 5 by then; it's a moving target.
Phoronix's testing is heavily nT, so the significantly-higher-core-count 7995WX (96-cores) is also rather irrelevant
I'm not really bothered by that, since both AMD and Intel basically use a "factory OC" to get great ST numbers ~ which of course makes their efficiency look bad, like in that Phoronix test. For any consumer platform, Nvidia will have to do something not only about the opposition's massive clock advantage but also about their massive core-count advantage. I'm willing to bet they won't sell 12-16 cores cheaper than AMD at the start.
Third, the 7980X and 7995WX have 350W TDPs (and consume about that); without actual & comparable data on the GH200, this is not an interesting comparison when power draw is a key limiting factor in consumer SoCs.
Check my last point about OC. AMD's most "efficient" chips right now are either Zen 4c or have X3D cache.
Fourth, Phoronix notes many times some of their Linux benchmarks in these tests weren't optimized for AArch64 yet, so it is not much to stand on.
The workloads can be optimized further on Intel and AMD too, so it's a two-way street, although for ARM the gains should generally be higher.
 
Joined
Jun 1, 2010
Messages
309 (0.06/day)
System Name Very old, but all I've got ®
Processor So old, you don't wanna know... Really!
If NV joins the PC CPU pack, then Intel and AMD are in serious trouble
Looks like it. They've tried CPUs before, and since they couldn't get an x86 licence, they saw the potential in Qualcomm's accomplishments and might want to give it another try. So nothing prevents nVidia from entering the desktop and mobile CPU race and steadily growing their share of this market, especially with the gazillions they've made from the AI surge. Making products that suit this trend helps them even more. A CPU is a CPU, after all; there's no law that says it has to be x86.

This seems to be serious, even if it doesn't appear so for now. nVidia might want to outplay both Intel and AMD at their own duopoly game. Time will tell.
But do they really want to? The market is:

"Windows AI PCs powered by the likes of Qualcomm, Intel, and AMD, who each sell 15 W-class processors with integrated NPUs capable of 50 AI TOPS"

But Nvidia has clearly outgrown catering to lowly penny-pinching peasants: they practically don't offer low-end GPUs, and with every generation they delay their lower-end offerings more and more. And we can understand why: their server and AI mainframes are what's driving the stellar growth, not home users. Why would they all of a sudden want to deal with a market that requires low margins and vast volume?
This might not happen for volume at first. Knowing nVidia, they will put strong marketing behind their products so that the reputation precedes them, creating steady ground for their establishment. That's how nVidia beat Radeon with clearly inferior products, back in the day.

They might create the image even without actual product stock at first, which might shift attention away from x86 desktop dominance. Even if this turns out to be a flop, it might shake up the current desktop situation, as nVidia now controls the flow of the industry, and the mindshare. Who cares if the actual product is shit, if the brand recognition and stock price are spilling over the edge?

Umm no. Tegra powers the Nintendo Switch (and likely Switch 2). Selling 100M+ units is not a failure…
Indeed. This might actually be the most probable beginning. As nVidia's ARM design doesn't seem powerful enough for desktop yet, portable/mobile/handheld might be the best start. And everything the ARM cores can't handle may be achieved by nVidia's proprietary GPU compute power, if they manage to scale it to a portable form factor. And if it acquires enough success in handhelds, then with the experience gained this might transfer to the desktop as well. And then...
There is a big assumption there that Qualcomm's Windows AI PCs will be a success. For now it is just big hype: no one knows how many units will be sold, whether this generation will be a success, or whether any success will carry over to future generations.
I think this is the obvious direction for MS Windows. They made it look like a mobile OS, and it runs on mobile as well. They have a huge repository of Linux-oriented stuff, and they even made Windows somewhat compatible with Linux, which is the core of most ARM-based OSes. Why does it matter? Because MS goes after user data and wants everyone dependent on their cloud services. The best way to do that is to move everyone to a client mobile device. And while x86 has made some moves into the portable market, ARM seems to be finding success on the desktop much sooner, especially if Windows gets native support for ARM CPUs.
The Switch is also from 2017, with an SoC from 2015 :D

They don't need to. They are relying on ARM for CPU cores for now.
Also, Denver was pretty good back when Nvidia was trying to cook up their own cores; pretty sure they have the know-how.
Tegra did not fail spectacularly; it kind of slid out of our view. They pivoted from consumer stuff to automotive and industrial, most likely due to profit margins.
Indeed. If it's out of view, that doesn't mean it's been abandoned. nVidia is after the data center/cloud business: they are both the hardware maker for it and a provider of cloud services. And having a portable client device locked ("certified for best experience") into their GeForce Now infrastructure seems logical. It doesn't have to be a powerful desktop, either, since many of the tasks run on it don't require a huge powerhouse. An office/entertainment PC can basically run on ARM; only heavy workloads and gaming require more.
 