Microsoft Copilot to Run Locally on AI PCs with at Least 40 TOPS of NPU Performance

btarunr · Mar 28, 2024

Microsoft, Intel, and AMD are attempting to jumpstart demand in the PC industry again, under the aegis of the AI PC: devices with native acceleration for AI workloads. Both Intel and AMD have mobile processors with on-silicon NPUs (neural processing units), which are designed to accelerate the first wave of AI-enhanced client experiences on Windows 11 23H2. Microsoft's vehicle for democratizing AI has been Copilot, built on its license to OpenAI's GPT-4, GPT-4 Turbo, DALL-E, and other generative AI tools from the OpenAI stable. Copilot is currently the application Microsoft is investing in most heavily, with its capital and best minds mobilized to make it the most popular AI assistant. Microsoft even pushed the AI PC designation to PC OEMs, which requires their devices to have a dedicated Copilot key akin to the Start key (we'll see how anti-competition regulators deal with that).

The problem with Microsoft's tango with Intel and AMD to push AI PCs is that Copilot doesn't really use an NPU, not even at the edge: you input a query or a prompt, and Copilot hands it over to a cloud-based AI service. This is about to change, with Microsoft announcing that Copilot will be able to run locally on AI PCs. Microsoft identified several kinds of Copilot use-cases that an NPU can handle on-device, which should speed up response times to Copilot queries, but this requires the NPU to have at least 40 TOPS of performance. This is a problem for the current crop of processors with NPUs. Intel's Core Ultra "Meteor Lake" has an AI Boost NPU with 10 TOPS on tap, while the Ryzen 8040 "Hawk Point" is only slightly faster, with a 16 TOPS Ryzen AI NPU. AMD has already revealed that the XDNA 2-based 2nd Generation Ryzen AI NPU in its upcoming "Strix Point" processors will come with over 40 TOPS of performance, and it stands to reason that the NPUs in Intel's "Arrow Lake" or "Lunar Lake" processors will be comparable in performance, which should enable on-device Copilot.
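
To put the gap in numbers, here is a minimal sketch that compares the two TOPS figures quoted above against the 40 TOPS floor. The function names and the lookup table are hypothetical and exist only for illustration; "Strix Point" is left out because the article only says "over 40 TOPS" without an exact figure.

```python
# Minimum NPU throughput the article cites for running Copilot locally.
COPILOT_LOCAL_MIN_TOPS = 40

# Hypothetical lookup table; the TOPS values are the ones quoted in the article.
NPU_TOPS = {
    'Intel Core Ultra "Meteor Lake"': 10,
    'AMD Ryzen 8040 "Hawk Point"': 16,
}


def meets_local_copilot_floor(tops, floor=COPILOT_LOCAL_MIN_TOPS):
    """Return True if an NPU with the given TOPS rating reaches the floor."""
    return tops >= floor


def shortfall_factor(tops, floor=COPILOT_LOCAL_MIN_TOPS):
    """Return how many times faster the NPU would need to be to hit the floor."""
    return floor / tops


if __name__ == "__main__":
    for name, tops in NPU_TOPS.items():
        verdict = "meets" if meets_local_copilot_floor(tops) else "is below"
        print(f"{name}: {tops} TOPS {verdict} the "
              f"{COPILOT_LOCAL_MIN_TOPS} TOPS floor "
              f"(needs {shortfall_factor(tops):.1f}x)")
```

By this simple ratio, the Meteor Lake NPU would need to be 4x faster and the Hawk Point NPU 2.5x faster to reach the stated requirement.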

dir_d · Mar 28, 2024

Why the hell would anyone buy a separate AI Microsoft license on a laptop to run their own AI when you can use so many online? As far as I know, only the new laptop SoCs from Intel and AMD carry NPUs.

MrDweezil · Mar 28, 2024

dir_d said:
Why the hell would anyone buy a separate AI Microsoft license on a laptop to run their own AI when you can use so many online? As far as I know, only the new laptop SoCs from Intel and AMD carry NPUs.

Response time? Not sending your data away?

watzupken · Mar 28, 2024

MrDweezil said:
Response time? Not sending your data away?

I doubt you are not sending data away. The app may be local, but it's almost guaranteed that it will send data back to MS or OpenAI for more learning.
Having spent so much money on AI hardware, they now need to try and monetize it. So I won't be surprised when big firms start to charge for use; others will eventually follow along with some sort of subscription model.

ThrashZone · Mar 28, 2024

Hi,
Just another name for telemetry and narrative training hehe
Glad I'm done with upgrading !

enb141 · Mar 28, 2024

Now the question is if consumer-grade NPU add-on cards will enter the market.

Broken Processor · Mar 28, 2024

This is just more bloat I have to remove from Windows. I only use Windows for gaming; everything else is done on Linux. I can't wait for the day I can ditch this turd.

Bwaze · Mar 28, 2024

enb141 said:
Now the question is if consumer-grade NPU add-on cards will enter the market.

You can already accelerate AI learning at home with (strong) GPUs. For Stable Diffusion they recommend RTX 4080 or 4090.

Lewzke · Mar 28, 2024

But the local network is much smaller than the one used for online inference; this is only useful when you have no internet connection, but that is just surreal. I smell marketing here ...
When the PC can run the full-size GPT inference then the tables can turn ... but this way it's just some good-sounding stickers on the laptop.

ThrashZone · Mar 28, 2024

Bwaze said:
You can already accelerate AI learning at home with (strong) GPUs. For Stable Diffusion they recommend RTX 4080 or 4090.

Hi,
Yeah who would of thunk AI would bring security holes hehe

NVIDIA Issues Patches for ChatRTX AI Chatbot, Suspect to Improper Privilege Management

Just a month after releasing the 0.1 beta preview of Chat with RTX, now called ChatRTX, NVIDIA has swiftly addressed critical security vulnerabilities discovered in its cutting-edge AI chatbot. The chatbot was found to be susceptible to cross-site scripting attacks (CWE-79) and improper...

www.techpowerup.com

Vayra86 · Mar 28, 2024

'Run locally' is a desperate attempt it seems to avoid the Bitcoin-pitfall of 'look at the energy cost of a transaction'. Now you've distributed that energy cost to the user and hidden it between all other usage and all is well.

ThrashZone · Mar 28, 2024

Hi,
To myself anyway the term run locally seems more like a massive local AI search database installed hehe

dir_d · Mar 28, 2024

After all the replies so far I still don't see why anyone would buy a separate AI license for Windows to run this on a laptop. I could see a workstation computer but for a laptop I'm just stumped.

ThrashZone · Mar 28, 2024

dir_d said:
After all the replies so far I still don't see why anyone would buy a separate AI license for Windows to run this on a laptop. I could see a workstation computer but for a laptop I'm just stumped.

Hi,
With most things nowadays buying is not an option, only yearly subscriptions allowed.

dir_d · Mar 28, 2024

ThrashZone said:
Hi,
With most things nowadays buying is not an option, only yearly subscriptions allowed.

Fair enough but who is this for?

ThrashZone · Mar 28, 2024

dir_d said:
Fair enough but who is this for?

Hi,
For people naive enough to think AI isn't web search results hehe

Darmok N Jalad · Mar 28, 2024

dir_d said:
Why the hell would anyone buy a separate AI Microsoft license on a laptop to run their own AI when you can use so many online? As far as I know, only the new laptop SoCs from Intel and AMD carry NPUs.

Apple claims that all their AI stuff is handled on-system through their Neural Engine. The rub, of course, is that Siri is way behind other AIs (which lends some credence to their claim, as more telemetry would help make Siri better), but for things like image processing, the NE does reasonably well. No doubt MS is wanting to give the user local AI as at least a competitive bullet point. It could also be preparing for future mandates when governments get involved in AI and data privacy legislation, which we all know will happen.

watzupken said:
I doubt you are not sending data away. The app may be local, but it's almost guaranteed that it will send data back to MS or OpenAI for more learning.
Having spent so much money on AI hardware, they now need to try and monetize it. So I won't be surprised when big firms start to charge for use; others will eventually follow along with some sort of subscription model.

Yes, with as much telemetry as MS collects on all its products, I suspect that the processing will be done locally, but telemetry will be sent back "to help improve the service." Someday they may have to lock it down due to legislation, but not before they get tons of data back first.

enb141 · Mar 28, 2024

Bwaze said:
You can already accelerate AI learning at home with (strong) GPUs. For Stable Diffusion they recommend RTX 4080 or 4090.

Microsoft said that they won't use the GPU for these tasks. In this case an NPU is needed because they want to free the GPU for other tasks and also because GPUs consume lots of power in comparison to NPUs.

Noyand · Mar 28, 2024

Vayra86 said:
'Run locally' is a desperate attempt it seems to avoid the Bitcoin-pitfall of 'look at the energy cost of a transaction'. Now you've distributed that energy cost to the user and hidden it between all other usage and all is well.

It will depend on how user usage evolves TBH. If Copilot is mainly being used as a web search engine, most of the processing will be cloud based. My guess is that the local processing will involve stuff like text correction, text recognition on a picture, voice recognition, maybe image generation and so on. The kind of thing that even an iPhone 11 processes locally. Virtual assistants like Siri and Google Assistant also became partially "running on device"; the PC is just following the same trend. A lot of the stuff that's happening now is stuff that phones and tablets already achieved years ago.

In that context, I find the comparison to be misplaced. The NPU is not meant to be a big ML/AI workhorse eating watts 24/7 like there's no tomorrow, but low-power silicon used for stuff that really doesn't require that much processing power (or to boost the GPU in some cases). A GPU is faster, but also uses more power.

ThrashZone · Mar 28, 2024

Hi,
Hell I thought most were against mining saying it's a waste of power blah...
AI is mining too and using your resources to do it and wants you to pay a subscription for the pleasure

AI mines user activities and web data so where is the outrage hehe

Noyand · Mar 28, 2024

ThrashZone said:
Hi,
Hell I thought most were against mining saying it's a waste of power blah...
AI is mining too and using your resources to do it and wants you to pay a subscription for the pleasure

AI mines user activities and web data so where is the outrage hehe

There's a distinction between the AI run by end-users and the AI running in the data center, and it also depends on what the A.I. is trying to achieve. When apps talk about A.I., they talk about a simplified model that's been figured out by the massive data center. Our computers don't really do much training, besides simple stuff like speech patterns for autocorrect.

The quality of the data that the A.I. is trained on is too valuable; from what I read, controlled and curated content is favoured. A.I. training on broad user data has been done before, and it didn't end well (Microsoft trained an A.I. with Twitter users' data, and it became a real piece of shit).

They have better ways to exploit us: when you solve a captcha, you contribute to training an A.I. It's digital labour, and that's the currency being used for the "free" stuff (which isn't really free).

ML/A.I. doesn't gather as many complaints as mining because in some applications the end-user does benefit from it. Stuff like real-time translation, subject detection, better autofocus in cameras etc...

Makaveli · Mar 28, 2024

enb141 said:
Microsoft said that they won't use the GPU for these tasks. In this case an NPU is needed because they want to free the GPU for other tasks and also because GPUs consume lots of power in comparison to NPUs.

And for a laptop that makes sense. However, a desktop with a powerful GPU and power won't have those issues, so let's hope they do indeed release it for GPUs also at some point.

Darmok N Jalad · Mar 28, 2024

Noyand said:
A.I. training on broad user data has been done before, and it didn't end well (Microsoft trained an A.I. with Twitter users' data, and it became a real piece of shit).

You leave Tay out of this!

ThrashZone · Mar 28, 2024

Noyand said:
There's a distinction between the AI run by end-users and the AI running in the data center, and it also depends on what the A.I. is trying to achieve. When apps talk about A.I., they talk about a simplified model that's been figured out by the massive data center. Our computers don't really do much training, besides simple stuff like speech patterns for autocorrect.

The quality of the data that the A.I. is trained on is too valuable; from what I read, controlled and curated content is favoured. A.I. training on broad user data has been done before, and it didn't end well (Microsoft trained an A.I. with Twitter users' data, and it became a real piece of shit).

They have better ways to exploit us: when you solve a captcha, you contribute to training an A.I. It's digital labour, and that's the currency being used for the "free" stuff (which isn't really free).

ML/A.I. doesn't gather as many complaints as mining because in some applications the end-user does benefit from it. Stuff like real-time translation, subject detection, better autofocus in cameras etc...

Hi,
People that mine actually make money; although not much daily, it does add up enough to pay for equipment.. eventually
As far as captcha goes there's an app to get around that nonsense, did AI make it lol

I was just pointing out the irony that AI does mine and frankly lots of people will reject it, just as you point out with the manipulated shit show already attempted

Carillon · Mar 29, 2024

Usually new tech comes out as hardware first, and a few decades later we start seeing the first programs to use it. Why is this new tech out as software first, with no hardware available?
