
Intel Achieves First, Only Full NPU Support in MLPerf Client v0.6 Benchmark

btarunr

Editor & Senior Moderator
Intel today announced that it is the only company to achieve full neural processing unit (NPU) support in the newly released MLPerf Client v0.6 benchmark. The result marks the industry's first standardized evaluation of large language model (LLM) performance on client NPUs. Intel's measurements of MLPerf Client v0.6 show Intel Core Ultra Series 2 processors can produce output on both the graphics processing unit (GPU) and the NPU much faster than a typical human can read.

"We are proud to lead the industry in enabling full NPU acceleration and industry-leading GPU performance for AI workloads on client PC platforms. This success reflects Intel's deep hardware-software co-optimization and commitment to democratizing AI for PCs everywhere," said Daniel Rogers, Intel vice president and general manager of PC Product Marketing.



With its Intel Core Ultra Series 2 processors, Intel is at the forefront of the AI PC evolution, offering unprecedented AI compute performance spanning the central processing unit (CPU), GPU and NPU.

MLPerf Client v0.6 measures four content generation and summarization use cases based on the Llama 2 7B model. Intel demonstrated leading performance on both the NPU and the built-in Intel Arc GPU.

Intel achieved the fastest NPU response time, generating the first word in just 1.09 seconds (first token latency), meaning the system begins answering almost immediately after receiving a prompt. It also delivered the highest NPU throughput at 18.55 tokens per second, a measure of how quickly the system generates each additional piece of text, enabling seamless real-time AI interaction. Intel also showed GPU leadership in time to first token, starting faster than competing platforms and reinforcing its end-to-end NPU and GPU AI acceleration advantage.
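The two figures above are the standard latency metrics for streaming LLM inference: time to first token (TTFT) and decode throughput in tokens per second. A minimal sketch of how such metrics can be measured, assuming a hypothetical streaming token generator (the function name and interface are illustrative, not part of MLPerf Client):

```python
import time

def measure_llm_metrics(generate_tokens, prompt):
    """Measure time-to-first-token (TTFT) and decode throughput for a
    streaming generator. `generate_tokens` is any callable that yields
    tokens one at a time -- a hypothetical stand-in for a real runtime."""
    start = time.perf_counter()
    first_token_time = None
    count = 0
    for _ in generate_tokens(prompt):
        now = time.perf_counter()
        if first_token_time is None:
            # Latency until the first token appears (TTFT)
            first_token_time = now - start
        count += 1
    total = time.perf_counter() - start
    # Throughput is conventionally computed over the decode phase,
    # i.e. excluding the first token's latency.
    decode_time = total - first_token_time
    tps = (count - 1) / decode_time if count > 1 and decode_time > 0 else 0.0
    return first_token_time, tps
```

A benchmark run then reduces to calling this once per prompt and aggregating TTFT and tokens-per-second across the workload's use cases.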

About NPU Benchmarking on MLPerf: Developed collaboratively by MLCommons consortium members—including Intel, AMD, Microsoft, NVIDIA and Qualcomm—MLPerf Client v0.6 extends beyond previous GPU-centric tests to now include dedicated NPU benchmarking.

Driven by close collaboration between Intel's NPU hardware and OpenVINO software teams, Intel Core Ultra processors remain the only platform whose NPU achieves complete compliance in the final benchmark.


