
Centaur Releases In-Depth Analysis from The Linley Group for its NCORE-Equipped x86 Processor

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
46,362 (7.68/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
Centaur Technology today revealed in-depth information about its new processor-design technology for integrating high-performance x86 CPUs with a specialized co-processor optimized for artificial intelligence (AI) acceleration. On its website, Centaur provides a new independent report from The Linley Group, the industry's leading authority on microprocessor technology and publishers of Microprocessor Report. The Linley Group reviewed Centaur's detailed design documents and interviewed Centaur's CPU and AI architects to support the analysis of both Centaur's newest x86 microarchitecture and the AI co-processor design.

"Centaur is galloping back into the x86 market with an innovative processor design that combines eight high-performance CPUs with a custom deep-learning accelerator (DLA). The company is the first to announce a server-processor design that integrates a DLA. The new accelerator, called Ncore, delivers better neural-network performance than even the most powerful Xeon, but without the high cost of an external GPU card," stated Linley Gwennap, Editor-in-Chief, Microprocessor Report.



The report can be accessed here (PDF).

The Linley Group referenced certified MLPerf benchmark (Preview) scores to compare Centaur's AI performance to high-end x86 CPU cores from the leading x86 vendor. Based on MLPerf scores, Centaur's AI co-processor inference performance is comparable to 23 of Intel's world-class x86 cores that now support 512-bit vector neural network instructions (VNNI). Centaur's AI co-processor uses a single-instruction-multiple-data (SIMD) approach that is architecturally similar to VNNI, but crunches 32,768 bits in a single clock cycle using a 16 MB memory with 20 terabytes/sec of bandwidth. Moreover, by offloading inference processing to a specialized co-processor, the x86 CPU cores remain available for other, more general-purpose tasks. Application developers can innovate new algorithms that take advantage of the unparalleled inference latency enabled by Centaur's AI performance and tight integration with x86 CPUs.
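As a rough sanity check on the figures in the paragraph above, the short Python sketch below works out how wide 32,768 bits is next to a 512-bit VNNI vector, and what memory bandwidth that width implies. The 32,768-bit and 512-bit values come from the announcement. The 2.5 GHz clock is an assumption taken from a commenter later in this thread, and the two full-width memory reads per cycle (for example, one for data and one for weights) is purely an assumption made here to see whether the quoted 20 terabytes/sec is plausible.

```python
# Back-of-the-envelope check of the Ncore figures quoted in the announcement.
NCORE_BITS_PER_CYCLE = 32_768   # from the announcement
VNNI_VECTOR_BITS = 512          # AVX-512 VNNI vector width, from the announcement
ASSUMED_CLOCK_HZ = 2.5e9        # assumption: 2.5 GHz, as reported by a commenter
ASSUMED_READS_PER_CYCLE = 2     # assumption: two full-width reads per cycle


def bytes_per_cycle(bits: int) -> int:
    """Convert a per-cycle width in bits to bytes."""
    return bits // 8


def bandwidth_tb_per_s(bits: int, clock_hz: float, reads_per_cycle: int) -> float:
    """Bandwidth in terabytes/sec for a given width, clock and reads per cycle."""
    return bytes_per_cycle(bits) * clock_hz * reads_per_cycle / 1e12


width_ratio = NCORE_BITS_PER_CYCLE // VNNI_VECTOR_BITS
ncore_bytes = bytes_per_cycle(NCORE_BITS_PER_CYCLE)
estimate = bandwidth_tb_per_s(
    NCORE_BITS_PER_CYCLE, ASSUMED_CLOCK_HZ, ASSUMED_READS_PER_CYCLE
)

print(f"Ncore width vs. one 512-bit vector: {width_ratio}x")
print(f"Bytes processed per cycle: {ncore_bytes}")
print(f"Estimated memory bandwidth: {estimate:.2f} TB/s")
```

Under those assumptions the estimate lands close to the quoted 20 terabytes/sec, but a different clock or a different number of reads per cycle would shift it proportionally.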

Attendees at the ISC East trade show in NYC saw Centaur's new technology up close for the first time. The demo showcased video analytics using Centaur's reference system with x86-based network-video-recording (NVR) software from Qvis Labs. In addition to conventional, real-time object detection/classification, Centaur was the only vendor at the show to highlight leading-edge applications such as semantic segmentation (pixel-level image classification) and a new technique for human pose estimation ("stick figures"). Centaur is focused on improving the hardware price/performance and software productivity for platforms to support this next wave of research applications and speed deployment into new server-class products.

 
Joined
Jan 8, 2017
Messages
8,929 (3.36/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
better neural-network performance than even the most powerful Xeon, but without the high cost of an external GPU card

And also without the wide array of general-purpose computation a GPU brings. It's not that straightforward; I wish all of these companies that make dedicated AI accelerators would stop making these dramatic comparisons.
 

Cheeseball

Not a Potato
Supporter
Joined
Jan 2, 2009
Messages
1,844 (0.33/day)
Location
Pittsburgh, PA
System Name Titan
Processor AMD Ryzen™ 7 7950X3D
Motherboard ASUS ROG Strix X670E-I Gaming WiFi
Cooling ID-COOLING SE-207-XT Slim Snow
Memory TEAMGROUP T-Force Delta RGB 2x16GB DDR5-6000 CL30
Video Card(s) ASRock Radeon RX 7900 XTX 24 GB GDDR6 (MBA)
Storage 2TB Samsung 990 Pro NVMe
Display(s) AOpen Fire Legend 24" (25XV2Q), Dough Spectrum One 27" (Glossy), LG C4 42" (OLED42C4PUA)
Case ASUS Prime AP201 33L White
Audio Device(s) Kanto Audio YU2 and SUB8 Desktop Speakers and Subwoofer, Cloud Alpha Wireless
Power Supply Corsair SF1000L
Mouse Logitech Pro Superlight (White), G303 Shroud Edition
Keyboard Wooting 60HE / NuPhy Air75 v2
VR HMD Occulus Quest 2 128GB
Software Windows 11 Pro 64-bit 23H2 Build 22631.3447
And also without the wide array of general-purpose computation a GPU brings. It's not that straightforward; I wish all of these companies that make dedicated AI accelerators would stop making these dramatic comparisons.

The point of the Ncore is its cost-effectiveness. There is no need to use separate P100s or MI50s when this has an already-capable DLA which properly supports AVX-512 (and apparently VNNI). You would then need to couple those GPUs with a Xeon or Epyc, which would push the costs higher.
 

btarunr

If CNS core single-thread performance ends up somewhere between Haswell and Skylake (Zen+ level), then it would be a tragedy for Centaur not to attempt a client-desktop product. Just take this 8-core CPU block, lose the DLA component, lose the inter-socket interconnect, slim the memory controller to 2-channel, slim the PCIe to 24 lanes, maybe strike a deal with GloFo for 12LP manufacturing, and get the thing out by Computex 2020. It should prove an interesting Core i3-Core i5 alternative.
 
Joined
Jul 15, 2006
Messages
977 (0.15/day)
Location
Malaysia
Processor AMD Ryzen 7 5700G
Motherboard Gigabyte B450M-S2H
Cooling Scythe Kotetsu Mark II
Memory 2 x 16GB SK Hynix OEM DDR4-3200 @ 3666 18-20-18-36
Video Card(s) Colorful RTX 2060 SUPER 8GB
Storage 250GB WD BLACK SN750 M.2 + 4TB WD Red Plus + 4TB WD Purple
Display(s) AOpen 27HC5R 27" 1080p 165Hz
Case COUGAR MX440 Mesh RGB
Audio Device(s) Creative X-Fi Titanium HD + Kurtzweil KS-40A bookshelf
Power Supply Corsair CX750M
Mouse Razer Deathadder Essential
Keyboard Cougar Attack2 Cherry MX Black
Software Windows 10 Pro 22H1 x64
If CNS core single-thread performance ends up somewhere between Haswell and Skylake (Zen+ level), then it would be a tragedy for Centaur not to attempt a client-desktop product. Just take this 8-core CPU block, lose the DLA component, lose the inter-socket interconnect, slim the memory controller to 2-channel, slim the PCIe to 24 lanes, maybe strike a deal with GloFo for 12LP manufacturing, and get the thing out by Computex 2020. It should prove an interesting Core i3-Core i5 alternative.
You took the words right out of my mouth! If Intel is going to enter the GPU market as a third alternative, I would like to see a third contender in the x86 CPU market.
 
Joined
Nov 20, 2012
Messages
422 (0.10/day)
Location
Hungary
System Name masina
Processor AMD Ryzen 5 3600
Motherboard ASUS TUF B550M
Cooling Scythe Kabuto 3 + Arctic BioniX P120 fan
Memory 16GB (2x8) DDR4-3200 CL16 Crucial Ballistix
Video Card(s) Radeon Pro WX 2100 2GB
Storage 500GB Crucial MX500, 640GB WD Black
Display(s) AOC C24G1
Case SilentiumPC AT6V
Power Supply Seasonic Focus GX 650W
Mouse Logitech G203
Keyboard Cooler Master MasterKeys L PBT
Software Win 10 Pro
If CNS core single-thread performance ends up somewhere between Haswell and Skylake (Zen+ level), then it would be a tragedy for Centaur not to attempt a client-desktop product. Just take this 8-core CPU block, lose the DLA component, lose the inter-socket interconnect, slim the memory controller to 2-channel, slim the PCIe to 24 lanes, maybe strike a deal with GloFo for 12LP manufacturing, and get the thing out by Computex 2020. It should prove an interesting Core i3-Core i5 alternative.

Then we would only need S3 Graphics, PowerVR Kyro dGPUs and Transmeta to rematerialize out of thin air to complete the early 2000s infinity gauntlet of IT. :D
 
Joined
Jan 31, 2011
Messages
238 (0.05/day)
Processor 3700X
Motherboard X570 TUF Plus
Cooling U12
Memory 32GB 3600MHz
Video Card(s) eVGA GTX970
Storage 512GB 970 Pro
Case CM 500L vertical
If CNS core single-thread performance ends up somewhere between Haswell and Skylake (Zen+ level), then it would be a tragedy for Centaur not to attempt a client-desktop product. Just take this 8-core CPU block, lose the DLA component, lose the inter-socket interconnect, slim the memory controller to 2-channel, slim the PCIe to 24 lanes, maybe strike a deal with GloFo for 12LP manufacturing, and get the thing out by Computex 2020. It should prove an interesting Core i3-Core i5 alternative.

It's on TSMC 16nm, and it's an 8 core CPU that runs at 2.5GHz with no indication of turbo. Lots of talk about power efficiency without any metrics, comparisons, or numbers, so I am not expecting much there, either.

In the end, I expect this to be a very poor client-desktop product. Its niche is the integrated wide SIMD core.
 

btarunr

In the end, I expect this to be a very poor client-desktop product. Its niche is the integrated wide SIMD core.

I'm not so sure. So far their prototype was shown handling a very specific application (image recognition across multiple video streams), which probably runs fine with this CPU configuration.

As you said, they made no comment on power or clock-speed headroom. With the right 10 nm class (12/14/16 FF) node, they might be able to come up with a client-segment product. If they've achieved single-thread parity with Zen+, then all they need is to sustain 3.80-4.00 GHz to torment current Core i5 chips. The only thing stopping this chip from hurting Pentium/Celeron/Core i3 is the lack of an iGPU. I doubt if VIA can pull off a contemporary iGPU today. So their embedded motherboards will have to bundle something like a GeForce MX150.
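To put the 3.80-4.00 GHz target in perspective, here is a minimal Python sketch of how much clock uplift that would be. The 2.5 GHz base clock is the figure another poster reported in this thread, and the calculation assumes per-clock performance stays constant as frequency rises, which is an idealisation: in practice memory latency does not scale with core clock, so real gains tend to be somewhat smaller.

```python
# Clock uplift needed to reach the 3.80-4.00 GHz range mentioned in the post.
BASE_CLOCK_GHZ = 2.5             # assumption: clock reported by another poster
TARGET_CLOCKS_GHZ = (3.8, 4.0)   # range mentioned in the post


def relative_throughput(target_ghz: float, base_ghz: float = BASE_CLOCK_GHZ) -> float:
    """Idealised speed-up if per-clock performance were unchanged."""
    return target_ghz / base_ghz


for ghz in TARGET_CLOCKS_GHZ:
    uplift = relative_throughput(ghz)
    print(f"{ghz:.2f} GHz is {uplift:.2f}x the {BASE_CLOCK_GHZ} GHz clock")
```

In other words, the design would need roughly 50 to 60 percent more frequency than the reported clock, which is a large jump to expect from a node change alone.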
 
Joined
Mar 1, 2008
Messages
282 (0.05/day)
Location
Antwerp, Belgium
I doubt if VIA can pull off a contemporary iGPU today.

Considering their last Chrome core was very small on a 65nm process and had 32 cores, I would say that just scaling that to 256 cores and DX11 would make them competitive with UHD 630.
 
Joined
May 31, 2016
Messages
4,324 (1.50/day)
Location
Currently Norway
System Name Bro2
Processor Ryzen 5800X
Motherboard Gigabyte X570 Aorus Elite
Cooling Corsair h115i pro rgb
Memory 16GB G.Skill Flare X 3200 CL14 @3800Mhz CL16
Video Card(s) Powercolor 6900 XT Red Devil 1.1v@2400Mhz
Storage M.2 Samsung 970 Evo Plus 500MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD / LG 27GN950
Case Fractal Design G
Audio Device(s) Realtec 5.1
Power Supply Seasonic 750W GOLD
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
I'm not so sure. So far their prototype was shown handling a very specific application (image recognition across multiple video streams), which probably runs fine with this CPU configuration.

As you said, they made no comment on power or clock-speed headroom. With the right 10 nm class (12/14/16 FF) node, they might be able to come up with a client-segment product. If they've achieved single-thread parity with Zen+, then all they need is to sustain 3.80-4.00 GHz to torment current Core i5 chips. The only thing stopping this chip from hurting Pentium/Celeron/Core i3 is the lack of an iGPU. I doubt if VIA can pull off a contemporary iGPU today. So their embedded motherboards will have to bundle something like a GeForce MX150.
I think you are talking about laptops here? If so, the iGPU is not the main concern; power consumption and heat are. They can always use a GeForce MX150, but lowering power consumption and heat would require them to lower clocks, and with them performance, which I doubt is that impressive compared to an i3, an i5, or first-gen Ryzen.
 
Joined
Nov 4, 2005
Messages
11,683 (1.73/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs and over 10TB spinning
Display(s) 56" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
The point of the Ncore is its cost-effectiveness. There is no need to use separate P100s or MI50s when this has an already-capable DLA which properly supports AVX-512 (and apparently VNNI). You would then need to couple those GPUs with a Xeon or Epyc, which would push the costs higher.


I suppose they are going to give them away for the good of humanity, and the companies buying them will make do on their own if they need support.
 
Joined
Sep 24, 2019
Messages
64 (0.04/day)
If CNS core single-thread performance ends up somewhere between Haswell and Skylake (Zen+ level), then it would be a tragedy for Centaur not to attempt a client-desktop product. Just take this 8-core CPU block, lose the DLA component, lose the inter-socket interconnect, slim the memory controller to 2-channel, slim the PCIe to 24 lanes, maybe strike a deal with GloFo for 12LP manufacturing, and get the thing out by Computex 2020. It should prove an interesting Core i3-Core i5 alternative.

I'm skeptical of the single-thread performance this uarch can offer. The various buffers are too small (or nonexistent, in the case of a uop cache) relative to the dispatch rate. I think Haswell-level performance should be the top end of performance estimates, rather than a starting point. Coupled with a lack of SMT, there's no way the individual cores will be running at capacity in real-world workloads.
 

Cheeseball

I suppose they are going to give them away for the good of humanity, and the companies buying them will make do on their own if they need support.

Not sure what you're getting at.

For the market this is currently aimed at, the only support needed would be basic delivery, initial implementation (API and documentation) and aftersales repair/replacement for any defects.
 
Joined
Nov 4, 2005
Messages
11,683 (1.73/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs and over 10TB spinning
Display(s) 56" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
Not sure what you're getting at.

For the market this is currently aimed at, the only support needed would be basic delivery, initial implementation (API and documentation) and aftersales repair/replacement for any defects.


My point is there is so much more to the ecosystem than just dropping in the latest and greatest CPU. It takes much more to make an efficient and actually "cost effective" system than the simple PR spin here.
 

Cheeseball

My point is there is so much more to the ecosystem than just dropping in the latest and greatest CPU. It takes much more to make an efficient and actually "cost effective" system than the simple PR spin here.

Hmm.. that really does depend on the use-case though. This sounds like it would be more cost effective when establishing (or adding) a new cluster than just adding on more accelerators.

It's The Linley Group adding the PR, not VIA/Centaur themselves. They speculate that if this is priced the same as the Xeon Silver, they would be getting the accelerator for free, in a sense.
 