
Centaur Releases In-Depth Analysis from The Linley Group for its NCORE-Equipped x86 Processor

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
41,916 (8.17/day)
Location
Hyderabad, India
Processor AMD Ryzen 7 2700X
Motherboard ASUS ROG Strix B450-E Gaming
Cooling AMD Wraith Prism
Memory 2x 16GB Corsair Vengeance LPX DDR4-3000
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) Creative Sound Blaster Recon3D PCIe
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Microsoft Sidewinder X4
Software Windows 10 Pro
Centaur Technology today revealed in-depth information about its new processor-design technology for integrating high-performance x86 CPUs with a specialized co-processor optimized for artificial intelligence (AI) acceleration. On its website, Centaur provides a new independent report from The Linley Group, the industry's leading authority on microprocessor technology and publishers of Microprocessor Report. The Linley Group reviewed Centaur's detailed design documents and interviewed Centaur's CPU and AI architects to support the analysis of both Centaur's newest x86 microarchitecture and the AI co-processor design.

"Centaur is galloping back into the x86 market with an innovative processor design that combines eight high-performance CPUs with a custom deep-learning accelerator (DLA). The company is the first to announce a server-processor design that integrates a DLA. The new accelerator, called Ncore, delivers better neural-network performance than even the most powerful Xeon, but without the high cost of an external GPU card," stated Linley Gwennap, Editor-in-Chief, Microprocessor Report.



The report can be accessed here (PDF).

The Linley Group referenced certified MLPerf benchmark (Preview) scores to compare Centaur's AI performance to high-end x86 CPU cores from the leading x86 vendor. Based on MLPerf scores, Centaur's AI co-processor inference performance is comparable to 23 of Intel's world-class x86 cores that now support 512-bit vector neural network instructions (VNNI). Centaur's AI co-processor uses a single-instruction-multiple-data (SIMD) approach architecturally similar to VNNI, but crunches 32,768 bits in a single clock cycle using a 16 MB memory with 20 terabytes/sec of bandwidth. Moreover, because inference processing is offloaded to a specialized co-processor, the x86 CPU cores remain available for other, more general-purpose tasks. Application developers can innovate new algorithms that take advantage of the unparalleled inference latency enabled by Centaur's AI performance and tight integration with x86 CPUs.
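As a rough sanity check on those figures, the quoted width and bandwidth can be related with a little arithmetic. This is only a back-of-the-envelope sketch: the 2.5 GHz clock is taken from a reply further down this thread rather than from the press release, and the idea that two full-width rows are read from the local memory each cycle is purely an assumption made to see whether the numbers line up.

```python
# Back-of-the-envelope check of the Ncore figures quoted in the press release.
BITS_PER_CYCLE = 32_768    # SIMD width, from the press release
VNNI_VECTOR_BITS = 512     # VNNI vector width, from the press release
CLOCK_HZ = 2.5e9           # ASSUMPTION: clock speed mentioned by a poster in this thread
READS_PER_CYCLE = 2        # ASSUMPTION: two full-width rows read per cycle


def bytes_per_cycle(bits: int) -> int:
    """Convert a per-cycle width in bits to bytes."""
    return bits // 8


def bandwidth_tb_per_s(bits: int, clock_hz: float, reads_per_cycle: int) -> float:
    """Memory bandwidth in TB/s (1 TB = 1e12 bytes) for the given width and clock."""
    return bytes_per_cycle(bits) * clock_hz * reads_per_cycle / 1e12


# How many 512-bit VNNI vectors fit into one Ncore operation.
width_ratio = BITS_PER_CYCLE // VNNI_VECTOR_BITS

if __name__ == "__main__":
    print(f"Ncore width: {bytes_per_cycle(BITS_PER_CYCLE)} bytes per cycle")
    print(f"Width relative to one 512-bit vector: {width_ratio}x")
    estimate = bandwidth_tb_per_s(BITS_PER_CYCLE, CLOCK_HZ, READS_PER_CYCLE)
    print(f"Estimated bandwidth: {estimate:.2f} TB/s")
```

Under those assumptions the estimate comes out close to the quoted 20 terabytes/sec, which suggests the bandwidth figure describes the on-chip 16 MB memory feeding the SIMD unit at full width, not external DRAM.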

Attendees at the ISC East trade show in NYC saw Centaur's new technology up close for the first time. The demo showcased video analytics using Centaur's reference system with x86-based network-video-recording (NVR) software from Qvis Labs. In addition to conventional, real-time object detection/classification, Centaur was the only vendor at the show to highlight leading-edge applications such as semantic segmentation (pixel-level image classification) and a new technique for human pose estimation ("stick figures"). Centaur is focused on improving the hardware price/performance and software productivity for platforms to support this next wave of research applications and speed deployment into new server-class products.

 
Joined
Jan 8, 2017
Messages
6,962 (3.98/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Deepcool Gammaxx L240 V2
Memory 16GB - Corsair Vengeance LPX - 3333 Mhz CL16
Video Card(s) OEM Dell GTX 1080 with Kraken G12 + Water 3.0 Performer C
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Deepcool Matrexx 70
Power Supply GPS-750C
better neural-network performance than even the most powerful Xeon, but without the high cost of an external GPU card

And also without the wide array of general-purpose computation a GPU brings. It's not that straightforward; I wish all of these companies that make dedicated AI accelerators would stop making these dramatic comparisons.
 

Cheeseball

Not a Potato
Supporter
Joined
Jan 2, 2009
Messages
1,501 (0.32/day)
Location
Pittsburgh, PA
System Name Custom AMD Rig
Processor AMD Ryzen™ 7 3800X
Motherboard ASUS TUF GAMING X570-PLUS (WI-FI)
Cooling EVGA CLC 280mm AIO Liquid Cooler
Memory G.SKILL TridentZ 32GB (8GBx4) F4-3200C16-8GTZR
Video Card(s) EVGA GeForce RTX 3080 FTW3 ULTRA GAMING 10GB
Storage 250GB Samsung 970 EVO NVMe, 2TB Inland Premium NVMe, 1TB Crucial MX500 SATA, 4TB WD Blue SATA
Display(s) Acer Nitro XV340CK Pbmiipphzx 34" UWQHD 1440p, LG 27GN850-B UltraGear 27" 1440p 144 Hz
Case NZXT H510i Matte White
Audio Device(s) Kanto Audio YU2 and SUB8 Desktop Speakers and Subwoofer, Blue Yeti
Power Supply Corsair RMx Series RM750x 750W
Mouse Kingston HyperX Pulsefire Haste
Keyboard Kingston HyperX Alloy Origins Core
Software Windows 10 Pro 64-bit 20H2
And also without the wide array of general-purpose computation a GPU brings. It's not that straightforward; I wish all of these companies that make dedicated AI accelerators would stop making these dramatic comparisons.

The point of the Ncore is its cost-effectiveness. There is no need to use separate P100s or MI50s when this has an already-capable DLA which properly supports AVX-512 (and apparently VNNI). You would then need to couple those GPUs with a Xeon or Epyc, which would push the costs higher.
 

btarunr

Editor & Senior Moderator
Staff member
If CNS core single-thread performance ends up somewhere between Haswell and Skylake (Zen+ level), then it would be a tragedy for Centaur not to attempt a client-desktop product. Just take this 8-core CPU block, lose the DLA component, lose the inter-socket interconnect, slim the memory controller to 2-channel, slim the PCIe to 24 lanes, maybe strike a deal with GloFo for 12LP manufacturing, and get the thing out by Computex 2020. It should prove an interesting Core i3-Core i5 alternative.
 
Joined
Jul 15, 2006
Messages
878 (0.16/day)
Location
Malaysia
Processor AMD Ryzen 5 3600
Motherboard Gigabyte B450M-S2H
Cooling Arctic Freezer 64 Pro
Memory 2x8GB Samsung OEM B-die @ 3600 17-19-18-36
Video Card(s) Galax GTX 1070 Katana
Storage 250GB WD BLACK SN750 + 2TB Seagate Surveillance 5900rpm
Display(s) 27" ACER AOpen 27HC5RP curve 165Hz monitor
Case Mars Gaming MCX Midtower
Audio Device(s) Creative X-Fi Titanium HD + Kurtzweil KS-40A bookshelf
Power Supply Corsair CX750M
Mouse Logitech G402 Hyperion Fury
Keyboard CM Storm QuickFire Pro Cherry MX Black
Software Windows 10 Pro x64
If CNS core single-thread performance ends up somewhere between Haswell and Skylake (Zen+ level), then it would be a tragedy for Centaur not to attempt a client-desktop product. Just take this 8-core CPU block, lose the DLA component, lose the inter-socket interconnect, slim the memory controller to 2-channel, slim the PCIe to 24 lanes, maybe strike a deal with GloFo for 12LP manufacturing, and get the thing out by Computex 2020. It should prove an interesting Core i3-Core i5 alternative.
You took the words right out of my mouth! If Intel is going to enter the GPU market as a third alternative, I would like to see a third contender in the x86 CPU market.
 
Joined
Nov 20, 2012
Messages
421 (0.13/day)
Location
Hungary
System Name masina
Processor AMD Ryzen 5 3600
Motherboard ASUS TUF B550M
Cooling Scythe Kabuto 3 + Arctic BioniX P120 fan
Memory 16GB (2x8) DDR4-3200 CL16 Crucial Ballistix
Video Card(s) Radeon Pro WX 2100 2GB
Storage 500GB Crucial MX500, 640GB WD Black
Display(s) AOC C24G1
Case SilentiumPC AT6V
Power Supply Seasonic Focus GX 650W
Mouse Logitech G203
Keyboard Cooler Master MasterKeys L PBT
Software Win 10 Pro
If CNS core single-thread performance ends up somewhere between Haswell and Skylake (Zen+ level), then it would be a tragedy for Centaur not to attempt a client-desktop product. Just take this 8-core CPU block, lose the DLA component, lose the inter-socket interconnect, slim the memory controller to 2-channel, slim the PCIe to 24 lanes, maybe strike a deal with GloFo for 12LP manufacturing, and get the thing out by Computex 2020. It should prove an interesting Core i3-Core i5 alternative.

Then we would only need S3 Graphics, PowerVR Kyro dGPUs and Transmeta to rematerialize out of thin air to complete the early 2000s infinity gauntlet of IT. :D
 
Joined
Jan 31, 2011
Messages
192 (0.05/day)
Processor 3700X
Motherboard X570 TUF Plus
Cooling U12
Memory 32GB 3600MHz
Video Card(s) eVGA GTX970
Storage 512GB 970 Pro
Case CM 500L vertical
If CNS core single-thread performance ends up somewhere between Haswell and Skylake (Zen+ level), then it would be a tragedy for Centaur not to attempt a client-desktop product. Just take this 8-core CPU block, lose the DLA component, lose the inter-socket interconnect, slim the memory controller to 2-channel, slim the PCIe to 24 lanes, maybe strike a deal with GloFo for 12LP manufacturing, and get the thing out by Computex 2020. It should prove an interesting Core i3-Core i5 alternative.

It's on TSMC 16nm, and it's an 8 core CPU that runs at 2.5GHz with no indication of turbo. Lots of talk about power efficiency without any metrics, comparisons, or numbers, so I am not expecting much there, either.

In the end, I expect this to be a very poor client-desktop product. Its niche is the integrated wide SIMD core.
 

btarunr

Editor & Senior Moderator
Staff member
In the end, I expect this to be a very poor client-desktop product. Its niche is the integrated wide SIMD core.

I'm not so sure. So far their prototype was shown handling a very specific application (image recognition across multiple video streams), which probably runs fine with this CPU configuration.

As you said there was no comment made from them on power or clock-speed headroom. With the right 10 nm class (12/14/16 FF) node, they might be able to come up with a client-segment product. If they've achieved single-thread parity with Zen+, then all they need is to sustain 3.80-4.00 GHz to torment current Core i5 chips. The only thing stopping this chip from hurting Pentium/Celeron/Core i3 is the lack of an iGPU. I doubt if VIA can pull off a contemporary iGPU today. So their embedded motherboards will have to bundle something like a GeForce MX150.
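To put that clock target in perspective, here is a quick sketch of the uplift it implies. The 2.5 GHz starting point is the figure given in an earlier reply in this thread, not an official specification, so treat the result as illustrative only.

```python
# Clock uplift needed to reach the 3.80-4.00 GHz range discussed above.
BASE_CLOCK_GHZ = 2.5               # ASSUMPTION: clock quoted earlier in this thread
TARGET_CLOCKS_GHZ = (3.80, 4.00)   # range mentioned in the post above


def uplift_percent(base_ghz: float, target_ghz: float) -> float:
    """Percentage increase needed to go from base_ghz to target_ghz."""
    return (target_ghz / base_ghz - 1.0) * 100.0


if __name__ == "__main__":
    for target in TARGET_CLOCKS_GHZ:
        uplift = uplift_percent(BASE_CLOCK_GHZ, target)
        print(f"{BASE_CLOCK_GHZ} GHz -> {target} GHz: +{uplift:.0f}%")
```

That works out to roughly 52% to 60% more frequency than the quoted 2.5 GHz, which is a large jump to expect from a node change alone.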
 
Joined
Mar 1, 2008
Messages
261 (0.05/day)
Location
Antwerp, Belgium
I doubt if VIA can pull off a contemporary iGPU today.

Considering their last Chrome core was very small on a 65 nm process and had 32 cores, I would say that just scaling that to 256 cores and adding DX11 support would make them competitive with UHD 630.
 
Joined
May 31, 2016
Messages
2,973 (1.51/day)
Location
Currently Norway
System Name Bro2
Processor Ryzen 5800X
Motherboard Gigabyte X570 Aorus Elite
Cooling Corsair h115i pro rgb
Memory 16GB G.Skill Flare X 3200 CL14
Video Card(s) Powercolor 6900 XT Red Devil 1.1v@2400Mhz
Storage M.2 Samsung 970 Evo Plus 500MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD / LG 27GN950
Case Fractal Design G
Audio Device(s) Realtec 5.1
Power Supply Corsair AXi 760W / Seasonic 750W GOLD
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
I'm not so sure. So far their prototype was shown handling a very specific application (image recognition across multiple video streams), which probably runs fine with this CPU configuration.

As you said there was no comment made from them on power or clock-speed headroom. With the right 10 nm class (12/14/16 FF) node, they might be able to come up with a client-segment product. If they've achieved single-thread parity with Zen+, then all they need is to sustain 3.80-4.00 GHz to torment current Core i5 chips. The only thing stopping this chip from hurting Pentium/Celeron/Core i3 is the lack of an iGPU. I doubt if VIA can pull off a contemporary iGPU today. So their embedded motherboards will have to bundle something like a GeForce MX150.
I think you are talking about laptops here? If so, the iGPU is not the main concern; power consumption and heat are. They can always use a GeForce MX150, but lowering power consumption and heat would require them to lower clocks and therefore performance, which I doubt is that impressive compared with an i3, an i5, or first-gen Ryzen.
 
Joined
Nov 4, 2005
Messages
10,839 (1.86/day)
System Name MoFo 2
Processor AMD PhenomII 1100T @ 4.2Ghz
Motherboard Asus Crosshair IV
Cooling Swiftec 655 pump, Apogee GT,, MCR360mm Rad, 1/2 loop.
Memory 8GB DDR3-2133 @ 1900 8.9.9.24 1T
Video Card(s) HD7970 1250/1750
Storage Agility 3 SSD 6TB RAID 0 on RAID Card
Display(s) 46" 1080P Toshiba LCD
Case Rosewill R6A34-BK modded (thanks to MKmods)
Audio Device(s) ATI HDMI
Power Supply 750W PC Power & Cooling modded (thanks to MKmods)
Software A lot.
Benchmark Scores Its fast. Enough.
The point of the Ncore is its cost-effectiveness. There is no need to use separate P100s or MI50s when this has an already-capable DLA which properly supports AVX-512 (and apparently VNNI). You would then need to couple those GPUs with a Xeon or Epyc, which would push the costs higher.


I suppose they are going to give them away for the good of humanity, and the companies buying them will make do on their own if they need support.
 
Joined
Sep 24, 2019
Messages
64 (0.08/day)
If CNS core single-thread performance ends up somewhere between Haswell and Skylake (Zen+ level), then it would be a tragedy for Centaur not to attempt a client-desktop product. Just take this 8-core CPU block, lose the DLA component, lose the inter-socket interconnect, slim the memory controller to 2-channel, slim the PCIe to 24 lanes, maybe strike a deal with GloFo for 12LP manufacturing, and get the thing out by Computex 2020. It should prove an interesting Core i3-Core i5 alternative.

I'm skeptical of the single-thread performance this uarch can offer. The various buffers are too small (or nonexistent, in the case of a uop cache) relative to the dispatch rate. I think Haswell-level performance should be the top end of performance estimates rather than a starting point. Coupled with a lack of SMT, there's no way the individual cores will be running at capacity in real-world workloads.
 

Cheeseball

Not a Potato
Supporter
I suppose they are going to give them away for the good of humanity, and the companies buying them will make do on their own if they need support.

Not sure what you're getting at.

For the market this is currently aimed at, the only support needed would be basic delivery, initial implementation (API and documentation) and aftersales repair/replacement for any defects.
 
Joined
Nov 4, 2005
Messages
10,839 (1.86/day)
Not sure what you're getting at.

For the market this is currently aimed at, the only support needed would be basic delivery, initial implementation (API and documentation) and aftersales repair/replacement for any defects.


My point is that there is so much more to the ecosystem than just dropping in the latest and greatest CPU. It takes much more to make an efficient and actually "cost-effective" system than the simple PR spin here.
 

Cheeseball

Not a Potato
Supporter
My point is that there is so much more to the ecosystem than just dropping in the latest and greatest CPU. It takes much more to make an efficient and actually "cost-effective" system than the simple PR spin here.

Hmm... that really does depend on the use case, though. This sounds like it would be more cost-effective when establishing (or adding) a new cluster than just adding on more accelerators.

It's The Linley Group adding the PR, not VIA/Centaur themselves. They speculate that if this is priced the same as the Xeon Silver, buyers would be getting the accelerator for free, in a sense.
 