
Intel Details Its Next-Gen Xeon Phi Processor

Joined
Dec 6, 2011
Messages
4,785 (2.14/day)
Likes
1,187
Location
Still on the East Side
#1
Intel Corporation today announced new details for its next-generation Intel Xeon Phi processors, code-named Knights Landing, which promise to extend the benefits of code modernization investments being made for current-generation products. These include a new high-speed fabric that will be integrated on-package and high-bandwidth, on-package memory that, combined, promise to accelerate the rate of scientific discovery. Currently, memory and fabrics are available as discrete components in servers, limiting the performance and density of supercomputers.

The new interconnect technology, called Intel Omni Scale Fabric, is designed to address the requirements of the next generations of high-performance computing (HPC). Intel Omni Scale Fabric will be integrated in the next generation of Intel Xeon Phi processors as well as future general-purpose Intel Xeon processors. This integration along with the fabric's HPC-optimized architecture is designed to address the performance, scalability, reliability, power and density requirements of future HPC deployments. It is designed to balance price and performance for entry-level through extreme-scale deployments.





"Intel is re-architecting the fundamental building block of HPC systems by integrating the Intel Omni Scale Fabric into Knights Landing, marking a significant inflection and milestone for the HPC industry," said Charles Wuischpard, vice president and general manager of Workstations and HPC at Intel. "Knights Landing will be the first true many-core processor to address today's memory and I/O performance challenges. It will allow programmers to leverage existing code and standard programming models to achieve significant performance gains on a wide set of applications. Its platform design, programming model and balanced performance makes it the first viable step towards exascale."

Knights Landing - Unmatched Integration
Knights Landing will be available as a standalone processor mounted directly on the motherboard socket in addition to the PCIe-based card option. The socketed option removes programming complexities and bandwidth bottlenecks of data transfer over PCIe, common in GPU and accelerator solutions. Knights Landing will include up to 16 GB high-bandwidth, on-package memory at launch - designed in partnership with Micron - to deliver five times better bandwidth compared to DDR4 memory, five times better energy efficiency and three times more density than current GDDR-based memory. When combined with integrated Intel Omni Scale Fabric, the new memory solution will allow Knights Landing to be installed as an independent compute building block, saving space and energy by reducing the number of components.

Powered by more than 60 HPC-enhanced Silvermont architecture-based cores, Knights Landing is expected to deliver more than 3 TFLOPS of double-precision performance and three times the single-threaded performance compared with the current generation. As a standalone server processor, Knights Landing will support DDR4 system memory comparable in capacity and bandwidth to Intel Xeon processor-based platforms, enabling applications that have a much larger memory footprint. Knights Landing will be binary-compatible with Intel Xeon processors, making it easy for software developers to reuse the wealth of existing code.

For customers preferring discrete components and a fast upgrade path without needing to upgrade other system components, both Knights Landing and Intel Omni Scale Fabric controllers will be available as separate PCIe-based add-on cards. There is application compatibility between currently available Intel True Scale Fabric and future Intel Omni Scale Fabric, so customers can transition to new fabric technology without change to their applications. For customers purchasing Intel True Scale Fabric today, Intel will offer a program to upgrade to Intel Omni Scale Fabric when it's available.

Knights Landing processors are scheduled to power HPC systems in the second half of 2015. For instance, in April the National Energy Research Scientific Computing Center (NERSC) announced an HPC installation planned for 2016, serving more than 5,000 users and over 700 extreme-scale science projects.

"We are excited about our partnership with Cray and Intel to develop NERSC's next supercomputer 'Cori,'" said Dr. Sudip Dosanjh, NERSC Director, Lawrence Berkeley National Laboratory. "Cori will consist of over 9,300 Intel Knights Landing processors and will serve as an on-ramp to exascale for our users through an accessible programming model. Our codes, which are often memory-bandwidth limited, will also greatly benefit from Knights Landing's high speed on package memory. We look forward to enabling new science that cannot be done on today's supercomputers."

New Fabric, New Speeds with Intel Omni Scale Fabric
Intel Omni Scale Fabric is built upon a combination of enhanced acquired IP from Cray and QLogic, and Intel's own in-house innovations. It will include a full product line offering consisting of adapters, edge switches, director switch systems, and open-source fabric management and software tools. Additionally, traditional electrical transceivers in the director switches in today's fabrics will be replaced by Intel Silicon Photonics-based solutions, enabling increased port density, simplified cabling and reduced costs. Intel Silicon Photonics-based cabling and transceiver solutions may also be used with Intel Omni Scale-based processors, adapter cards and edge switches.

Intel Supercomputing Momentum Continues
The current generation of Intel Xeon processors and Intel Xeon Phi coprocessors powers the top-rated system in the world - the 35 PFLOPS "Milky Way 2" in China. Intel Xeon Phi coprocessors are also available in more than 200 OEM designs worldwide.

Intel-based systems account for 85 percent of all supercomputers on the 43rd edition of the TOP500 list announced today and 97 percent of all new additions. Within 18 months after the introduction of Intel's first many-core architecture products, Intel Xeon Phi coprocessor-based systems already make up 18 percent of the aggregated performance of all TOP500 supercomputers. The complete TOP500 list is available at www.top500.org.

To help optimize applications for many-core processing, Intel has also established more than 30 Intel Parallel Computing Centers (IPCC) in cooperation with universities and research facilities around the world. Today's parallel optimization investment with the Intel Xeon Phi coprocessor will carry forward to Knights Landing, as optimizations using standards-based, common programming languages persist with a recompile. Incremental tuning gains will be available to take advantage of innovative new functionality.
 
Joined
Mar 10, 2010
Messages
5,114 (1.78/day)
Likes
1,658
Location
Manchester uk
System Name Quad GT evo V
Processor FX8350 @ 4.8ghz1.525c NB2.64ghz Ht2.84ghz
Motherboard Gigabyte 990X Gaming
Cooling 360EK extreme 360Tt rad all push/pull, cpu,NB/Vrm blocks all EK
Memory Corsair vengeance 32Gb @1333 cas9
Video Card(s) Rx vega 64 waterblockedEK + Rx580 waterblockedEK
Storage samsung 840(250), WD 1Tb+2Tb +3Tbgrn 1tb hybrid
Display(s) Samsung uea28"850R 4k freesync, samsung 40" 1080p
Case Custom(modded) thermaltake Kandalf
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup
Power Supply corsair 1000Rmx
Mouse CM optane
Keyboard CM optane
Software Win 10 Pro
Benchmark Scores 15.69K best overall sandra so far
#2
Wow, looking good. Still want one despite only one clear use for me: folding/crunching.
 
Joined
Jun 27, 2011
Messages
5,628 (2.35/day)
Likes
2,986
Processor Intel I7 4790k (stock)
Motherboard ASRock H97M-ITX/ac LGA 1150 Intel H97
Cooling Prolimatech megahalem
Memory Crucial 2x4gb 1600mhz
Video Card(s) EVGA 1060 3gb
Storage OWC Mercury SSD 240 GB
Display(s) Asus 144hz
Case Raijintek Metis
Power Supply Corsair SF600 600w psu
Software Windows 10 64 Bit
#3
For crunching/folding, I do wonder how well it compares to, say, a 16-core Opteron.
 
Joined
Sep 7, 2011
Messages
2,785 (1.20/day)
Likes
1,672
Location
New Zealand
System Name MoneySink
Processor 2600K @ 4.8
Motherboard P8Z77-V
Cooling AC NexXxos XT45 360, RayStorm, D5T+XSPC tank, Tygon R-3603, Bitspower
Memory 16GB Crucial Ballistix DDR3-1600C8
Video Card(s) GTX 780 SLI (EVGA SC ACX + Giga GHz Ed.)
Storage Kingston HyperX SSD (128) OS, WD RE4 (1TB), RE2 (1TB), Cav. Black (2 x 500GB), Red (4TB)
Display(s) Achieva Shimian QH270-IPSMS (2560x1440) S-IPS
Case NZXT Switch 810
Audio Device(s) onboard Realtek yawn edition
Power Supply Seasonic X-1050
Software Win8.1 Pro
Benchmark Scores 3.5 litres of Pale Ale in 18 minutes.
#4
For crunching/folding, I do wonder how well it compares to, say, a 16-core Opteron.
Does F@H support Xeon Phi? I was under the impression that it did not. That's assuming you've conquered the CPU and motherboard compatibility issues - doesn't Xeon Phi also require 64-bit PCI-E addressing?
 
Joined
Aug 11, 2011
Messages
4,335 (1.84/day)
Likes
3,020
Location
Mexico
System Name STEAMBOX | GAMECUBE | EQC (Everyday Quad Core)
Processor i5 4590@3.7Ghz |i7 3770K@4Ghz -|- Athlon 5350@2.52Ghz
Motherboard GA-B85N PHOENIX | Asrock Z77E-ITX | Asus AM1I-A
Cooling Stock | Antec Kuhler 620 | Reeven Vanxie
Memory 2x4GB ADATA XPG 1600Mhz | 2x4GB Kingston 1866Mhz -|- 2x4GB Crucial Ballistix@1920Mhz
Video Card(s) RX 480 Nitro | Sapphire RX 480 w/Accelero Mono Plus | HD 8400 @ 720Mhz (IGP)
Storage LiteON 128GB mSATA+3TB Seagate | Seagate 1TBxSamung 64GB SSD (Intel RST) | Kingston v300 240GB
Display(s) Daewoo 49" 1080p | ASUS PA248Q 1920x1200 IPS
Case Corsair 250D | CoolerMaster Elite 110 | Acteck Fiji
Audio Device(s) Onboard
Power Supply Seasonic SS-660XP2 | Silverstone SFX-450 | 200w mini FLEX PSU
Software Windows 10 64bit
#5
For crunching/folding, I do wonder how well it compares to, say, a 16-core Opteron.
The first Phi had 32 Pentium III-based cores at 1.2 GHz for 750 GFLOPS; it seems that Intel has switched to Atom this time and is claiming 3x single-threaded performance core vs. core, but take out the hypervisor overhead and maybe we'd be looking at a 2-2.5x increase.
 
Joined
Jun 27, 2011
Messages
5,628 (2.35/day)
Likes
2,986
Processor Intel I7 4790k (stock)
Motherboard ASRock H97M-ITX/ac LGA 1150 Intel H97
Cooling Prolimatech megahalem
Memory Crucial 2x4gb 1600mhz
Video Card(s) EVGA 1060 3gb
Storage OWC Mercury SSD 240 GB
Display(s) Asus 144hz
Case Raijintek Metis
Power Supply Corsair SF600 600w psu
Software Windows 10 64 Bit
#6
Does F@H support Xeon Phi? I was under the impression that it did not. That's assuming you've conquered the CPU and motherboard compatibility issues - doesn't Xeon Phi also require 64-bit PCI-E addressing?
It probably does not, but I can still wonder. Even if it did, the cost barrier of having a system to put it in is probably pretty high. There are a few crunchers and folders with quad-Opteron server boards around, which are also quite expensive, and that is what I would compare this to.

The first Phi had 32 Pentium III-based cores at 1.2 GHz for 750 GFLOPS; it seems that Intel has switched to Atom this time and is claiming 3x single-threaded performance core vs. core, but take out the hypervisor overhead and maybe we'd be looking at a 2-2.5x increase.
"Knights Landing is expected to deliver more than 3 TFLOPS of double-precision performance"

I am sure you read that, but I don't know how much a quad-Opteron server does. My old 7970, which I no longer have, had about 1 TFLOPS of double precision, as I just read. 7970s do well in FAH, and this theoretically would be 3 times as much performance.
 
Joined
May 4, 2009
Messages
1,940 (0.61/day)
Likes
409
Location
Singapore
System Name penguin
Processor i3-4160
Motherboard Asus H81 Mini-ITX
Cooling Stock
Memory 2x4GB Kingston 1600MHz
Video Card(s) Saphire Radeon 7850 2GB
Storage Plextor M5S 120GB+1TB Seagate
Display(s) 23' Dell
Case CM Elite 130
Audio Device(s) stock
Power Supply Corsair CX430m
Software W7/Lubuntu
#7
For crunching/folding, I do wonder how well it compares to, say, a 16-core Opteron.
It fares very well... the highest-clocked 16-core Opteron has a theoretical performance of ~180 GFLOPS. This thing is supposedly pulling ~3000 GFLOPS.
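
As a rough check on those two figures: theoretical peak is usually estimated as cores x clock x floating-point operations per cycle per core. The sketch below is only an illustration. The 2.8 GHz clock and the 4 double-precision FLOPs per cycle per core are assumptions picked to reproduce the ~180 GFLOPS number, not figures from the announcement, and `peak_gflops` is a made-up helper name. The 3000 GFLOPS value is simply the "more than 3 TFLOPS" claim taken at face value.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: float) -> float:
    """Theoretical peak in GFLOPS: cores * clock (GHz) * FLOPs per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


# Assumed values for a 16-core Opteron (not from the announcement):
# 2.8 GHz and 4 double-precision FLOPs per cycle per core.
opteron = peak_gflops(cores=16, clock_ghz=2.8, flops_per_cycle=4)

# Intel's claimed figure for Knights Landing, taken as a flat 3 TFLOPS.
knights_landing = 3000.0

ratio = knights_landing / opteron

print(f"Opteron peak (assumed inputs): {opteron:.1f} GFLOPS")
print(f"Knights Landing claim:         {knights_landing:.1f} GFLOPS")
print(f"Ratio on paper:                {ratio:.1f}x")
```

On paper that works out to somewhere between 16x and 17x, but both sides are theoretical peaks, so real folding/crunching throughput could look quite different.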
 
Joined
Sep 7, 2011
Messages
2,785 (1.20/day)
Likes
1,672
Location
New Zealand
System Name MoneySink
Processor 2600K @ 4.8
Motherboard P8Z77-V
Cooling AC NexXxos XT45 360, RayStorm, D5T+XSPC tank, Tygon R-3603, Bitspower
Memory 16GB Crucial Ballistix DDR3-1600C8
Video Card(s) GTX 780 SLI (EVGA SC ACX + Giga GHz Ed.)
Storage Kingston HyperX SSD (128) OS, WD RE4 (1TB), RE2 (1TB), Cav. Black (2 x 500GB), Red (4TB)
Display(s) Achieva Shimian QH270-IPSMS (2560x1440) S-IPS
Case NZXT Switch 810
Audio Device(s) onboard Realtek yawn edition
Power Supply Seasonic X-1050
Software Win8.1 Pro
Benchmark Scores 3.5 litres of Pale Ale in 18 minutes.
#8
It fares very well... the highest-clocked 16-core Opteron has a theoretical performance of ~180 GFLOPS. This thing is supposedly pulling ~3000 GFLOPS.
Supposedly being the operative word. Xeon Phi isn't noted for its actual vs. theoretical floating point performance.
 
Joined
Aug 22, 2007
Messages
3,177 (0.84/day)
Likes
529
Location
Florida, US
System Name bits and pieces
Processor Intel Xeon E3-1230V3
Motherboard Gigabyte H97
Cooling stock
Memory 16GB for now
Video Card(s) EVGA GTX 970 SC
Storage 256GB SSD + 3x 2TB WDs (storage)
Display(s) 2x BENQ GW2250 + ViewSonic 24"
Case Cooler Master Centurion 5 :P
Audio Device(s) X-Fi Titanium HD --- JBL 4412 Studio Monitors / Polk PSW505
Power Supply Antec 550W
Mouse Corsair M65 pro
Keyboard MS Sidewinder
Software Win 10 Pro x64
#9
Supposedly being the operative word. Xeon Phi isn't noted for its actual vs. theoretical floating point performance.
This thing isn't even out...
We'll have to wait a bit to see how it actually pans out...
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
10,511 (4.82/day)
Likes
5,596
Location
Concord, NH
System Name Kratos
Processor Intel Core i7 3930k @ 4.2Ghz
Motherboard ASUS P9X79 Deluxe
Cooling Zalman CPNS9900MAX 130mm
Memory G.Skill DDR3-2133, 16gb (4x4gb) @ 9-11-10-28-108-1T 1.65v
Video Card(s) MSI AMD Radeon R9 390 GAMING 8GB @ PCI-E 3.0
Storage 2x120Gb SATA3 Corsair Force GT Raid-0, 4x1Tb RAID-5, 1x500GB
Display(s) 1x LG 27UD69P (4k), 2x Dell S2340M (1080p)
Case Antec 1200
Audio Device(s) Onboard Realtek® ALC898 8-Channel High Definition Audio
Power Supply Seasonic 1000-watt 80 PLUS Platinum
Mouse Logitech G602
Keyboard Rosewill RK-9100
Software Ubuntu 17.10
Benchmark Scores Benchmarks aren't everything.
#10
...but people aren't asking that one question that floats in the back of people's minds. Can it run Crysis?

I know that's a silly question, but it does expand to a larger question, which is: how will these changes impact the consumer market, and how long will it take for any such change to happen? I suspect that if Intel thinks doing this kind of thing is worthwhile, we might start seeing it in their other products in some shape or form. Imagine buying a motherboard with no DIMM slots because the DRAM is on the CPU. That doesn't just improve latencies, it reduces pins on the CPU and traces on the motherboard. After all, we did see Haswell with the VRMs on the CPU, so I would imagine Intel would keep looking for things that might belong there. Just a thought.
 
Joined
Aug 10, 2007
Messages
4,064 (1.07/day)
Likes
1,130
Location
Geneva, FL, USA
Processor Intel i5-6600
Motherboard ASRock H170M-ITX
Cooling Cooler Master Geminii S524
Memory G.Skill DDR4-2133 16GB (8GB x 2)
Video Card(s) Gigabyte R9-380X 4GB
Storage Samsung 950 EVO 250GB (mSATA)
Display(s) LG 29UM69G-B 2560x1080 IPS
Case Lian Li PC-Q25
Audio Device(s) Realtek ALC892
Power Supply Seasonic SS-460FL2
Mouse Logitech G700s
Keyboard Logitech G110
Software Windows 10 Pro
#11
...but people aren't asking that one question that floats in the back of people's minds. Can it run Crysis?

I know that's a silly question, but it does expand to a larger question, which is: how will these changes impact the consumer market, and how long will it take for any such change to happen? I suspect that if Intel thinks doing this kind of thing is worthwhile, we might start seeing it in their other products in some shape or form. Imagine buying a motherboard with no DIMM slots because the DRAM is on the CPU. That doesn't just improve latencies, it reduces pins on the CPU and traces on the motherboard. After all, we did see Haswell with the VRMs on the CPU, so I would imagine Intel would keep looking for things that might belong there. Just a thought.
Not that silly. Daniel Pohl of Intel had his ray-traced version of Wolfenstein running on 8 of the original Knights Ferry. I'm sure a demonstration is being prepared using these upcoming models to show off the benefits of on-package memory, core design, and scaling interface.