Thursday, August 30th 2012

Intel Reveals Architecture Details of Intel Xeon Phi Co-Processor

During the Hot Chips symposium, George Chrysos, the lead architect of the Intel Xeon Phi co-processor, shared new architecture details of Intel's upcoming HPC powerhouse. Designed for highly parallel applications, the Intel Xeon Phi co-processor, based on the Intel Many Integrated Core architecture, will combine industry-leading performance per watt with the ability to reuse existing code and applications without rewriting them.

Equipped with more than 50 cores and built using Intel's latest 22nm 3D Tri-Gate transistor technology, the new co-processors will be in production this year, with the first supercomputers from the TOP500 list already taking advantage of this technology. In his blog here, George shares his aspirations and goals in designing the co-processor and summarizes all the newly disclosed information. The Hot Chips presentation is also available below.


24 Comments on Intel Reveals Architecture Details of Intel Xeon Phi Co-Processor

#1
Eva01Master
This is just what I was looking for, now my browser games will run smoothly XD.
Posted on Reply
#3
Mussels
Moderator
so this is what became of larrabee
Posted on Reply
#4
TRWOV
It's like the Ageia PPU all over again... also, when would I be able to put one in my tower to encode my porn? :)
Posted on Reply
#5
eidairaman1
by: TRWOV
It's like the Ageia PPU all over again... also, when would I be able to put one in my tower to encode my porn? :)
Go Get Laid Heh
Posted on Reply
#6
INSTG8R
by: Mussels
so this is what became of larrabee
That's the bells that went off in my head as well.
Posted on Reply
#7
Completely Bonkers
Rather lackluster in my opinion. It is a "me too". Intel Knights Corner does not make any leaps ahead of the competition. It is neck and neck. And performance per watt improvements? Marginal, given they haven't launched yet and the competitors' architecture is more than a year old. According to Moore, Intel should be at least 2x ahead of last year's offerings from nV and AMD.

In fact, they are probably behind. That chart (slide 5) with 1381 vs. 1380 was finagled for the boss so that the team didn't lose their jobs!
Posted on Reply
#8
FordGT90Concept
"I go fast!1!11!1!"
It is x86. That is the "leaps ahead of the competition." You don't need to code for it, just recompile for it. It makes developing software so much easier. Not to mention, it's also better at scientific applications because it has higher double precision float performance because x86 is designed to do that--not so much GPUs converted to GPGPU use.
Posted on Reply
#9
Delta6326
Sounds interesting, still wish Intel would join the other GPU makers so that prices would go down.
Posted on Reply
#10
largon
Strange.
.
.
.
Nobody has (yet) complained about PCB color (slide 4).
by: eidairaman1
by: TRWOV
[...] when would I be able to put one in my tower to encode my porn?
Go Get Laid Heh
Well, he is encoding it. Perhaps he's a home cinematics aficionado?
:p
Posted on Reply
#11
Completely Bonkers
by: FordGT90Concept
Not to mention, it's also better at scientific applications because it has higher double precision float performance because x86 is designed to do that--not so much GPUs converted to GPGPU use.
Without doubt, x86 means quicker and cheaper reuse/refactoring of existing code. Great. But looking at slide 5, it doesn't look like KC is computationally much faster. Cheaper. Faster to code. But not necessarily faster to compute. I don't know the details of the floating-point tests that were done, but it may well be that there is a set of problems where KC will clearly win (i.e. double-precision problems). However, most computational problems are not double-precision, and also keep in mind that KC is double-precision and not x86 (which has had extended 80-bit float since the x87 days). So when moving x87 code to KC you need to be careful!
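
To make the precision caveat concrete, here is a minimal sketch of how the same value rounds at the two significand widths: 53 bits for an IEEE 754 double and 64 bits for the x87 extended format. The helper `round_to_mantissa` is a hypothetical name made up for this illustration. It models only the significand width on positive integers, not the exponent range, and it says nothing about how a particular compiler or Knights Corner itself handles intermediates.

```python
def round_to_mantissa(value: int, bits: int) -> int:
    """Round a positive integer to `bits` significant binary digits.

    Uses round-half-to-even, the default IEEE 754 rounding mode.
    Hypothetical helper for illustration only.
    """
    excess = value.bit_length() - bits
    if excess <= 0:
        return value  # the value already fits exactly
    quotient, remainder = divmod(value, 1 << excess)
    half = 1 << (excess - 1)
    # Round up when above the midpoint, or on a tie when the kept part is odd.
    if remainder > half or (remainder == half and quotient & 1):
        quotient += 1
    return quotient << excess


DOUBLE_BITS = 53  # IEEE 754 double: 52 stored bits plus the implicit leading bit
X87_BITS = 64     # x87 extended precision: 64-bit significand

n = 2**53 + 1  # smallest positive integer a double cannot represent

print(round_to_mantissa(n, DOUBLE_BITS) == n)  # False: the low bit is lost
print(round_to_mantissa(n, X87_BITS) == n)     # True: 64 bits hold it exactly
print(float(n) == float(2**53))                # True: Python floats are doubles
```

Code that silently relied on the extra 11 significand bits of x87 intermediates can therefore produce slightly different results once everything is computed in plain double precision.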
Posted on Reply
#12
FordGT90Concept
"I go fast!1!11!1!"
There is a penalty for x86 versus what is effectively machine code on other GPGPUs but Intel makes up for that with smaller fab.


In scientific applications, double precision is yearned for. The only reason why everything isn't double precision is because, until recently, graphics cards either didn't support double precision or took a huge performance penalty if they did double precision. These cards are going to end up in multi-million dollar science-conducting machines. Double precision performance is going to be a huge selling point for these cards.
Posted on Reply
#13
deleted
The most important thing about these cards is that they're normal x86 processors and don't require any sort of special consideration (although optimization is always nice). It's basically 50 Pentium Is with Sandy Bridge FPUs tacked onto them, fabbed on a 22nm process.
Posted on Reply
#15
suraswami
Is this something to do with Graphics (like video card, APU etc)?
Posted on Reply
#16
TRWOV
by: suraswami
Is this something to do with Graphics (like video card, APU etc)?
more like physics, rendering and video encoding. Also HPC (its main intended market) but that's hardly of interest for the common folk. :)

Basically it's a 50 Pentium Pro cluster in an add-on card. Can't wait for some of these to appear in retail... in a couple of years. :banghead:
Posted on Reply
#17
theoneandonlymrk
They are now physics cards lol. It's a direct stepping descendant of Larrabee, and from what I have previously seen it would take 3 SLI'd or Intel-ied or whatever to run Crysis. I do want one though, and it's interesting to me that they run a Linux kernel on each core of it, linked virtually via regular net protocols, like a hive virtually inside your rig. I'll take 3 :D:cool:
Posted on Reply
#18
Jizzler
by: theoneandonlymrk
They are now physics cards lol. It's a direct stepping descendant of Larrabee, and from what I have previously seen it would take 3 SLI'd or Intel-ied or whatever to run Crysis. I do want one though, and it's interesting to me that they run a Linux kernel on each core of it, linked virtually via regular net protocols, like a hive virtually inside your rig. I'll take 3 :D:cool:
Yeah, it took 8 of the slower 32-core models to do Wolfenstein:



Three or four of these new Phi's could do it, and in a couple years it'll only take one.
Posted on Reply
#19
cadaveca
My name is Dave
Holy crap that's an interesting picture!
Posted on Reply
#20
ChristTheGreat
can't wait to see the power consumption of this, Might be good for crunching :D
Posted on Reply
#21
james888
I look at this. I think it's cool. I then get an "I want one" feeling. I just don't have a good use for it. I don't encode often or fold. Still though, MOAR POWA!
Posted on Reply
#22
Morgoth
I would want to get one if Valve's batch compile supported up to 64 threads; right now it only does 16 threads.
Posted on Reply
#23
deleted
by: Jizzler
Yeah, it took 8 of the slower 32-core models to do Wolfenstein:

Three or four of these new Phi's could do it, and in a couple years it'll only take one.
You forgot the part where it was raytracing the entire thing in real time. Kepler doesn't even come close to that.
Posted on Reply
#24
Steevo
A ring bus for memory controller, and a 4X1, and 2X1 setup on "streaming" scalar units? Looks like a X1xxx series to me, just grown up a bit and tweaked for processes.


Their X86 interface is the same idea behind AMD's "Fabric", one interface that handles the requests and issues them to the faster or least busy of the CPU cores, or GPU "shaders".


Larrabee, plus all the IP that AMD handed over a couple years ago to Intel as part of the monopoly payment/trade.
Posted on Reply