| Sunday, July 6 2008 |

German tech-journal Heise caught up with Intel's Pat Gelsinger for an article discussing the company's past and future as the silicon giant heads towards 40 years of service this 18th of July.
Among the topics that came up, the most interesting was visual computing and Intel's plans for it. 'Larrabee' is the buzzword here: it is the codename of Intel's upcoming graphics processor (GPU) architecture, with which the company plans to take on established players such as NVIDIA and AMD.
What's unique (so far) about Larrabee is that it's made up entirely of x86 processing cores, likely 32 of them. Here's a surprise: these processing cores are based on the design of Pentuim P54C, a 13+ year old x86 processor. The design will be shrunk to the 45nm fabrication process, the cores will be assisted by a 512-bit SIMD unit, and they will support 64-bit addressing. Gelsinger says that 32 of these cores clocked at 2.00 GHz could belt out 2 TFLOPs of raw computational power. That's close to that of the upcoming AMD R700. Heise also reports that this GPU could have a TDP of as much as 300W (peak).
With inputs from Heise
User comments
I want actual pics of the card...
wow 300 watts eh? Could heat the lower level of my house. I get the strange feeling this is either going to horribly flop or do incredibly well. Very little middle ground :P
by: tkpenalty;872183
I want actual pics of the card...
i'm not sure but i thought it was an Integrated GPU on Nehalem with 2 cores + HT :S
now i have seen this

i'm starting to get confused lol
btarunr Its spelled Larrabee
So, that 150W connector is 10-pin? (I'll eliminate 12-pin since two 6-pin connectors have a blank pin each that could be shared with pin #3)
300 watts. Can you say nuclear reactor? What happened to efficiency, Intel? :(
I don't like where GPU's are heading. There's too much power draw for so little performance increase. This goes for all GPU makers. Something needs to be done to bring power demands back in line with the rest of computer components. They have enough trouble as it is squeezing last-generation high-end GPU's in notebooks, but this is just ridiculous.
most wack gpu ive ever seen
whoever thought that having 32 old cpus and makign a gpu based out of it is either incredibly stupid or amazingly crafty
im not even going to bother saying much because we all know nothing matters untill we get results
STILL i know for a fact 300w is alot for gpu i mean you could run a full pc on that nearly
x86 cores :twitch:
Now we can run existing software on GPU :toast:
I'd never have to run the heater in the winter; I'll just play more games!:p
wonder if you could run programs on the gpu instead of the cpu LOL this card boggles me completely
Why so much fuss about its TDP? Wasn't the HD2900 XT like 200W (peak)?
based on the design of Pentuim P54C
Pentuim? is that some CPU i never heard of?
*cough spellcheck*
300W TDP... gack.
2 TFLOPS by Larrabee a year from now is nice, but I can get 2.4 TFLOPS from the Radeon 4870x2 a month from now. And the Radeon cards are already rumored to be ray tracing monsters (used for ray tracing HD scenes in Transformers): http://www.tgdaily.com/content/view/38145/135/
Since when is a general purpose cpu going to be able to process graphics at a respectable rate?
If that was the case, everyone with a quad core would be getting 50 FPS in 3dmark with the cpu test (I don't care if it has high speed ram and cache attached or not). I'm calling intel retarded, again.
edit: Or it's more fud. Like that 10 GHz pentium 4 they just had laying around :laugh:
by: TheGuruStud;872261
Since when is a general purpose cpu going to be able to process graphics at a respectable rate?
If that was the case, everyone with a quad core would be getting 50 FPS in 3dmark with the cpu test (I don't care if it has high speed ram and cache attached or not). I'm calling intel retarded, again.
If a ~70 GFLOPs Core 2 Extreme can do ~6 fps, guess what 2000 GFLOPs can.
by: 1c3d0g;872210
I don't like where GPU's are heading. There's too much power draw for so little performance increase. This goes for all GPU makers. Something needs to be done to bring power demands back in line with the rest of computer components.
I disagree. Comparing to performance-increase, GPUs have been more efficient with every new model. Performance per Watt ratio of new GPUs is better - no matter if they need 200 watts ..
by: btarunr;872264
If a ~70 GFLOPs Core 2 Extreme can do ~6 fps, guess what 2000 GFLOPs can.
That's 100% theoretical max. I have a lot more faith in a c2e than some untested, whack design that supposed to be from old architecture. If it's new architecture or at least mostly from the ground up, then I'll be quiet. What they're claiming is just ridiculous.
To me, this is like M$ saying the xbox 360 is fast b/c it has tri-core and runs at 3.2 GHz. But in reality there's not many transistors and it just can't push much data.
intel can claim this and claim that, by the time they release the card it will already be obsolete by Nvidia and AMD, hell even Via.
the only thing this has over video cards, is the x86 architecture. that means you can effectively add 32 CPU cores to any machine. ANY app should be able to use it (games, encoding/decoding apps, etc)
by: eidairaman1;872278
intel can claim this and claim that, by the time they release the card it will already be obsolete by Nvidia and AMD, hell even Via.
VIA!!! Fastest. Stuff. Ever. :laugh:
Seriously, though, VIA was cool back in the day, but they pissed me off when the athlon 64s came out. Those boards were slow and buggy.
by: TheGuruStud;872266
That's 100% theoretical max. I have a lot more faith in a c2e than some untested, whack design that supposed to be from old architecture. If it's new architecture or at least mostly from the ground up, then I'll be quiet. What they're claiming is just ridiculous.
To me, this is like M$ saying the xbox 360 is fast b/c it has tri-core and runs at 3.2 GHz. But in reality there's not many transistors and it just can't push much data.
Just as you say you'd be silent if it was something built from scratch, you can't be loud about this either. As for scratch, these are 'old' processors, but shrunk, clocked to 2 GHz, .....etc. When something of this sort comes from Gelsinger, it's better we not jump to assumptions that it's a 'bad' architecture, since we've seen nothing to prove it's bad just as yet.
by: btarunr;872290
Just as you say you'd be silent if it was something built from scratch, you can't be loud about this either. As for scratch, these are 'old' processors, but shrunk, clocked to 2 GHz, .....etc. When something of this sort comes from Gelsinger, it's better we not jump to assumptions that it's a 'bad' architecture, since we've seen nothing to prove it's bad just as yet.
since the core architecture (core solo/duo, and then onto core 2 duo/quad) designs came from a pentium 3 tualatin, his argument really falls down anyway. old cores that hit a tech limit can really be revilatised with new tech and die shrinks.
WTF you call that a gpu? thats not a gpu! thats 32 p4 cores stuck together on a card with a 300w tdp after a huge ass die shrink! WTF intel.. i expected much much much more from you, come on.. 32 p4 cores stuck together to make a gpu...
A quad-core QX9770 draws 130W, isn't 32 cores @ 300W an improvement?
well 300W TDP isnt so bad... oh wait yes it is. TDP means its not max, so it could even go upto 400W real draw.
That said, this is intel. they could easily throw in some power saving features and have its power usage scale really well (modified speedstep, for example)
by: panchoman;872299
WTF you call that a gpu? thats not a gpu! thats 32 p4 cores stuck together on a card with a 300w tdp after a huge ass die shrink! WTF intel.. i expected much much much more from you, come on.. 32 p4 cores stuck together to make a gpu...
Those are not Netburst !
by: Morgoth;872317
Those are not Netburst !
aren't p4's nutburst?
by: panchoman;872324
aren't p4's nutburst?
Where did you see 'P4' in the news post? That's P54C....of which Pentium Pro / MMX came up.
by: btarunr;872326
Where did you see 'P4' in the news post? That's P54C....of which Pentium Pro / MMX came up.
oh, my bad.. wait so they're using like freaking p1's? wtf?
by: KieranD;872213
STILL i know for a fact 300w is alot for gpu i mean you could run a full pc on that nearly
My system only takes up about 320 watts as it stands quad core etc :eek: that will put it to 600 watt
ok, the Pentium Pro, P2, P3, PM, C2, all relied on architecture for performance (P6)
486, Pentium, P4 and possibly Nehalem Rely heavily on Clock speed for performance.
Heh, its a CPU on a silicon board, period. However, 2 TFLOPs is one damn powerful CPU. The 13+ year tech is a nice touch too. Just like the GMA IGP series was based off the i740 BS, this thing is based off some dinosaur bones that I'll end up remembering if I think about it long enough - & I won't. If games will accept the x86, then this thing will fly. If not, then it will flop, badly. No middle ground at all.
by: Megasty;872366
Heh, its a CPU on a silicon board, period. However, 2 TFLOPs is one damn powerful CPU. The 13+ year tech is a nice touch too. Just like the GMA IGP series was based off the i740 BS, this thing is based off some dinosaur bones that I'll end up remembering if I think about it long enough - & I won't. If games will accept the x86, then this thing will fly. If not, then it will flop, badly. No middle ground at all.
Games have no role to play in compatibility. The driver and DirectX / OpenGL take care of it.
by: btarunr;872370
Games have no role to play in compatibility. The driver and DirectX / OpenGL take care of it.
Great, then this have a real possibility of working but its still a huge experiment - but if it performs anywhere near the 4870x2 then it will be a ridiculous sucess. However, I'm still not counting any chickens yet :D
by: TheGuruStud;872289
VIA!!! Fastest. Stuff. Ever. :laugh:
Seriously, though, VIA was cool back in the day, but they pissed me off when the athlon 64s came out. Those boards were slow and buggy.
I learned that the hard way!
So many chipset driver problems!
by: Mussels;872294
since the core architecture (core solo/duo, and then onto core 2 duo/quad) designs came from a pentium 3 tualatin, his argument really falls down anyway. old cores that hit a tech limit can really be revilatised with new tech and die shrinks.
it's a pentium 3 b/c it has short pipelines? Hardly...
I guess this is what they mean by the Ray Tracing of their cards, using CPUs. For some reason, I thought larrabee was going to be made up of x86 cpu cores and also some type of gpu core. From the looks of it, its all a computer chips on an expansion card.
No, it's 32 computer chips on a single (roughly 5cm x 5cm silicon die.).
Just like you have those 128 / 320 SP's, here, the SP is a x86 processor. It does better out-of-the-order execution,.... crunches numbers better.
Even if Larrabee fails as a GPU, it will most certainly be ported as a CPU, it will be the most powerful CPU ever made.
There's more:
If this thing is sold as a full card by Intel for say $600 (to remain competitive),
as a CPU (when ported to a central processor), even if it costs the same (sans any board, just the chip), you have the most powerful CPU for $600.....2000 GFLOPs on a desktop processor by 2009/10 howzzat?
Larrabee has 16 or 32 fully blown x86 cores. All clocked at 2Ghz. Never mind the graphics, just imagine sticking one of those babies in your PC for CUDA, PhysX, math libraries, or what have you. That card is going to outclass a PS3 at Folding@Home and match any $5000 "maths" add-in card that are used for specialist applications, at a price more like $200.
Unbelieveable power/price.
What is also in the pipeline is a mainboard with an empty socket. And you just plug in a Larrabee for extra zmog horsepower, just like those old x87 chips of yesteryear. Communication with main CPU is via Quickpath.
http://en.wikipedia.org/wiki/Larrabee_(GPU)
In fact, this is interesting. Based on the spec of each of those in-order processors, with extra SIMD instructions, they look awfully like Intel 'Atoms'.
Perhaps this is how Intel will scale Westmere/Sandy Bridge. Rather than producing multi-versions of the CPU with core and various atom combinations, it will have just the cores. You then have an add-in socket where you can choose to add 8, 16 or 32 (or whatever) atoms as a Larrabee add-in chip.
Nice.
lol, they better not release it, but work their way for something more efficient...
by: 1c3d0g;872210
300 watts. Can you say nuclear reactor? What happened to efficiency, Intel? :(
I don't like where GPU's are heading. There's too much power draw for so little performance increase. This goes for all GPU makers. Something needs to be done to bring power demands back in line with the rest of computer components. They have enough trouble as it is squeezing last-generation high-end GPU's in notebooks, but this is just ridiculous.
FRIGGIN YES, i don't really get it ether, cards just eat up more and more power, but not at reasonable gains.
mybe it overclocks like a cpu mybe you can get 4ghz on water :D
Wow thanks BTA and Lemonade, I didnt know that. That is going to be a beast of a card/cpu. However, I wonder if in 2009/2010, if ATI/AMD and Nvidia wont have something better.
Knowing Intel it will most likely be a EE class meaning over 1000 USD, basically consider it a Professional workstation card, not a Consumer Card.
by: btarunr;872462
No, it's 32 computer chips on a single (roughly 5cm x 5cm silicon die.).
Just like you have those 128 / 320 SP's, here, the SP is a x86 processor. It does better out-of-the-order execution,.... crunches numbers better.
Even if Larrabee fails as a GPU, it will most certainly be ported as a CPU, it will be the most powerful CPU ever made.
There's more:
If this thing is sold as a full card by Intel for say $600 (to remain competitive),
as a CPU (when ported to a central processor), even if it costs the same (sans any board, just the chip), you have the most powerful CPU for $600.....2000 GFLOPs on a desktop processor by 2009/10 howzzat?
by: Morgoth;872539
mybe it overclocks like a cpu mybe you can get 4ghz on water :D
its a distinct possibility with the shrink etc...but keep in mind this is old arch even if it hits 2ghz like intel is saying remember that that might only be because of the die shrink anything past that and we might be hitting an arch limit. but then again who knows? maybe it was never an archlimit...maybe it was a design limit the old procs were made out of like ceramic lol and incorporated 5 elements now they use half the chart and purified silicon.
by: eidairaman1;872588
Knowing Intel it will most likely be a EE class meaning over 1000 USD, basically consider it a Professional workstation card, not a Consumer Card.
That would be the last thing we need. Even if this thing performs around a 4870x2, that's no reason for Intel to go crazy & charge a grand for it. Unfortunately for them, they have competition at that lvl of gfx - unlike those sick EE class processors which are in a league of their own.
by: Mussels;872237
based on the design of Pentuim P54C
Pentuim? is that some CPU i never heard of?
*cough spellcheck*
300W TDP... gack.
LOL @ the power use, yeah. I've read that this design is going to be most useful for GPGPU stuff like Physics, Folding, Video encoding, etc. It doesn't sound like it'll be best for graphics.
Pentium P54C is the Pentium 75-200 MHz. Pentium MMX (133-233 MHz) was P55C. Obviously this Larrabee chip isn't made out of 1996 Pentium CPUs. It would never clock above ~300 MHz if that were the case. They just used them as a architectural hint. Actually, the Atom CPU is based on a core from Larrabee, I think. Atom is similar to P54C too.
Atom's lack of speed vs. power use can be an indication of the potential of each Larrabee core. A Core 2 core is dramatically more powerful for most applications. Larrabee will only be fast for apps that can spread across its many cores.
by: btarunr;872264
If a ~70 GFLOPs Core 2 Extreme can do ~6 fps, guess what 2000 GFLOPs can.
I believe you are lying under a miscomprehension.
Larrabee will likely still be almost powerless in games, compared to even last generation GPU's. It may have 30 cores, but guess how many ALU's each core has - that's right 1, just like any other CPU. Considering the 2 TFLOP computational power assesment, it is likely a very powerful ALU, but it would still only amount to the same amount on a GPU, which puts the Larrabee at a huge disadvantage against identically-architectured GPU's, such as the G92. It would be a lot more powerful, naturally, but just as the 800 ALU's running under the "R700"' core, it will fail at performing gaming-specific operations. That being said, it could still be great for CUDA, physics or.. just general computing. Because that's what scalar-based "CPU ALU's" are good for - everything, but they perform at it much less proficiantly.
That being said, this is fascinating to even draw up. I would want to see it in action.
Guys you gotta remember that this is Intels FIRST REAL PUSH into the dedicated graphics card market. The fact that their first GPU will be this powerful already says alot about it so relax. Intel will continue to improve on its GPUS. If its less powerful than what its competitors are when its released then it will more than likely have a lower price tag so yay for that.
I dont think that intel is expecting a whole lot from their first dedicated graphics card.
People need top stop complaining about how the intel graphics card will be using old architecture. If the old architecture works then whats to complain about? I dont have a beef as long as its not overpriced and gives a bad framerate for games. None of which we are sure on yet.
I just learned everything about it here. Might clear up some confusion other people are having.
http://arstechnica.com/articles/paedia/hardware/clearing-up-the-confusion-over-intels-larrabee.ars
"the cores will also have a super-wide 512-bit vector FPU that's capable of processing sixteen-element floating-point vectors (single precision), along with support for control flow instructions (loops and branches) and some scalar computations."
That sounds interesting ^
The way they keep on leaking info makes it seem they expecting alot from their first attempt.
by: effmaster;872733
Guys you gotta remember that this is Intels FIRST REAL PUSH into the dedicated graphics card market. The fact that their first GPU will be this powerful already says alot about it so relax. Intel will continue to improve on its GPUS. If its less powerful than what its competitors are when its released then it will more than likely have a lower price tag so yay for that.
I dont think that intel is expecting a whole lot from their first dedicated graphics card.
People need top stop complaining about how the intel graphics card will be using old architecture. If the old architecture works then whats to complain about? I dont have a beef as long as its not overpriced and gives a bad framerate for games. None of which we are sure on yet.
Does this mean that with intel joining in on the GPU Market that GPU's will become cheaper due to increased competition? Or not?
Hopefully. Also, hopefully, both ATi and Nvidia won't be able to lazily build minor improvements on the same type of arcitecture with Intel swimming around in the pool... there's gonna be a lot of dunking heads underwater going on :)
lol
Contrast that story with Creative. They SUED the guy who was trying to push Audigy further. Just goes to show there is far better management at Intel than Creative.
Pohl is the German computer science student behind the ray-traced versions of Quake 3 and 4 that have been featured on Digg and Slashdot. For his masters' thesis, he built a version of Quake 4 that uses real-time ray tracing to achieve some pretty remarkable effects—shadows are correctly cast and rendered in real-time, water has the proper reflections, indirect lighting looks like it's supposed to, etc. He was later hired by Intel, and now he's working within their graphics unit on real-time ray tracing for games.
you wont be able to run existing programs on it and make them run 8479483 times faster. its just like going from single core to dual core to quad core. almost no application scales from x1 to x8 or even more. yes there may be some exceptions (maybe 10 apps on the market right now in total?) but nothing that anyone here regularly uses
by: Weer;872724
... It may have 30 cores, but guess how many ALU's each core has - that's right 1, just like any other CPU. Considering the 2 TFLOP computational power assesment, it is likely a very powerful ALU, but it would still only amount to the same amount on a GPU, which puts the Larrabee at a huge disadvantage against identically-architectured GPU's, such as the G92. It would be a lot more powerful, naturally, but just as the 800 ALU's running under the "R700"' core, it will fail at performing gaming-specific operations ...
Not true per se. Why?
1./ Larrabee has a much more powerful ALU than a GPU, meaning that for some tasks, Larrabee can do in one instruction what might take a fat loop and lookup tables on a GPU
2./ Larrabee ALU is DP and FP. GPU SPE is SP. To mimic DP or FP using SP requires a lot of loops and overhead
3./ SIMD on Larrabee is 512bit or more. That's the same as 16x 32bit (SP) calculations at once. With 32 x86 cores in the Larrabee matrix, that is equivalent to 16x 32 cores = 512 simultaneous SP calculations, i.e. the same as 512 shader processor units.
The key and as yet unknown data is how many clock cycles to execute SIMD compared to a GPU's SPE.
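To make the lane arithmetic in point 3 concrete, here is a rough Python sketch. The function names are made up for illustration, and the factor of 2 floating-point operations per lane per clock (one fused multiply-add) is an assumption that is not stated anywhere in this thread; it just happens to land near the quoted 2 TFLOPs figure.

```python
# Back-of-the-envelope check of the SIMD lane count and peak throughput.
# ASSUMPTION (not from the thread): each lane retires one fused multiply-add,
# i.e. 2 floating-point operations, per clock cycle.

def simd_lanes(vector_bits: int, element_bits: int = 32) -> int:
    """Number of elements that fit in one SIMD register."""
    return vector_bits // element_bits


def peak_gflops(cores: int, lanes: int, clock_ghz: float,
                flops_per_lane_per_clock: int = 2) -> float:
    """Theoretical peak in GFLOPs: cores x lanes x clock x ops per lane."""
    return cores * lanes * clock_ghz * flops_per_lane_per_clock


lanes_per_core = simd_lanes(512)       # 512-bit vector / 32-bit floats
total_lanes = 32 * lanes_per_core      # across 32 cores
print(lanes_per_core, total_lanes)
print(peak_gflops(32, lanes_per_core, 2.0))
```

This is only a theoretical peak; as the post says, how many clock cycles a real SIMD instruction takes is the unknown that decides how close the chip gets to it.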
Its looking good for this intel gpu.Remember how much money intel has,loads for r+d,it has its own fabs and can write its own drivers.They also have a hell of a lot of processor manufacturing experience to fall back on.
I hope intel can sock it to the other 2,it will be good for us in the longrun,whether their first attempt is good or not.
by: swaaye;872690
Pentium P54C is the Pentium 75-200 MHz
Oh I remember when a friend of mine had a Pentium 75MHz and he had it overclocked to 90MHz and NFS (1) run on full screen! I had something 486 (edit: probably 486SX 33MHz) back then and could only run it half screen big :) I was so in awe of the overclock and the performance, remember everyone was not doing it (OC) those days.
by: OnBoard;872909
Oh I remember when a friend of mine had a Pentium 75MHz and he had it overclocked to 90MHz and NFS (1) run on full screen! I had something 486 (edit: probably 486SX 33MHz) back then and could only run it half screen big :) I was so in awe of the overclock and the performance, remember everyone was not doing it (OC) those days.
Oh yeah, well I had a 486 then a pentium 233 WITH MMX! Top that sucka! :p
Anyone here interested in top500.org supercomputers?
Well, this Larrabee thing will put an end to Beowulf Class I clusters. And put a STOP to the interest in Cell blades.
Why? Much cheaper. And you wouldnt need to learn a new architecture model for programming, e.g. Cell. Just use your regular x86 IDE with Larrabee add-in.
Average Power consumption of a TOP10 system is 1.32 Mwatt and average power efficiency is 248 Mflop/s/Watt
With Larrabee we are getting 2000 Gflops / 300 Watt = roughly 6,667 Mflops / watt, ie 10-30 times as power efficient as the best supercomputers.
That has a HUGE implication to power and cooling needed to host a number crunching monster.
It also has a HUGE implication on the cost of installing an HPC given how cheap Larrabee is compared to scaling under regular Beowulf.
With Larrabee, anyone could have an HPC if they wanted to.
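The efficiency comparison above can be rechecked with a few lines of Python. This is only a sketch that takes the quoted numbers at face value (2000 GFLOPs and 300 W for Larrabee, 248 Mflop/s/Watt for the TOP10 average); the function name is invented for illustration, and a claimed peak is not the same thing as a measured benchmark result.

```python
# Recompute the performance-per-watt figures quoted in the post.
# All inputs are the numbers quoted in the thread, taken at face value.

def mflops_per_watt(gflops: float, watts: float) -> float:
    """Convert GFLOPs and watts into Mflops per watt."""
    return gflops * 1000.0 / watts


larrabee = mflops_per_watt(2000.0, 300.0)   # claimed peak / claimed TDP
top10_average = 248.0                       # Mflop/s/Watt, as quoted
ratio = larrabee / top10_average

print(round(larrabee, 1))   # a bit under 6,700 Mflops/Watt
print(round(ratio, 1))      # within the "10-30 times" range quoted
```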
Hey, would you be able to get a single one of the cores and then put it on a Pentium board? :D
Hopefully they'll get smart and use Pentium Pro cores instead; 512 kb of L2 cache, MMX arch., and cooler name; what could go wrong?
Also, imagine if you got a bunch of mobos with 4 PCI-E x16 lanes (I'm pretty sure they exist), stuck these cards onto a whole bunch of them (along with a quad core something), and ran a beowulf cluster? Say you had 8 motherboards, thats 4 cards per mobo, which is 32 cards, which is 64 TFLOPS!! :eek:
@ TheGuruStud: I went from a Pentium 90 to a Celeron-400! :p
there is more too it than this, normal cpus are much more powerfull and multipurpose, altho they should go with core2 duh, heh... P54Cs kinda suck imho.... and also there is even more to it than just the raw processing power, like the cache interfaces, and omg.. the memory interfaces, <3 2900XT/pro and 4870 for having a 512bit ring bus combined with a direct bus for low latency, honestly, p54C? they must plan on useing DDR 400mhz.. you think they mighta revamped some things?
by: TheGuruStud;872261
Since when is a general purpose cpu going to be able to process graphics at a respectable rate?
If that was the case, everyone with a quad core would be getting 50 FPS in 3dmark with the cpu test (I don't care if it has high speed ram and cache attached or not). I'm calling intel retarded, again.
edit: Or it's more fud. Like that 10 GHz pentium 4 they just had laying around :laugh:
OOPS i mean sims at 60mhz :? wow i was a whole 2 generations off.
by: Error 404;872959
@ TheGuruStud: I went from a Pentium 90 to a Celeron-400! :p
I've still got you beat :) After the 233 I got a celeron 366 and Oced to 550. The chip could do over 600, but my MB sucked.
Then I swapped it for a 600 pentium III, but ran it at stock. Piece of crap CPU just magically died one day. Then I built a new rig :) AMD 1.4 Thunderbird! And I've never looked back (upgraded to xp 2100, then a long wait until athlon 64 3500, x2 4200 and opteron 170).
Damn, way off topic. Don't hurt me.
by: Error 404;872959
Hey, would you be able to get a single one of the cores and then put it on a Pentium board? :D
No.
Hopefully they'll get smart and use Pentium Pro cores instead; 512 kb of L2 cache, MMX arch., and cooler name; what could go wrong?
Too big, too much heat, too much power and VERY little gain. Remember, these things are for crunching, not for executing long complex and branching code. MMX and SSEx are ditched in favour of specialised SIMD instructions. http://forums.techpowerup.com/showpost.php?p=872820&postcount=5
Also, imagine if you got a bunch of mobos with 4 PCI-E x16 slots (I'm pretty sure they exist), stuck these cards onto a whole bunch of them (along with a quad core something), and ran a Beowulf cluster? Say you had 8 motherboards, that's 4 cards per mobo, which is 32 cards, which is 64 TFLOPS!! :eek:
You won't need a PCIe x16 slot for these. They will probably be on PCIe x1 or x4 slots; x16 is not needed. Remember, these things crunch... they don't need super high bandwidth for most applications. Think of a gigabit network. That bandwidth goes quite easily down an x1 slot. So you would have a gigabit of bandwidth, representing data that had been seriously crunched to produce.
With a Larrabee, it is a cluster, but, strictly, it is not a Beowulf cluster.
If you like home-made Beowulfs, go here: http://www.calvin.edu/~adams/research/microwulf/
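To put rough numbers on the x1-versus-gigabit comparison above, here is a minimal Python sketch. It assumes first-generation PCIe signalling (2.5 GT/s per lane with 8b/10b encoding) and ideal gigabit Ethernet with no protocol overhead; the function and constant names are made up for illustration.

```python
# Raw signalling rate per PCIe lane, in Mbit/s (assumption: gen 1 = 2.5 GT/s, gen 2 = 5 GT/s).
PCIE_RAW_MBIT_PER_LANE = {1: 2500, 2: 5000}

# Ideal gigabit Ethernet payload rate in MB/s, ignoring protocol overhead.
GIGABIT_ETHERNET_MB_S = 1000 / 8


def pcie_bandwidth_mb_s(gen, lanes):
    """Usable one-direction bandwidth in MB/s for a PCIe link."""
    raw_mbit = PCIE_RAW_MBIT_PER_LANE[gen] * lanes
    # 8b/10b encoding: only 8 of every 10 transferred bits carry data.
    usable_mbit = raw_mbit * 8 // 10
    return usable_mbit / 8


if __name__ == "__main__":
    for lanes in (1, 4, 16):
        print(f"PCIe 1.x x{lanes}: {pcie_bandwidth_mb_s(1, lanes):.0f} MB/s")
    print(f"Gigabit Ethernet: {GIGABIT_ETHERNET_MB_S:.0f} MB/s")
```

Under these assumptions a single gen-1 lane carries about twice what a gigabit link does, which is the point being made about x1 slots.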
http://www.intel.com/pressroom/archive/reference/IntelMulticore_factsheet.pdf
Larrabee Architecture for Visual Computing -- With plans for the first demonstrations later this year, the Larrabee architecture will be Intel's next step in evolving the visual computing platform. The Larrabee architecture includes a high-performance, wide SIMD vector processing unit (VPU) along with a new set of vector instructions including integer and floating point arithmetic, vector memory operations and conditional instructions. In addition, Larrabee includes a major new hardware coherent cache design enabling the many-core architecture. The architecture and instructions have been designed to deliver performance, energy efficiency and general purpose programmability to meet the demands of visual computing and other workloads that are inherently parallel in nature. Tools are critical to success and key Intel® Software Products will be enhanced to support the Larrabee architecture and enable unparalleled developer freedom. Industry APIs such as DirectX™ and OpenGL will be supported on Larrabee-based products.
Intel AVX: The next step in the Intel instruction set -- Gelsinger also discussed Intel AVX (Advanced Vector Extensions) which, when used by software programmers, will increase performance in floating point, media, and processor intensive software. AVX can also increase energy efficiency, and is backwards compatible to existing Intel processors. Key features include wider vectors, increasing from 128 bit to 256 bit wide, resulting in up to 2x peak FLOPs output. Enhanced data rearrangement, resulting in allowing data to be pulled more efficiently, and three operand, non-destructive syntax for a range of benefits. Intel will make the detailed specification public in early April at the Intel Developer Forum in Shanghai. The instructions will be implemented in the microarchitecture codenamed "Sandy Bridge" in the 2010 timeframe.
So, will Larrabee be adopting AVX?
by: lemonadesoda;873075
http://www.intel.com/pressroom/archive/reference/IntelMulticore_factsheet.pdf
So, will Larrabee be adopting AVX?
OK, this gives AMD and Nvidia time to send in working pieces for hybrid units.
And even more time for ray tracing, which apparently is made use of in the 4800 series cards. Intel's first shot at GPUs ended miserably roughly 10 - 15 years ago. I'm sure they've learned from their mistakes back then. I for one am interested in seeing how it performs, but in the time frame given, it won't be new and cutting edge. It's a rehash. From all the information given and linked, it seems a lot more complicated now than I originally thought it was.
by: jyoung75;872258
2 TFLOPS by Larrabee a year from now is nice, but I can get 2.4 TFLOPS from the Radeon 4870x2 a month from now. And the Radeon cards are already rumored to be ray tracing monsters (used for ray tracing HD scenes in Transformers) http://www.tgdaily.com/content/view/38145/135/.
Yup, and that was with 1GB 2900XTs, the extra branching logic on R770 should make for big gains.
by: eidairaman1
OK, this gives AMD and Nvidia time to send in working pieces for hybrid units.
I foresee nVidia integrating Via Nano or Cell cores and AMD/ATI using Thunderbirds or K6-2s.
I wonder if they will be implementing the old PowerVR tech that the Kyro series used against ATI and nVidia in the past. Hidden Surface Removal was a tech that I wished ATI and nVidia would actually steal! :) Sure ATI had their Z-buffer, and nVidia with their variant... but they simply were not as efficient as PowerVR. My Kyro2 only ran at 175MHz and it held its own fine against what ATI and nVidia had.
If this becomes something big, it will suck for nVidia and AMD... and for us computer tweakers.
bryan d
PowerVR is NEC/Panasonic, graphics for the Dreamcast were awesome.
Woah, since when is Intel planning on entering the video card industry with something more powerful than built-in GPUs?! And what's with the design?! You can't just stitch 32 Pentiums together and call it a GPU! nVidia and AMD are way ahead in graphics card design!
That's just a LOL of a GPU. AMD's 4870 X2 has ~2.4 TFLOPS and a TDP under 300W (250-270, I guess).
by: substance90;873399
Woah, since when is Intel planning on entering the video card industry with something more powerful than built-in GPUs?! And what's with the design?! You can't just stitch 32 Pentiums together and call it a GPU! nVidia and AMD are way ahead in graphics card design!
panchoman number 2 :laugh:
yes you can
Mountain House (CA) - Earlier today we learned that Intel is already heavily pitching its Larrabee technology to partners, but the technology foundation largely remains a mystery. German publication heise.de now provides more clues with a rather interesting note that Larrabee is built on Intel's (http://en.wikipedia.org/wiki/Intel_Corporation) nearly two-decade-old P5 architecture.
According to Heise author Andreas Stiller, possibly the most prominent person to cover computer hardware in Germany, Intel dipped into the bin of obsolete technology (Intel’s phrase for replaced technology) to come up with a technology base for the Larrabee cGPU. While attending Intel’s 40th anniversary briefing (Intel will celebrate its 40th birthday on July 18), Stiller apparently found out that the Larrabee cores will be built on the P54C core — which was the code-name for the second-gen, 600 nm Pentium chip.
The first Pentium core (P5, 800 nm, 60 and 66 MHz) was in development since 1989 and was introduced in 1993. The P54C was launched in 1994 with speeds up to 120 MHz, while the succeeding 350 nm P54CS reached 200 MHz. The P55C core (280 nm up to 233 MHz) followed in 1995 and was replaced with the Pentium II in 1997.
Stiller added that Larrabee will debut with 32 cores that "are likely" to be equipped with MMX extensions, which would mean that Larrabee will actually be based on a modified, 45 nm P54CS core. The cores will also support 64-bit. If you factor in that the MMX part was replaced with a 512-bit wide AVX (Advanced Vector Extensions) unit, Stiller comes up with a theoretical performance of 32 flop per clock per core, topping the 2 Tflop/sec. mark at a clock speed of 2 GHz.
If this is true, then Intel may be able to hit about twice the performance in single precision calculations as Nvidia and AMD achieve today. However, both Nvidia and AMD were able to double their floating point performance between 2007 and 2008 and we have reason to believe that once Larrabee is available, GPUs may be hitting 3 to 4 Tflop/sec. in single GPU configurations. AMD’s dual-GPU ATI Radeon 4870 X2 (clocked at 778 MHz) is estimated to hit 2.49 Tflop/sec. when it debuts within the next few weeks.
It looks like Intel should be aiming for at least 4 Tflop/sec. for the second half of 2009.
Source: Tom's Hardware
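The 2 Tflop/sec. estimate in the article above can be reproduced with back-of-the-envelope arithmetic. The sketch below assumes single-precision (32-bit) elements and counts a multiply-add as two operations per lane; neither assumption is stated in the article, and peak_flops is a hypothetical helper name.

```python
def peak_flops(cores, clock_hz, simd_bits, element_bits=32, flops_per_lane=2):
    """Theoretical peak floating point operations per second.

    Assumptions (not from the article): 32-bit elements, and each SIMD lane
    retires one multiply-add per clock, counted as 2 flop.
    """
    lanes = simd_bits // element_bits        # 512 / 32 = 16 lanes
    flops_per_clock = lanes * flops_per_lane  # 16 * 2 = 32 flop per clock per core
    return cores * flops_per_clock * clock_hz


# Figures quoted in the thread: 32 cores, 2 GHz, 512-bit SIMD unit.
larrabee = peak_flops(cores=32, clock_hz=2_000_000_000, simd_bits=512)

if __name__ == "__main__":
    print(f"Larrabee estimate: {larrabee / 1e12:.3f} Tflop/sec.")
```

With the quoted figures this works out to roughly 2.05 Tflop/sec., a theoretical ceiling rather than something real workloads would sustain.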
you screwed up the quote (forgot the slash at the last one) but thanks for that.
by: Mussels;876986
you screwed up the quote (forgot the slash at the last one) but thanks for that.
Dunno what you're referring to but thank you for thanking me, dude :)
