Larrabee 2.7x faster than GT200 in SGEMM

Discussion in 'Graphics Cards' started by KainXS, Dec 2, 2009.

  1. wakkierob

    wakkierob New Member

    Joined:
    Nov 29, 2009
    Messages:
    72 (0.04/day)
    Thanks Received:
    5
    I heard IBM made a graphics card for supercomputing that does above 2 TFLOPs.
    They could still use it to test for speed, because the higher the mark the faster it performs, right? But if they have techs, they will do their own testing on them.

    I wonder what a 5970 gets in your test.

    You can be a member of the Folding team and help them do their work faster: the more people running the software over the net, the quicker it goes at the other end, right? So I guess the people on it make up their super GPU lol :confused:

    I want a GPU for me and my gaming that is linked up the way folding for cancer research is for them; then I won't need to spend all that money, methinks, but the thought is nice..... :O
    Last edited: Dec 2, 2009
  2. KainXS

    KainXS

    Joined:
    Sep 25, 2007
    Messages:
    5,600 (2.24/day)
    Thanks Received:
    501
    lol
  3. phanbuey

    phanbuey

    Joined:
    Nov 13, 2007
    Messages:
    5,201 (2.12/day)
    Thanks Received:
    973
    Location:
    Miami
    [IMG]
  4. Jstn7477

    Jstn7477

    Joined:
    Aug 30, 2009
    Messages:
    3,797 (2.11/day)
    Thanks Received:
    1,493
    Location:
    Sarasota, Florida, USA
    Lulz.
    Crunching for Team TPU More than 25k PPD
  5. phanbuey

    phanbuey

    Joined:
    Nov 13, 2007
    Messages:
    5,201 (2.12/day)
    Thanks Received:
    973
    Location:
    Miami
    That makes me wonder... does the bench software change the speed at which the algorithm/process/subroutine is performed? I mean, those have to be some pretty awesome cores for 16 of them at 2GHz to outperform 240 shaders at 1.5GHz. Those engineers probably tweaked the hell out of that bench.


    Last I checked, MATLAB had an SGEMM bench, but I think it's CPU only (I really don't know). You may be able to find a CUDA one for Nvidia. IDK about ATI cards though.
    Last edited: Dec 2, 2009
  6. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.50/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
    If you read the article, 1TF was achieved with 32 highly overclocked cores, which is the maximum that Intel will have for now. 1TF is the maximum you will see for now, unless Fermi is faster, and if GTX285 is doing 425, IMO Fermi can destroy Larrabee in this test. Nvidia's own tests show 4x-5x over GT200; we'll see.

    Like I said above, 1TF was achieved with 32 cores, and GT200 doesn't really have 240 cores; it truly has 10 cores, each 24 shader "cores" wide. Larrabee's "cores" are somewhat more complicated to count, but for a simple comparison we could say that it has 32 "true" cores, 16 wide, so 32x16=512 "cores" compared to the 240 "cores" in GT200.
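The lane counting above can be sanity-checked with a minimal C sketch. The function names are made up for illustration, and the peak estimate assumes a simple model (lanes x flops per lane per cycle x clock) that is not taken from the article.

```c
/* Total SIMD lanes ("cores" in marketing terms) = true cores x lanes per core. */
unsigned total_lanes(unsigned cores, unsigned lanes_per_core)
{
    return cores * lanes_per_core;
}

/* Rough theoretical peak in GFLOPS under an assumed model:
 * every lane retires flops_per_lane_per_cycle each cycle at clock_ghz.
 * Real SGEMM throughput will be lower than this figure. */
double peak_gflops(unsigned lanes, double flops_per_lane_per_cycle, double clock_ghz)
{
    return (double)lanes * flops_per_lane_per_cycle * clock_ghz;
}
```

With the counts from the post, total_lanes(32, 16) gives 512 and total_lanes(10, 24) gives 240. As a purely hypothetical illustration, 512 lanes doing one multiply-add (2 flops) per cycle at 1.0 GHz would come out at 1024 GFLOPS peak, which is a theoretical number and not a measured one.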
    Last edited: Dec 2, 2009
    phanbuey says thanks.
  7. phanbuey

    phanbuey

    Joined:
    Nov 13, 2007
    Messages:
    5,201 (2.12/day)
    Thanks Received:
    973
    Location:
    Miami
    [IMG] [IMG]

    Ahhh, I see it now... I got confused since the shaders are always called "cores", while Intel's cores are blocks (in comparison), with 16 vector ALU(s) per core/block. Makes sense...

    Good article: http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3367&p=4
    Last edited: Dec 2, 2009
  8. mastrdrver

    mastrdrver

    Joined:
    Feb 24, 2009
    Messages:
    3,116 (1.57/day)
    Thanks Received:
    566
    How hard is coding for SGEMM? I ask since Intel is not known for its ability to write drivers.

    After living through Intel's last dedicated graphics experiment in the very late 90s, I have a hard time believing Larrabee will be anything for nVidia or ATI to worry about.
  9. KainXS

    KainXS

    Joined:
    Sep 25, 2007
    Messages:
    5,600 (2.24/day)
    Thanks Received:
    501
    Not that any of us will have one, since it's for the HPC market.
  10. wakkierob

    wakkierob New Member

    Joined:
    Nov 29, 2009
    Messages:
    72 (0.04/day)
    Thanks Received:
    5
    NVIDIA GeForce GTX 295
    Specifications and Features

    GPU:
    Fabrication Process: 55nm
    Processor Cores: 480
    ROP Units: 56
    Texture Filtering Units: 160
    Core Clock (MHz): 576 MHz
    Shader Clock (MHz): 1242 MHz
    Texture Filtering Rate: 92.2Giga Texels/s

    Memory:
    Memory Clock (MHz DDR): 1998 MHz
    Total Memory Config: 1792 MB
    Memory Interface Width: 448-bit per GPU
    Total Memory Bandwidth: 223.8GB/s

    Surely this card would compare? Is it like Intel with the extreme CPU in performance, do you think?

    Or look at this below:

    The Sapphire HD 5970 OC 2GB comes equipped with a total of two RV 870 Cypress cores that have a total of 3200 Stream processors, delivering almost 5TFLOPs of processing power, making the HD 5970 the most powerful video card on the planet. Clock speeds come in a bit higher than the reference versions at 735MHz on the two Cypress cores and 1010MHz (4040MHz effective) on the 2GB of GDDR5 memory. Each core has 1GB of memory dedicated to it, running through a 512-bit bus (256x2). The specifications on paper look impressive, but that's not all; gone are the overclocking limits we have seen in the past as this card comes unlocked so you can throw the screws to it to gain some more FPS or distributed computing power. This card is designed to do some hardcore overclocking based on the construction. It features multiple Volterra voltage regulators, Japanese made pure ceramic SuperCapacitors, real time power monitoring and a programmable fan controller. The cores used are "low leakage" parts so you can get the best parts to push. With many HD 5870s hitting 1000MHz on the cores, overclocking should prove interesting.

    256x2 = 512, so it could perhaps contend, gulp slurp slurp, what about it!

    I mean, be real: in the real consumer market no one will have this type of GPU, so it is off topic. But maybe if ATI or Nvidia make a processor with the same power for less money, they will be forced to go cheaper, and then of course the industry will be forced to produce at a consumer level as well to compete.

    Over 4 TFLOPs on the 5970 sounds good to me, even if it would be reduced in testing.....?
    Last edited: Dec 2, 2009
  11. erocker

    erocker Super Moderator Staff Member

    Joined:
    Jul 19, 2006
    Messages:
    39,535 (13.47/day)
    Thanks Received:
    13,936
    I don't understand the question. What does any of this have to do with an "extreme CPU?"

    Larrabee is an entirely different architecture compared to an Nvidia card.

    What does a GTX 295 have to do with this thread?
  12. Lionheart

    Lionheart

    Joined:
    Apr 30, 2008
    Messages:
    4,024 (1.76/day)
    Thanks Received:
    795
    Location:
    Milky Way Galaxy
    I think he meant the Larrabee processor :rolleyes:
  13. KainXS

    KainXS

    Joined:
    Sep 25, 2007
    Messages:
    5,600 (2.24/day)
    Thanks Received:
    501
    Someone plz tell him the difference between theoretical and actual :shadedshu
  14. wakkierob

    wakkierob New Member

    Joined:
    Nov 29, 2009
    Messages:
    72 (0.04/day)
    Thanks Received:
    5
    In the first post the guy was comparing it to GT200 and HD graphics cards, and this is the graphics card section. Which part don't you understand? There's nothing about CPU extremes in this section!!!!

    I think someone is confused, but not me.

    Look:

    This is to do with how fast a processor dedicated to that task performs, which is to do with the graphical process of the speed of screen display, FPS (frames per second), OK!

    The General Matrix Multiply (GEMM) is a subroutine in the Basic Linear Algebra Subprograms (BLAS) which performs matrix multiplication, that is the multiplication of two matrices. This includes:

    SGEMM for single precision,
    DGEMM for double-precision,
    CGEMM for complex single precision, and
    ZGEMM for complex double precision
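For anyone wondering what SGEMM actually computes, here is a minimal sketch in C of the operation C = alpha*A*B + beta*C in single precision. It is a naive triple loop over row-major arrays with a made-up name (sgemm_naive) and a simplified argument list; the real BLAS routine also takes transpose flags and leading dimensions, and tuned implementations look nothing like this. The helper sgemm_gflops uses the usual 2*m*n*k flop count to turn a run time into a GFLOPS score.

```c
#include <stddef.h>

/* Naive single-precision GEMM: C = alpha * A * B + beta * C.
 * A is m x k, B is k x n, C is m x n, all stored row-major. */
void sgemm_naive(size_t m, size_t n, size_t k,
                 float alpha, const float *A, const float *B,
                 float beta, float *C)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; p++)
                acc += A[i * k + p] * B[p * n + j];
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}

/* Benchmark score: one multiply and one add per inner iteration,
 * so 2*m*n*k floating point operations in total. */
double sgemm_gflops(size_t m, size_t n, size_t k, double seconds)
{
    return 2.0 * (double)m * (double)n * (double)k / seconds / 1e9;
}
```

A benchmark like the one in the article times a call of this kind on large matrices and reports the resulting GFLOPS, which is why it measures raw compute throughput and not frames per second.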

    So you're telling me all these experts in graphics benching and programming are basing their results on theory and not on a practical basis, and that this one test is the only one that gives true results for the processor you describe, right?

    So you're not just telling me that, but also all the people who bench their hardware with 3DMark and similar benchmarks, right? It just sounds like Intel is trying to get money from you to bench graphics processors their way.

    A Larrabee video card will be tested with 3DMark and can then be compared the same way, or there is no argument. I'm not saying Larrabee wouldn't be a monster if it were used for gaming, but prove it with 3DMark, or the cards are not doing the same job, right? So why compare it to graphics cards that are not used for these tasks?

    What I mean is, why compare it to the consumer market cards if it will be used for something totally different!
    Last edited: Dec 3, 2009
  15. KainXS

    KainXS

    Joined:
    Sep 25, 2007
    Messages:
    5,600 (2.24/day)
    Thanks Received:
    501
    It was tested against the Tesla and FireStream, which are somewhat optimized for this type of work. The thing is that when you run 3DMark on a Tesla or a FireStream, they sometimes score a good bit lower than the GeForce and Radeon equivalents. I don't think this will be any different; it will probably be optimized only for single precision, not for double precision.
  16. FordGT90Concept

    FordGT90Concept "I go fast!1!11!1!"

    Joined:
    Oct 13, 2008
    Messages:
    13,350 (6.30/day)
    Thanks Received:
    3,364
    Location:
    IA, USA
    It's no different than GeForce/CUDA or Radeon/Stream. Larrabee is Larrabee: it is designed to fulfill both roles from the start. It isn't something thought of ten years after the fact and glued on. I think you underestimate the power of Larrabee in both the high performance computing and graphics segments. If one card can't do both, then GeForce/Radeon must be vaporware as well. :confused:


    All graphics cards are optimized for single precision because that is almost exclusively what computing uses (games, research, and otherwise). Double precision is twice the size and takes more than twice as long to compute compared to single. When time is money, single precision is preferred.

    That doesn't mean GeForce, Radeon, and Larrabee can't do double, because they sure can. It just isn't worth the performance penalty in most cases.
    Last edited: Dec 3, 2009
    Crunching for Team TPU
  17. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.50/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
    Double precision is actually very common in research. 32 bits is nowhere near as precise as required in many scenarios, especially when working with numbers close to 0 or with equations involving infinities (for the same reason: not enough precision near 0).
    phanbuey says thanks.
  18. phanbuey

    phanbuey

    Joined:
    Nov 13, 2007
    Messages:
    5,201 (2.12/day)
    Thanks Received:
    973
    Location:
    Miami
    No, I understand that, but at the moment they are advertising it to the HPC segment. Larrabee is a response to CUDA and OpenCL; it's definitely not because Intel wants to break into the gamer market. They're not coming out with 3DMark scores.

    Intel is coming from the other end of the spectrum... GeForce and Radeon are graphics cards that can compute, Larrabee is a compute card that can do graphics. Its primary function IMO is to stop CUDA and OpenCL from cutting into Intel's crunching pie (and a substantial pie it is).
  19. FordGT90Concept

    FordGT90Concept "I go fast!1!11!1!"

    Joined:
    Oct 13, 2008
    Messages:
    13,350 (6.30/day)
    Thanks Received:
    3,364
    Location:
    IA, USA
    Double precision allows more decimal places but, if you know the scale of the numbers you are working with, those extra decimal places are moot. In the end, they usually use a single precision float coupled with a scale for a set of numbers. Multiply the float by the scale and you've got yourself performance and accuracy.

    As to infinities, single (0x7f800000) and double (positive: 0x7ff0000000000000, negative: 0xfff0000000000000) have a value set aside which is flagged as "infinite."
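The bit patterns quoted above can be checked with a small C sketch, assuming a platform with IEEE 754 floats (which covers essentially every desktop CPU and GPU). The helper names float_bits and double_bits are made up here; the single precision negative infinity, 0xff800000, is not listed in the post and is added for completeness.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit encoding of a float (memcpy avoids aliasing issues). */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Return the raw 64-bit encoding of a double. */
uint64_t double_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}
```

Calling float_bits(INFINITY) gives 0x7f800000 and double_bits((double)INFINITY) gives 0x7ff0000000000000; the negative versions only differ in the top (sign) bit.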


    The video card came before the high performance computing aspects of it. There is more money in discrete video cards than in cards to bolster CPU performance.

    When Intel gathered information that showed a series of x86 CPUs (that's what Intel is all about, after all) could rival the performance of a modern GPU offered by NVIDIA and AMD, the idea of Larrabee was born. From that idea came the fact that it is x86 and programmers would easily be able to use it, so the GPU idea became a GPGPU idea. As proof of this, note how little information has been released about Larrabee's GPU performance. Intel knows people will buy Larrabee as a GPGPU card just because it has the Intel brand on it. On the other hand, Intel knows they have to topple two corporations that have been in the segment for well over a decade. When a product, like Core 2, is quiet until release, the media frenzy sparked by a new, dominant product sells itself. That is most likely the same strategy Intel is relying on to sell Larrabee. It also explains why they are so tight-lipped about its performance as a GPU.
    Last edited: Dec 3, 2009
    Crunching for Team TPU
  20. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.50/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
    You didn't understand me. I'm not talking about being able to represent those numbers. I'm saying that in the application you could get results that are very, very close to zero, but not zero, and because the number can't be represented it will be "rounded" to the closest number, which could be either zero or a number that is orders of magnitude bigger than the true one. If that number (the true one you are looking for) was then multiplied by a very large number, the results would differ greatly. You may expect something in the hundreds and get millions in return, or zero if it was rounded to zero, or simply an error if you are multiplying by infinity (zero x infinity = undetermined), while you should get infinity as the result (something finite x infinity = infinity).

    Example: Imagine that you have two complex formulas and the result of one is (should be) something like x=0.000000000000000000000000000000001 (decimal), while the other result is y=100,000,000,000,000,000,000,000,000,000. Don't pay attention to the number of zeroes, it's just an example and I didn't count them myself. Imagine that the result x*y should be 100, but x can't be represented in 32 bits and the closest representable number is either zero or 0.0000000000011 (whatever). In either case you are not getting anything close to what you need, and these cases occur a lot in physics, astronomy, probably in genetics too...
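The example above can be reproduced with a minimal C sketch. The function names and the concrete values are picked only for illustration (they are not the numbers from the post): the small result is a*b with a = b = 1e-25, so the true x is 1e-50, and it is then multiplied by 1e26 twice so the true answer is 100. The volatile temporaries are there to force every intermediate to be rounded to float.

```c
/* Single precision: x = a*b underflows to zero when |a*b| is below
 * roughly 1.4e-45, and everything multiplied by it afterwards is zero too. */
float product_single(float a, float b, float y1, float y2)
{
    volatile float x = a * b;
    volatile float r = x * y1;
    r = r * y2;
    return r;
}

/* Double precision: the same chain keeps x = a*b representable
 * (down to roughly 1e-308), so the final result survives. */
double product_double(double a, double b, double y1, double y2)
{
    double x = a * b;
    return x * y1 * y2;
}
```

With a = b = 1e-25 and y1 = y2 = 1e26, product_single returns exactly 0 while product_double returns a value within rounding error of 100.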
  21. FordGT90Concept

    FordGT90Concept "I go fast!1!11!1!"

    Joined:
    Oct 13, 2008
    Messages:
    13,350 (6.30/day)
    Thanks Received:
    3,364
    Location:
    IA, USA
    You didn't understand me either. Multiply that value by 1,000,000,000 before dividing and it is no longer close to zero.
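A minimal C sketch of that pre-scaling idea, with made-up function names and illustrative values: squaring 1e-25 in single precision underflows to zero, but multiplying by a known scale first keeps the intermediate in range. The catch is that the caller has to carry the scale along (here the result is too large by scale squared) and has to know a suitable scale in advance.

```c
/* Plain float square: underflows to 0 for very small inputs. */
float naive_square(float a)
{
    volatile float r = a * a;
    return r;
}

/* Scale the input up before squaring. The returned value equals
 * (a * scale)^2, i.e. the true square times scale^2, so the caller
 * must track that factor separately. */
float scaled_square(float a, float scale)
{
    volatile float s = a * scale;
    volatile float r = s * s;
    return r;
}
```

For a = 1e-25 and scale = 1e9, naive_square gives 0 while scaled_square gives about 1e-32, which stands for the true 1e-50 once the factor of 1e18 is accounted for.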
    Crunching for Team TPU
  22. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.50/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
    Oftentimes you can't. For example, when working with particles moving at speeds close to light speed*, but there are many other cases.

    *Because speed is going to be very high and any movement or time elapsed in any event is going to be very very small.
    Last edited: Dec 3, 2009
  23. FordGT90Concept

    FordGT90Concept "I go fast!1!11!1!"

    Joined:
    Oct 13, 2008
    Messages:
    13,350 (6.30/day)
    Thanks Received:
    3,364
    Location:
    IA, USA
    I updated an old program of mine which basically does the following...
    1) Creates one thread per core; all each thread does is increment a counter.
    2) Once all threads are created, it resets all counters to zero.
    3) It waits for one second, grabbing and resetting the counters.
    4) After it has 10 results, it adds them together to get a cumulative value.


    The cumulative value is basically the same as taking the power of all the cores and adding them together (literally).

    The results were surprising on my Core i7 920:
    Code:
    UInt32: 2045845947.75
     Float: 633960903.875
    UInt64: 2602888497.625
    Double: 1673840629.125
    Important note: a float can only be incremented by 1 up to 16,777,216 (2^24), because its 24-bit significand holds only about 7 significant decimal digits (read here for an explanation). Past that point, adding 1 no longer changes the value, so I had to scale it, as I suggested, to keep the counter from stalling. Here is a comparison of the functional code:

    Code:
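    Double:
            // _Counter is (assumed to be) a double field; incrementing by 1
            // stays exact far beyond what this loop reaches in one second.
            private void Looper()
            {
                while (true)
                    _Counter++;
            }

    Single:
            // _Counter is (assumed to be) a float field; it is reset when it
            // hits MAX, before adding 1 would stop changing it, and
            // _Multiplier counts how many times that happened.
            private void Looper()
            {
                while (true)
                {
                    if (_Counter == MAX)
                    {
                        ResetCounter();
                        _Multiplier++;
                    }

                    _Counter++;
                }
            }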
    That if statement in single apparently incurs a rather large performance penalty (much larger than anticipated).

    Moreover, I was shocked to see UInt32 and UInt64 so close.
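The float limit mentioned above can be checked directly with a minimal C sketch (the function name is made up for illustration). It keeps adding 1 to a float until the addition no longer changes the value and returns the value it got stuck on, which is 16,777,216 (2^24) on any IEEE 754 platform with the default rounding mode.

```c
/* Increment a float by 1 starting from `start` until f + 1 == f,
 * then return the value where the counter stalled.
 * volatile forces each sum to be rounded to single precision. */
float first_stalled_value(float start)
{
    volatile float f = start;
    for (;;) {
        volatile float next = f + 1.0f;
        if (next == f)
            return f;
        f = next;
    }
}
```

Starting from 0 takes about 16.7 million iterations; starting from something like 16,000,000 gets to the same answer faster. A double counter does not hit this wall until 2^53, which is why the double loop needs no reset logic.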


    Anyway, here are the full results on my Core i7 920:
    Code:
    UInt scores...
    0	856077121	855505838	879774890	881755779	857463969	858833899	863758164	860493813
    1	1289681239	1289137188	1317759063	1320227048	1284711952	1285837496	1290960177	1287496641
    2	1718130750	1717533109	1760402478	1763469884	1716494840	428268464	1722603742	1715762899
    3	2145194573	2144265190	2196212534	2199489935	2140143894	849644796	2146118492	2137410750
    4	2572067760	2571340259	2639460123	2642700823	2567865980	1277313214	2573729491	2565218438
    5	3005515809	3004783299	3070568998	3074128809	2995721330	1705708878	3001469326	2993683277
    6	3437849620	3437074513	3508163167	3512091151	3422920242	2133772766	3428523633	3421769317
    7	3866929158	3866135509	3941008415	3945202232	3846848411	2565943668	3852289332	3853869102
    8	1298918	514536	75958995	80343916	4276316131	2999488674	4281647479	4287358130
    9	434706620	433895585	515246174	519847659	409302670	431268809	414545526	423649389
    
    UInt averages:
      [0]: 1932745156
      [1]: 1932018502
      [2]: 1990455483
      [3]: 1993925723
      [4]: 2351778941
      [5]: 1453608066
      [6]: 2357564536
      [7]: 2354671175
      [C]: 2045845947.75
    
    Float scores...
    0	467767013	471028154	470236515	470953884	471272208	467856302	470949616	466040867
    1	685767544	703940869	702847933	703758470	704361679	685942293	704005225	685951321
    2	903872881	936642995	935772512	936739232	937534238	904166300	937221932	904239159
    3	1121942528	1169521375	1168826497	1169903846	1170521728	1121777882	1170157453	1121841473
    4	1339682160	1402025480	1401815810	1403003370	1403576819	1340414431	1403142313	1340380991
    5	1558339844	1635052815	1634663136	1635944458	1636628036	1558528177	1636113072	1558557159
    6	1776514716	1868112676	1867570734	1868976335	1869531678	1776620293	1868951994	1776584107
    7	1994468985	2101113447	2100536195	2102048536	2102477293	1994766365	2101726272	1994791111
    8	-2082677333	-1961157796	-1961554886	-1959944080	-1959433371	-2082111967	-1960116536	-2082076823
    9	-1864264490	-1728216535	-1728646239	-1726921589	-1726566211	-1864105228	-1727222451	-1864132847
    
    Float averages:
      [0]: 590141384
      [1]: 659806348
      [2]: 659206820
      [3]: 660446246
      [4]: 660990409
      [5]: 590385484
      [6]: 660492889
      [7]: 590217651
      [C]: 633960903.875
    
    ULong scores...
    0	904870023	882782412	906855091	908835300	877603067	882806692	904901719	865663100
    1	433200245	422532902	1349360475	1351520900	1304095077	1305442712	1338295360	1292128885
    2	876886430	428063128	1790730151	1793068528	1732311369	1733579946	1782115967	1720194433
    3	1315434546	860384568	2229662721	2232480987	2163520356	2165920628	2220706211	2151294303
    4	1758937851	1294188788	2667766571	2671045230	2594642959	2599757789	2664369712	2582205782
    5	2201983567	1722958985	3100982860	3104448338	3031016719	3028581663	3107573372	3018551694
    6	2641764439	2155887417	3543216734	3546861108	3454599717	3461460181	3547480431	3441984499
    7	3078203638	2583190533	3982562676	3986694814	3890422782	3888828410	3983996800	3877703893
    8	3519860475	3013344010	4416156238	4420778929	4308030439	4318989235	4425771068	4295077269
    9	3956195612	3444488415	4851035036	4855913459	4732437643	4750175911	4862239461	4719468464
    
    ULong averages:
      [0]: 2068733682
      [1]: 1680782115
      [2]: 2883832855
      [3]: 2887164759
      [4]: 2808868012
      [5]: 2813554316
      [6]: 2883745010
      [7]: 2796427232
      [C]: 2602888497.625
    
    Double scores...
    0	546644896	546683231	543515961	544158253	547323680	546130210	547258254	545931722
    1	809585260	809687095	805208961	805995873	811062191	810660798	810996610	810332046
    2	1074791955	1074965004	1067698283	1068743733	1077134825	1078481364	1077081969	1078109357
    3	1337312671	1337587886	1330210724	1331437563	1340576512	1341548515	1340501781	1341095981
    4	1602081634	1602389372	1591677705	1593045224	1604059475	1605942770	1603989626	1605418689
    5	1868544351	1868925841	1855200968	1856709547	1870718432	1867856175	1870637400	1867194243
    6	2133873141	2134324390	2118975367	2120662494	2134172626	2132539727	2134065417	2131862886
    7	2401144388	2401656582	2383060346	2385070285	2397502809	2392659574	2397330467	2391848656
    8	265088745	2666758896	2647202755	2649279766	2661619439	2658193973	2661864083	2657280768
    9	527310383	2929093301	2910513597	2912740416	2925074924	2922969034	2926830942	2921835575
    
    Double averages:
      [0]: 1256637742
      [1]: 1737207159
      [2]: 1725326466
      [3]: 1726784315
      [4]: 1736924491
      [5]: 1735698214
      [6]: 1737055654
      [7]: 1735090992
      [C]: 1673840629.125


    My server (2 x Xeon 5310) yielded completely different results:
    Code:
    UInt32: 1527420409.875
     Float: 1047687385.375
    UInt64: 1475289090.625
    Double: 1016090759.625
    Note how, even with the if statement in the float, 32-bit still wins.

    Here are the full results on my 2 x Xeon 5310:
    Code:
    UInt scores...
    0	543384393	543013687	544768497	544519909	544522800	544533281	544285818	540900365
    1	265650138	808439622	810599932	810343134	810381446	810342297	810108109	806017030
    2	264692667	1073574076	1076314643	1075927570	1075997201	1075977705	1075827588	1069963881
    3	530064684	1339090388	1342110108	1341699536	1341336483	1341853665	1341694542	1334987233
    4	795179857	1604107089	1607797296	1607440210	1606794662	1607620876	1607247665	264819565
    5	1060658514	1869491878	1873701411	1873226779	1872575166	1873417399	1872971272	529775080
    6	1326004786	2134697430	2139486023	2138957826	2138117993	2139302568	2138762528	793030071
    7	1591253901	2400293424	2405382066	2404846882	2403992346	2405201714	2404650333	1057502201
    8	1856887154	2665726512	2671276528	2670731516	2669807772	2671038404	2670556649	1321704298
    9	265601922	2931152521	2937115627	2936649573	2935610876	2936959538	2936398432	1585186267
    
    UInt averages:
      [0]: 849937801
      [1]: 1736958662
      [2]: 1740855213
      [3]: 1740434293
      [4]: 1739913674
      [5]: 1740624744
      [6]: 1740250293
      [7]: 930388599
      [C]: 1527420409.875
    
    Float scores...
    0	343159162	348036098	348540105	348510997	344074918	348580163	348593099	343930780
    1	493878096	506850229	494871034	507867843	503400554	507959625	494931936	494508796
    2	645115541	665736598	645932044	662622949	662711755	667290541	654296194	645690071
    3	796439448	813419905	805392946	822098183	822147736	826821011	813779126	796855491
    4	947463893	972732429	964932856	981607238	981608411	986315905	973312846	947934527
    5	1098717332	1115652728	1124488142	1141161533	1141161458	1145867083	1132859483	1098958231
    6	1247419224	1274969377	1284010215	1300652748	1300701635	1305401876	1292404029	1245271473
    7	1393086009	1434256180	1434563100	1460152517	1460077137	1464874327	1451900351	1397763476
    8	1551435123	1593553813	1594091051	1619692270	1619603894	1624393784	1611428208	1551815319
    9	1702620579	1752915804	1753372528	1779228928	1779155488	1783926200	1770975219	1700461910
    
    Float averages:
      [0]: 1021933440
      [1]: 1047812316
      [2]: 1045019402
      [3]: 1062359520
      [4]: 1061464298
      [5]: 1066143051
      [6]: 1054448049
      [7]: 1022319007
      [C]: 1047687385.375
    
    ULong scores...
    0	582261138	581047369	582201357	582178112	582112193	582042596	579340111	579264767
    1	847929521	845968120	847962374	848036468	847758580	847849943	844069137	265533571
    2	1113832602	1111554441	1113869036	1113930292	1113661430	1113749831	1109327346	265696998
    3	1379749130	1377163737	1379780089	1379832068	1379571718	1379663761	265133364	531503321
    4	1645650030	1642776586	1645698454	1645742496	1645497019	1645562389	530114885	265504129
    5	1911528491	1908369163	1911585160	1911643950	1911401632	1911470279	795319950	531122548
    6	2177435194	2173983488	2177495196	2177559135	2177292264	2177379427	1060624675	265375346
    7	2443334400	2439575454	2443405099	2443468643	2443182697	2443290716	1325862209	530786524
    8	2709260475	2705192627	2709328450	2709394782	2709099442	2709214998	265404749	796515357
    9	2975155122	2970805059	2975239479	2975292551	2975013273	2975121275	264994776	265482720
    
    ULong averages:
      [0]: 1778613610
      [1]: 1775643604
      [2]: 1778656469
      [3]: 1778707849
      [4]: 1778459024
      [5]: 1778534521
      [6]: 704019120
      [7]: 429678528
      [C]: 1475289090.625
    
    Double scores...
    0	177056079	384897913	385405194	385422378	385352554	385380109	385349414	383566603
    1	354196979	561962556	562680188	562697606	562628588	562654416	562621391	560458831
    2	531241588	739032755	739949488	739916338	739873685	739930890	739887326	176653405
    3	176955390	916104858	917214970	917184176	917120035	917210228	917144454	353309889
    4	177164621	1093179967	1094478625	1094457442	1094381951	1094489572	1094425550	530191061
    5	177065725	1270238240	1271733092	1271715876	1271425179	1271757900	1271677689	706989162
    6	354206668	1447301565	1449000475	1448974223	1448582450	1449029732	1448951100	883843788
    7	531395716	1624378328	1626278472	1626252479	1625853277	1626307191	1626228075	1060740857
    8	176980301	1801452031	1803553657	1803522520	1803109961	1803578911	1803502098	1237397120
    9	354129914	1978528431	1980837375	1980803511	1980374321	1980852147	1980780394	1414101792
    
    Double averages:
      [0]: 301039298
      [1]: 1181707664
      [2]: 1183113153
      [3]: 1183094654
      [4]: 1182870200
      [5]: 1183119109
      [6]: 1183056749
      [7]: 730725250
      [C]: 1016090759.625

    Core i7 apparently doesn't take to the if statement as well as the Core 2 based processors do. The Core i7 has a very weak showing in 32-bit float, so my conclusion is that it boils down to the hardware...


    Single precision floats are the norm for stressing a computer, so I really see no problem with it. All things being the same, single should be equal to, or faster than, double precision.

    Seeing as Larrabee is using a modernized Pentium (P54C) core, it is hard to predict how its double precision performance compares to single.
    Last edited: Dec 4, 2009
    Crunching for Team TPU
  24. phanbuey

    phanbuey

    Joined:
    Nov 13, 2007
    Messages:
    5,201 (2.12/day)
    Thanks Received:
    973
    Location:
    Miami
    It's a possibility, but I really don't think that this is about graphics anymore.

    http://www.datacenterknowledge.com/archives/2009/10/05/nvidias-fermi-gpu-targets-the-hpc-market/

    Even Nvidia is openly admitting that the GPGPU and HPC markets are the primary targets for Fermi.

    And in the first presentation of Larrabee, Intel talked about the evolution of computing, not mentioning the 'evolution of graphics'.

    [IMG]

    This is the "supercomputing for the masses" movement, and it represents a different mentality altogether, one in which the CPU is nothing more than a glorified scheduler and the GPU does all the heavy lifting. This is the gist of what I get from all the GPGPU hype, and it seems like Intel is positioning Larrabee as such.
  25. FordGT90Concept

    FordGT90Concept "I go fast!1!11!1!"

    Joined:
    Oct 13, 2008
    Messages:
    13,350 (6.30/day)
    Thanks Received:
    3,364
    Location:
    IA, USA
    At this point, it is like asking what came first: the chicken or the egg. I'm not convinced more than 1% of the market cares about general purpose computing beyond the capabilities of the CPU. People always want bigger screens, and bigger screens mean higher resolutions, and higher resolutions mean better graphics cards. We'll see what the driving force behind the market for discrete cards is in a few years' time.
    Crunching for Team TPU
