Multi Core PI @ LINPACK

Discussion in 'Overclocking & Cooling' started by ovidiutabla, Feb 9, 2013.

  1. ovidiutabla

    ovidiutabla

    Joined:
    Feb 21, 2008
    Messages:
    40 (0.02/day)
    Thanks Received:
    12
    I developed a multithreaded CPU benchmark that calculates the decimals of PI using the Bailey–Borwein–Plouffe formula. The benchmark uses a multithreaded algorithm written in C++ that provides excellent parallelism. Multi Core PI is written in Visual C++ using MFC and the Win32 API.
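
A minimal sketch of the idea, not the benchmark's actual source: the Bailey–Borwein–Plouffe series is a sum of independent terms, so the terms can be split across threads and the partial sums added at the end. The function names `bbp_term` and `bbp_pi_parallel` are made up for illustration, and plain `double` arithmetic is assumed, which only reaches about 15 decimals. Computing tens of thousands of decimals would need arbitrary-precision arithmetic instead.

```cpp
#include <cmath>
#include <thread>
#include <vector>

// One term of the Bailey-Borwein-Plouffe series for pi:
// 16^-k * (4/(8k+1) - 2/(8k+4) - 1/(8k+5) - 1/(8k+6))
double bbp_term(int k) {
    const double inner = 4.0 / (8 * k + 1) - 2.0 / (8 * k + 4)
                       - 1.0 / (8 * k + 5) - 1.0 / (8 * k + 6);
    return std::ldexp(inner, -4 * k);  // inner / 16^k
}

// Sum the first `terms` BBP terms using `num_threads` worker threads.
// Thread t handles k = t, t + num_threads, t + 2 * num_threads, ...
double bbp_pi_parallel(int terms, int num_threads) {
    std::vector<double> partial(num_threads, 0.0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&partial, t, terms, num_threads] {
            double sum = 0.0;
            for (int k = t; k < terms; k += num_threads) sum += bbp_term(k);
            partial[t] = sum;  // each thread writes only its own slot
        });
    }
    for (auto& w : workers) w.join();

    double total = 0.0;
    for (double p : partial) total += p;
    return total;
}
```

Calling `bbp_pi_parallel(16, 4)` sums 16 terms on 4 threads, which is already enough to exhaust the precision of a `double`. Because the threads share no state except their own result slot, no locking is needed.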

    How it works

    A slider lets you set the number of decimals of PI, from 10,000 to 100,000. The default is 80,000. Just hit the Run benchmark button to start benching your CPU.

    Submit to HWBOT

    First, press the Take Screenshot button. A screenshot and an XML datafile will be created. Attention! CPU-Z must be running!
    Second, follow the link provided in the dialog and submit your datafile to HWBOT.

    Supported operating systems

    Microsoft Windows XP / Server 2003
    Microsoft Windows Vista / 7
    Microsoft Windows 8 / Server 2012

    Download link

    http://www.pcgamingxtreme.ro/
     
    Last edited: Feb 9, 2013
  2. HammerON

    HammerON The Watchful Moderator Staff Member

    Joined:
    Mar 2, 2009
    Messages:
    6,491 (3.19/day)
    Thanks Received:
    3,593
    My results:
    [​IMG]

    100% 12 thread utilization:)
     
    Last edited: Feb 10, 2013
    Crunching for Team TPU
  3. Aquinus

    Aquinus Resident Wat-man

    Joined:
    Jan 28, 2012
    Messages:
    6,301 (6.50/day)
    Thanks Received:
    2,092
    Location:
    Concord, NH
    :wtf: I've explained this already multiple times and people seem too ignorant to listen and you're the last person I should need to explain this to.

    Disable hyper-threading and run it again, please. :)
     
  4. HammerON

    HammerON The Watchful Moderator Staff Member

    Joined:
    Mar 2, 2009
    Messages:
    6,491 (3.19/day)
    Thanks Received:
    3,593
    [​IMG]

    Wow - that was amazing:(
    My time was increased by almost 100%.... What else would I expect when disabling HT???
     
    Crunching for Team TPU
  5. Aquinus

    Aquinus Resident Wat-man

    Joined:
    Jan 28, 2012
    Messages:
    6,301 (6.50/day)
    Thanks Received:
    2,092
    Location:
    Concord, NH
    Well that confuses me even more. I disable HT on mine and my score goes from 18.5 to 19. :|

    HT should never result in 100% improvement. There aren't the resources available to let it scale like that. That should be more like a 15-30% drop in performance on average.

    Edit: I lied, that was Multi Core PRIME, not MC PI. They look exactly the same apart from the formula, so I didn't notice it right off the bat. My skepticism from PRIME worked its way over here. Either way, I disabled HT and now it runs slower by about 60%. That's a bit more normal. I'm less skeptical about this benchmark and more about the prime one (unless you're storing the output in a float or a double and not a fixed-point number, in which case the computer is chugging for nothing), since floating-point numbers are not exact and, as you go further out in decimals, the precision of each additional decimal decreases.
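
To put a number on the float/double concern, here is a small sketch (the helper names `format_fixed` and `common_prefix_len` are made up for illustration) that prints the `double` closest to pi with 20 decimals and counts how many leading characters agree with the true expansion. It says nothing about how the benchmark stores its digits, which the thread does not establish.

```cpp
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>

// Format x with a fixed number of decimal places.
std::string format_fixed(double x, int decimals) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*f", decimals, x);
    return buf;
}

// Number of leading characters two strings have in common.
std::size_t common_prefix_len(const std::string& a, const std::string& b) {
    std::size_t n = 0;
    while (n < a.size() && n < b.size() && a[n] == b[n]) ++n;
    return n;
}

// Decimal digits a double is guaranteed to preserve (15 for IEEE 754 binary64).
constexpr int kDoubleDigits = std::numeric_limits<double>::digits10;
```

Comparing `format_fixed(3.14159265358979323846, 20)` against the string `"3.14159265358979323846"` shows agreement only through the 15th decimal; everything after that is an artifact of the binary representation, so a plain `double` cannot hold more digits of pi no matter how long the CPU works.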

    4c w/ HT:
    [​IMG]

    4c w/o HT:
    [​IMG]

    Once again, is the output being verified? Can you do multiple runs per benchmark to make sure that every run's results are consistent? And once again, I would like the output so I can verify the benchmark's results and put my skepticism at ease. As it stands, something is happening on my rig and I don't know what it is or if it is right.
     

    Attached Files:

    • p8t.jpg (145.6 KB)
    • p4t.jpg (146.6 KB)
    Last edited: Feb 10, 2013
  6. uuuaaaaaa

    Joined:
    Sep 2, 2011
    Messages:
    180 (0.16/day)
    Thanks Received:
    31
    My Phenom II x6 is slow :wtf:
     

    Attached Files:

    • pHiI.png (150.4 KB)
    Last edited: Feb 10, 2013
  7. AphexDreamer

    AphexDreamer

    Joined:
    Jun 17, 2007
    Messages:
    7,112 (2.68/day)
    Thanks Received:
    916
    Location:
    C:\Program Files (x86)\Aphexdreamer\
    Faster than my FX6100 apparently...
    [​IMG]
     
  8. Aquinus

    Aquinus Resident Wat-man

    Joined:
    Jan 28, 2012
    Messages:
    6,301 (6.50/day)
    Thanks Received:
    2,092
    Location:
    Concord, NH
    It's because the FPU is getting used for this benchmark. Keep in mind that each module only has one FPU, so without FMA3 optimizations you're only going to see three cores' worth of performance out of it. However, if this used fixed point instead of floating point, it could use the integer cores, which are faster in general and perform significantly better on AMD's newer processors. Fixed point also offers a higher level of precision; floating point is inexact because of how decimal values are converted to and from base 2.
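
A small sketch of the fixed-point versus floating-point distinction being drawn here. The scale of 1/10000 and the function names are assumptions chosen for illustration; fixed point is exact only for values representable at the chosen scale, and this is not taken from the benchmark.

```cpp
#include <cstdint>

// Add 0.1 ten times in binary floating point.
// 0.1 has no exact base-2 representation, so rounding error accumulates.
double sum_tenths_double() {
    double s = 0.0;
    for (int i = 0; i < 10; ++i) s += 0.1;
    return s;
}

// The same sum in fixed point: values are stored as integer
// multiples of 1/10000, so only integer additions are performed.
constexpr std::int64_t kScale = 10000;

std::int64_t sum_tenths_fixed() {
    const std::int64_t tenth = kScale / 10;  // 0.1 is stored as 1000
    std::int64_t s = 0;
    for (int i = 0; i < 10; ++i) s += tenth;
    return s;  // compare against kScale, which represents 1.0
}
```

With IEEE 754 doubles, `sum_tenths_double()` does not compare equal to `1.0`, while `sum_tenths_fixed()` equals `kScale` exactly. The fixed-point version also runs entirely on integer instructions, which is the part relevant to the shared-FPU argument.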
     
  9. AphexDreamer

    AphexDreamer

    Joined:
    Jun 17, 2007
    Messages:
    7,112 (2.68/day)
    Thanks Received:
    916
    Location:
    C:\Program Files (x86)\Aphexdreamer\
    Which is why I had asked him if he would/could make a more FX-optimized benchmark, but he said it already is FX-optimized, as it was coded on an FX processor. http://www.techpowerup.com/forums/showpost.php?p=2842045&postcount=68
     
  10. uuuaaaaaa

    Joined:
    Sep 2, 2011
    Messages:
    180 (0.16/day)
    Thanks Received:
    31
  11. Bo$$

    Bo$$ Lab Extraordinaire

    Joined:
    May 7, 2009
    Messages:
    5,311 (2.70/day)
    Thanks Received:
    867
    Location:
    London, UK
    [​IMG]

    Maybe looks a little low here
     
  12. Melvis

    Melvis

    Joined:
    Mar 18, 2008
    Messages:
    3,582 (1.50/day)
    Thanks Received:
    525
    Location:
    Australia
    :rolleyes:
     

  13. CrackerJack

    CrackerJack

    Joined:
    Dec 13, 2007
    Messages:
    2,706 (1.09/day)
    Thanks Received:
    449
    Location:
    East TN
    [​IMG]
     
  14. Arctucas

    Arctucas

    Joined:
    Jul 14, 2006
    Messages:
    1,769 (0.59/day)
    Thanks Received:
    290
    [​IMG]
     
  15. LAN_deRf_HA

    LAN_deRf_HA

    Joined:
    Apr 4, 2008
    Messages:
    4,545 (1.92/day)
    Thanks Received:
    939
    [​IMG]
     
  16. lemonadesoda

    lemonadesoda

    Joined:
    Aug 30, 2006
    Messages:
    6,252 (2.12/day)
    Thanks Received:
    963
    Great x86 kernel 5.x compatible!

    I think a REALLY USEFUL statistic would be the time / cores / GHz so that we can see the "efficiency" of the FP core!
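
One possible reading of that statistic, sketched with a hypothetical helper: multiply the elapsed time by core count and clock speed to get "core-GHz-seconds", roughly the total clock cycles (in billions) spent across all cores. Lower means more work done per cycle. The exact formula intended in the post is not spelled out, so this is an assumption.

```cpp
// Core-GHz-seconds: elapsed time scaled by core count and clock speed.
// Approximates total billions of core cycles consumed by the run.
double core_ghz_seconds(double seconds, int cores, double ghz) {
    return seconds * cores * ghz;
}
```

For example, a 10-second run on 4 cores at 3.5 GHz and a 20-second run on 2 cores at 3.5 GHz both come out at 140, meaning the two chips did the same work per core per clock.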

    [​IMG]
     
  17. ovidiutabla

    ovidiutabla

    Joined:
    Feb 21, 2008
    Messages:
    40 (0.02/day)
    Thanks Received:
    12
    [​IMG]
     
  18. ovidiutabla

    ovidiutabla

    Joined:
    Feb 21, 2008
    Messages:
    40 (0.02/day)
    Thanks Received:
    12
    I have implemented encryption for the XML datafile! Now cheaters can't cheat anymore.

    Current version is 2.101

    Download link:

    http://www.pcgamingxtreme.ro/

    [​IMG]

    [​IMG]
     
  19. rickss69

    Joined:
    Aug 23, 2009
    Messages:
    2,431 (1.31/day)
    Thanks Received:
    604
    Location:
    Rockvale TN (Not Australia)
    [​IMG]
     
  20. rickss69

    Joined:
    Aug 23, 2009
    Messages:
    2,431 (1.31/day)
    Thanks Received:
    604
    Location:
    Rockvale TN (Not Australia)
    [​IMG]
     
  21. cadaveca

    cadaveca My name is Dave

    Joined:
    Apr 10, 2006
    Messages:
    13,947 (4.52/day)
    Thanks Received:
    7,069
    Location:
    Edmonton, Alberta
    PD emulates x87 entirely, hence the slowdown, IMHO. FPU doesn't matter when you aren't capable of running the instruction in the first place.
     
  22. ovidiutabla

    ovidiutabla

    Joined:
    Feb 21, 2008
    Messages:
    40 (0.02/day)
    Thanks Received:
    12
    I removed the slider.

    The default setting for the benchmark is 80,000 decimals. The goal is to submit to HWBOT, so we have to make sure that all users are benching at the same settings [80k decimals].

    Download Link:

    www.pcgamingxtreme.ro

    [​IMG]
     
    Last edited: Feb 15, 2013
  23. Aquinus

    Aquinus Resident Wat-man

    Joined:
    Jan 28, 2012
    Messages:
    6,301 (6.50/day)
    Thanks Received:
    2,092
    Location:
    Concord, NH
    Pardon me, I know what x87 is but I don't know what you mean when you say "PD", could you clarify?
    I agree but do we know that the benchmark isn't executing x87 instructions in the first place?

    Also floating point emulation is worse than just using floating point numbers to begin with. You really need the exact value if you want your result of pi to be at all accurate. As that decimal place goes further out you're going to start losing precision.
     
  24. ovidiutabla

    ovidiutabla

    Joined:
    Feb 21, 2008
    Messages:
    40 (0.02/day)
    Thanks Received:
    12
    The application is compiled with the Streaming SIMD Extensions 2 (/arch:SSE2) setting in order to replace x87 FPU instructions with SSE code.
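
For anyone who wants to check what a given build targets, here is a sketch that inspects the compiler's predefined macros. `_M_IX86_FP`, `_M_IX86` and `_M_X64` are MSVC macros, `__SSE2__`, `__i386__` and `__x86_64__` are the GCC/Clang equivalents; the function name `fp_codegen_target` is made up for illustration. It reports what the compiler was told to generate, not what the CPU executes internally.

```cpp
// Report which floating-point code generation the compiler targeted.
const char* fp_codegen_target() {
#if defined(_M_X64) || defined(__x86_64__)
    // 64-bit x86 always has SSE2, and compilers use it for scalar math.
    return "x86-64 (SSE2 is the baseline)";
#elif defined(__SSE2__) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
    // 32-bit build with /arch:SSE2 (MSVC) or -msse2 (GCC/Clang).
    return "32-bit x86 with SSE2 code generation";
#elif defined(_M_IX86) || defined(__i386__)
    // 32-bit build without SSE2: double math goes through x87.
    return "32-bit x86 without SSE2 (x87 FPU code)";
#else
    return "non-x86 target";
#endif
}
```

In a 32-bit MSVC build, compiling with /arch:SSE2 sets `_M_IX86_FP` to 2, so the second branch is taken.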
     
  25. Aquinus

    Aquinus Resident Wat-man

    Joined:
    Jan 28, 2012
    Messages:
    6,301 (6.50/day)
    Thanks Received:
    2,092
    Location:
    Concord, NH
    SSE still utilizes the FPU, but that answers part of my question. I'm still curious what Cadaveca meant by "PD" though.
     
