
Multi Core PI @ LINPACK

Joined
Jul 14, 2006
Messages
2,405 (0.37/day)
Location
People's Republic of America
System Name It's just a computer
Processor i9-9900K Direct Die
Motherboard eVGA Z390 Dark
Cooling Dual D5T Vario, XSPC BayRes, Nemesis GTR560, NF-A14-iPPC3000PWM, NF-A14-iPPC2000, HK IV Pro Nickel
Memory G.Skill F4-4500C19D-16GTZKKE or G.Skill F4-3600C16D-16GTZ or G.Skill F4-4000C19D-32GTZSW
Video Card(s) eVGA RTX2080 FTW3 Ultra
Storage Samsung 960 EVO M.2
Display(s) LG 32GK650F
Case Thermaltake Xaser VI
Audio Device(s) Auzentech X-Meridian 7.1 2G/Z-5500
Power Supply Seasonic Prime PX-1300
Mouse Logitech
Keyboard Logitech
Software Win7 Ultimate x64 SP1
It's going to take a very, very long time to complete with 360,000 decimals. It will complete eventually, just leave the benchmark running. It's exponential complexity. For 10k decimals it takes 0 sec 800 ms, for 20k decimals 2 sec 900 ms... and for 80k decimals 54 sec. CPU: i5 3330 @ 3 GHz, 4 cores.

I got 28 seconds, 30 ms for 80K on the previous version.

What would you estimate my time should be for 360K on the new version?

I let it run for approximately 10 minutes with no result.


EDIT:

I feel rather sheepish; I should have let it run a few more seconds rather than being impatient:

 

newtekie1

Semi-Retired Folder
Joined
Nov 22, 2005
Messages
28,472 (4.25/day)
Location
Indiana, USA
Processor Intel Core i7 10850K@5.2GHz
Motherboard AsRock Z470 Taichi
Cooling Corsair H115i Pro w/ Noctua NF-A14 Fans
Memory 32GB DDR4-3600
Video Card(s) RTX 2070 Super
Storage 500GB SX8200 Pro + 8TB with 1TB SSD Cache
Display(s) Acer Nitro VG280K 4K 28"
Case Fractal Design Define S
Audio Device(s) Onboard is good enough for me
Power Supply eVGA SuperNOVA 1000w G3
Software Windows 10 Pro x64
I got 28 seconds, 30 ms for 80K on the previous version.

What would you estimate my time should be for 360K on the new version?

I let it run for approximately 10 minutes with no result.

Just by my quick math based on the times I'm getting as I increase, you're looking at over an hour to complete 360,000 decimal places.
 
Joined
Mar 8, 2010
Messages
526 (0.10/day)
System Name Gamer - Bencher
Processor i7 5960X @5.1 GHz - load temps -5 C
Motherboard Rampage V Extreme
Cooling LD PC-V2 Phase Change - White XL Suction
Memory 16GB G.Skill Ripjaws 4 3200 MHz CL14-14-15-25 1t
Video Card(s) Titan X SLI
Storage 2x 180GB Intel 330
Display(s) Asus Swift PG278Q G-Sync
Case Lian Li PC343B-XT
Audio Device(s) Onboard
Power Supply Antec 1200W TPQ + Corsair AX1200i
Software Win 7 64 bits + Win 8.1 64 bit
MultiCorePIScreenShot.jpg


Had to try this one :)

ok for a 24/7 summer OC :toast:
 
Joined
Feb 21, 2008
Messages
40 (0.01/day)
Just by my quick math based on the times I'm getting as I increase, you're looking at over an hour to complete 360,000 decimal places.

Something like that. Just leave the benchmark running...
 
Joined
Jul 14, 2006
Messages
2,405 (0.37/day)
So, is something wrong with result above?
 
Joined
Mar 8, 2010
Messages
526 (0.10/day)
Tested 360,000 decimals with HT (Hyper-Threading) enabled.

mcpi.jpg
 

newtekie1

Semi-Retired Folder
Joined
Nov 22, 2005
Messages
28,472 (4.25/day)
So, is something wrong with result above?

Apparently not, because it took my X6 about 20 minutes to finish.

I guess it doesn't scale exactly exponentially like I thought.
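For anyone who wants to sanity-check the scaling, here is a rough sketch that fits a power law t = c * n^p to the three i5 3330 timings quoted at the start of the thread (0.8 s, 2.9 s and 54 s). The function names are made up for illustration, and three points from one machine are very thin evidence, so treat the result as a ballpark only. On those numbers the fitted exponent comes out close to 2, which puts 360,000 decimals on the order of twenty minutes for that CPU rather than hours.

```python
import math

# Timings quoted earlier in the thread for an i5 3330: (decimals, seconds).
TIMINGS = [(10_000, 0.8), (20_000, 2.9), (80_000, 54.0)]


def fit_power_law(points):
    """Least-squares fit of log(t) = log(c) + p * log(n); returns (c, p)."""
    xs = [math.log(n) for n, _ in points]
    ys = [math.log(t) for _, t in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                      # slope on the log-log plot = exponent
    c = math.exp(mean_y - p * mean_x)  # intercept gives the constant factor
    return c, p


def predict_seconds(points, decimals):
    """Extrapolate the fitted power law to a new decimal count."""
    c, p = fit_power_law(points)
    return c * decimals ** p


if __name__ == "__main__":
    c, p = fit_power_law(TIMINGS)
    print(f"fitted exponent: {p:.2f}")
    print(f"estimate for 360k decimals: {predict_seconds(TIMINGS, 360_000) / 60:.1f} min")
```

Swap in your own (decimals, seconds) pairs to get an estimate for your CPU; the more points you feed it, the less it depends on any single noisy run.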
 

unclewebb

ThrottleStop & RealTemp Author
Joined
Jun 1, 2008
Messages
7,273 (1.26/day)
Thanks for the multi-threaded benchmark. :toast:

 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,147 (2.96/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
I did a couple tests with my 3820 and threw the results into an OpenOffice spreadsheet to make some graphs out of it. Enjoy if anyone cares. :)

It almost looks to me as if it completes in O(n log n) time, judging by how many decimals per second get calculated on average for a given decimal length. The increasing number of elements also adds a linear increase in time, though, so if I had to guess it feels like something closer to O(n + n log n) or O((n + n) log n). I'm not really up for getting more data and doing the math to confirm my hunch. That's also just for my 3820 with 4c/8t; I'm sure it scales differently on different hardware.
pi_per_second_avg.PNG

pi_time_to_calc.PNG
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,147 (2.96/day)
I feel that I should also note that crunching will get my CPU up to 72-74 °C, but even for 360k decimals my CPU barely broke 62 °C fully loaded with this. Just an observation, because crunching for the same amount of time makes that much more heat despite both applications loading the CPU to 100%.
 

newtekie1

Semi-Retired Folder
Joined
Nov 22, 2005
Messages
28,472 (4.25/day)
Crunching likely uses more areas of the CPU, different instruction sets, better use of the cache, etc., because crunching is designed to be as efficient as possible. This benchmark, by contrast, seems to be purposely inefficient, making the calculation take a lot longer than it should in order to get results that are more suited to a benchmark (several seconds instead of several ms).

Also, for the LOLs:

MultiCorePIScreenShot.jpg
 
Joined
Feb 21, 2008
Messages
40 (0.01/day)
Crunching likely uses more areas of the CPU, different instruction sets, better use of the cache, etc., because crunching is designed to be as efficient as possible. This benchmark, by contrast, seems to be purposely inefficient, making the calculation take a lot longer than it should in order to get results that are more suited to a benchmark (several seconds instead of several ms).

Also, for the LOLs:

http://www.techpowerup.com/forums/attachment.php?attachmentid=51011&stc=1&d=1367677366

The benchmark is using a very complex formula to calculate decimals of PI.

Bailey–Borwein–Plouffe formula

The Bailey–Borwein–Plouffe formula (BBP formula) provides a spigot algorithm for the computation of the nth binary digit of pi (symbol: π) using base 16 math.

The formula can directly calculate the value of any given digit of π without the need to calculate the preceding digits.

The BBP is a summation-style formula that was discovered in 1995 by Simon Plouffe and was named after the authors of the paper in which the formula was published, David H. Bailey, Peter Borwein, and Simon Plouffe. Before that paper, it had been published by Plouffe on his own site.[1]

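The formula is:

$$\pi = \sum_{k=0}^{\infty} \frac{1}{16^k} \left( \frac{4}{8k+1} - \frac{2}{8k+4} - \frac{1}{8k+5} - \frac{1}{8k+6} \right)$$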

The algorithm is very complex and slow, but I chose it because it's best suited for parallelization.

The whole idea was to develop a perfect multithreaded benchmark that can make use of all the cores available, not to implement the fastest algorithm to calculate PI.
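To show why BBP splits so cleanly across cores, here is a small Python sketch of the textbook digit-extraction trick. It is not the benchmark's actual C++ code, and all function names are invented for the example. Each hexadecimal digit of π is computed from its position alone, so chunks of positions can be handed to separate workers with no shared state. It produces hex digits only; how Multi Core PI gets from there to decimals is not covered here. Plain double-precision floats are used, so only modest positions are reliable.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def _frac_sum(j, n):
    """Fractional part of sum over k of 16**(n - k) / (8k + j)."""
    s = 0.0
    # Terms with k <= n: modular exponentiation keeps the numerators small.
    for k in range(n + 1):
        d = 8 * k + j
        s = (s + pow(16, n - k, d) / d) % 1.0
    # Terms with k > n shrink by a factor of 16 each; stop once negligible.
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        s += term
        k += 1
    return s % 1.0


def pi_hex_digit(n):
    """Hex digit of pi at 0-based position n after the point."""
    x = (4 * _frac_sum(1, n) - 2 * _frac_sum(4, n)
         - _frac_sum(5, n) - _frac_sum(6, n))
    return int((x % 1.0) * 16)


def hex_digits(start, count):
    """Hex digits for positions start .. start + count - 1, as a string."""
    return "".join(format(pi_hex_digit(n), "X") for n in range(start, start + count))


def hex_digits_parallel(total, workers=4, chunk=64, executor=ProcessPoolExecutor):
    """Split the positions into independent chunks and map them over a pool."""
    starts = list(range(0, total, chunk))
    counts = [min(chunk, total - s) for s in starts]
    with executor(max_workers=workers) as pool:
        # map() returns results in submission order, so the chunks line up.
        return "".join(pool.map(hex_digits, starts, counts))
```

With the default `ProcessPoolExecutor`, call `hex_digits_parallel` from under an `if __name__ == "__main__":` guard so worker processes can start cleanly on Windows. Passing `executor=ThreadPoolExecutor` also works but will not use multiple cores in CPython for pure-Python code like this.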

The BBP formula for π

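The original BBP π summation formula was found in 1995 by Plouffe using PSLQ. It is also representable using the P function, $P(s, b, m, A) = \sum_{k=0}^{\infty} \frac{1}{b^k} \sum_{j=1}^{m} \frac{a_j}{(mk+j)^s}$ with $A = (a_1, \ldots, a_m)$:

$$\pi = P\bigl(1, 16, 8, (4, 0, 0, -2, -1, -1, 0, 0)\bigr)$$

which also reduces to this equivalent ratio of two polynomials:

$$\pi = \sum_{k=0}^{\infty} \frac{1}{16^k} \, \frac{120k^2 + 151k + 47}{512k^4 + 1024k^3 + 712k^2 + 194k + 15}$$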

y-cruncher is the first efficient and publicly available Pi-calculator that can sustain a near 100% cpu load on multi-core computers.

There are other multi-threaded Pi-programs that can achieve high cpu usage, but few of them can sustain it through an entire Pi computation.

Below is a typical CPU utilization graph of y-cruncher when computing 1 billion digits of Pi across 8 cores.


As of 2010, I am not aware of any Pi-program that achieves perfect parallelism for small computations and is at least half the speed of y-cruncher.

In 2013, meet Multi Core PI, sir. Perfect parallelism for any number of decimals.

(It's easy to get perfect parallelism if you artificially make the task really slow.)

I did NOT artificially make the task really slow; in fact, I didn't do anything that slows down the algorithm.

Sure, the Multi Core PI algorithm was not optimized for speed, but it provides perfect parallelism, and that was the whole idea.

 

newtekie1

Semi-Retired Folder
Joined
Nov 22, 2005
Messages
28,472 (4.25/day)
Location
Indiana, USA
Processor Intel Core i7 10850K@5.2GHz
Motherboard AsRock Z470 Taichi
Cooling Corsair H115i Pro w/ Noctua NF-A14 Fans
Memory 32GB DDR4-3600
Video Card(s) RTX 2070 Super
Storage 500GB SX8200 Pro + 8TB with 1TB SSD Cache
Display(s) Acer Nitro VG280K 4K 28"
Case Fractal Design Define S
Audio Device(s) Onboard is good enough for me
Power Supply eVGA SuperNOVA 1000w G3
Software Windows 10 Pro x64
Thanks for the explanation.

I wasn't knocking you, you achieved exactly what you set out to do and it makes a great benchmark.
 
Joined
Feb 21, 2008
Messages
40 (0.01/day)
Multi Core LINPACK Ultimate

Meet Multi Core LINPACK Ultimate!

A multithreaded CPU benchmark that performs numerical linear algebra. It makes use of the BLAS (Basic Linear Algebra Subprograms) libraries for performing basic vector and matrix operations.

The benchmark is written in C# / WPF (the user interface) and C++ (the core algorithm), and it provides excellent parallelism.
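For readers unfamiliar with what a LINPACK-style run measures, here is a minimal single-threaded sketch in plain Python. It is only an illustration of the idea, not the benchmark's BLAS-backed C++ core, and the function names are invented. It solves a dense system Ax = b by Gaussian elimination with partial pivoting, times the solve, and converts the time to GFLOPS using the conventional LINPACK operation count of 2/3·n³ + 2·n².

```python
import random
import time


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] for row in a]  # work on copies so the inputs stay intact
    b = b[:]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        pivot_row = a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / pivot_row[col]
            if factor != 0.0:
                row = a[r]
                for c in range(col, n):
                    row[c] -= factor * pivot_row[c]
                b[r] -= factor * b[col]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (b[r] - tail) / a[r][r]
    return x


def linpack_style_run(n, seed=0):
    """Time one solve of a random n x n system; returns (seconds, gflops, error)."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [sum(row) for row in a]  # chosen so the exact solution is all ones
    t0 = time.perf_counter()
    x = solve(a, b)
    elapsed = time.perf_counter() - t0
    flops = (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2  # conventional LINPACK count
    error = max(abs(v - 1.0) for v in x)
    return elapsed, flops / elapsed / 1e9, error


if __name__ == "__main__":
    seconds, gflops, error = linpack_style_run(150)
    print(f"{seconds:.3f} s, {gflops:.4f} GFLOPS, max error {error:.2e}")
```

Pure Python is far too slow for a 4000 x 4000 matrix, so the example uses a small size; the point is only to show how the score relates to matrix size and elapsed time.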



How it works

The default setting for the benchmark is a matrix size of 4000. Just hit the <Run benchmark> button to start benching your CPU.

Submit to HWBOT

First, press the <Submit to HWBOT> button. A screenshot of the entire screen and an encrypted XML data file will be created. Attention! CPU-Z must be running!
Second, follow the link provided in the dialog and submit your data file to HWBOT.

HWBOT

http://hwbot.org/benchmark/multi_core_linpack_ultimate/

Supported operating systems

Microsoft Windows XP / Server 2003
Microsoft Windows Vista / 7
Microsoft Windows 8 / Server 2012

Website

http://www.pcgamingxtreme.ro/multi-core-linpack-ultimate/

Download Link

http://www.pcgamingxtreme.ro/forum/download/file.php?id=690
 
Joined
Jul 14, 2006
Messages
2,405 (0.37/day)

Feänor

New Member
Joined
Oct 4, 2005
Messages
512 (0.08/day)
Poor little g540...
 

Attachment: Screenshot.png
Joined
Aug 11, 2011
Messages
4,355 (0.94/day)
Location
Mexico
System Name Dell-y Driver
Processor Core i5-10400
Motherboard Asrock H410M-HVS
Cooling Intel 95w stock cooler
Memory 2x8 A-DATA 2999Mhz DDR4
Video Card(s) UHD 630
Storage 1TB WD Green M.2 - 4TB Seagate Barracuda
Display(s) Asus PA248 1920x1200 IPS
Case Dell Vostro 270S case
Audio Device(s) Onboard
Power Supply Dell 220w
Software Windows 10 64bit
Joined
Jul 2, 2010
Messages
4,015 (0.80/day)
Location
UK
System Name PC
Processor AMD Ryzen 3600
Motherboard MSI B450 Mortar Max
Cooling Phanteks PH-TC12DX, 3 x NZXT FN 140mm, 1x NZXT FV V2 120mm
Memory 32gb DDR4 3200mhz
Video Card(s) ASUS R9 290 DCII-OC 4GB
Storage corsair mp600 1TB
Display(s) LG 27MB85Z 27" 1440p
Case NZXT Source 340
Power Supply Thermaltake 675w
Mouse Logitech G500S
Keyboard Logitech G510S
Software Windows 8.1 64 bit
Poor little g540...

I suspect this benchmark might like Intel processors a little bit more than AMD. Unless I'm reading it wrong.

 
Joined
May 6, 2012
Messages
792 (0.18/day)
Location
Denmark
System Name Waterfall | xe
Processor Core i3-12100F | i5-1240P
Motherboard Gigabyte B660M-DS3H | HP laptop
Cooling Custom Watercooling | Stock laptop
Memory 2*16GB | 16GB DDR4
Video Card(s) RX 5700 with WC blocks | Iris Xe
Storage Intel 660 1TB + Crucial BX100 500GB | 1TB SSD
Display(s) U24E850R+U2515H | Internal 16"
Case Be Quiet! Dark Base 900 | Laptop
Audio Device(s) Audio 2 DJ + Xenyx Q802USB | Realtek
Power Supply Seasonic Focus Plus Gold 750W | 65W USB-C Power brick
Mouse Logitech M330
Keyboard Logitech G610 Orion Brown | Laptop
Software Gentoo + Windows 10 Pro | Gentoo
Seems to run fine on my AMD processor.

 
Joined
Aug 11, 2011
Messages
4,355 (0.94/day)
I suspect this benchmark might like Intel processors a little bit more than AMD. Unless I'm reading it wrong.

http://img.techpowerup.org/130519/benchmark.png

It does give inconsistent results, I'll give you that. agent00skid's A6-3500 APU gets better times than your unlocked X4, and it's a triple-core. I thought it might be related to instruction sets, but the Phenom II and Llano support the same instructions.

Maybe memory bandwidth plays a role too?


edit: Maybe your X4 is throttling? Watch the CPU-Z readout while the benchmark is running.

BTW OP, can we have a logo? Seeing the dull standard EXE icon on the desktop isn't cool.
 
Joined
May 6, 2012
Messages
792 (0.18/day)
My N830 at 1.5 GHz in my laptop took twice as long. So on my end, it seems to scale appropriately.
 
Joined
Jul 2, 2010
Messages
4,015 (0.80/day)
Location
UK
System Name PC
Processor AMD Ryzen 3600
Motherboard MSI B450 Mortar Max
Cooling Phanteks PH-TC12DX, 3 x NZXT FN 140mm, 1x NZXT FV V2 120mm
Memory 32gb DDR4 3200mhz
Video Card(s) ASUS R9 290 DCII-OC 4GB
Storage corsair mp600 1TB
Display(s) LG 27MB85Z 27" 1440p
Case NZXT Source 340
Power Supply Thermaltake 675w
Mouse Logitech G500S
Keyboard Logitech G510S
Software Windows 8.1 64 bit
Maybe memory bandwidth plays a role too?

BTW OP, can we have a logo? Seeing the dull standard EXE icon on the desktop isn't cool.

I'm on single-channel memory, so we should explore this.
 