
Trinity (Piledriver) Integer/FP Performance Higher Than Bulldozer, Clock-for-Clock

OneMoar

There is Always Moar
Joined
Apr 9, 2010
Messages
7,336
Likes
3,948
Location
Rochester area
System Name Kreij Lives On
Processor Intel Core i5 4670K @ 4.4Ghz 1.32V
Motherboard ASUS Maximus VI Gene Z87
Cooling Reeven Okeanos Single 140MM Fan +2 SP120 White's
Memory 16GB kingston hyper x @ 2133 @ 11 11 11 32
Video Card(s) EVGA GTX 1060 ACX Copper Single fan
Storage 240gb Cruical MX200SSD/WD Blue 1TB
Display(s) Samsung S24D300/HP2071D
Case Custom Full Aluminum By ST.o.CH <3
Audio Device(s) onboard
Power Supply HX 750i
Mouse Roccat KONE
Keyboard Rocatt ISKU with ISKUFX keycaps
Software Windows 10 +startisback
#76
k I am unsubbing from this thread until a mod hands out bans to the flamers here
 

eidairaman1

The Exiled Airman
Joined
Jul 2, 2007
Messages
19,184
Likes
4,793
System Name PCGOD
Processor AMD FX 8350@ 5.0GHz
Motherboard Asus TUF 990FX Sabertooth R2 2901 Bios
Cooling Scythe Ashura, 2×BitFenix 230mm Spectre Pro LED (Blue,Green), 2x BitFenix 140mm Spectre Pro LED
Memory 16 GB Gskill Ripjaws X 2133 (2400 OC, 10-10-12-20-20, 1T, 1.65V)
Video Card(s) AMD Radeon 290 Sapphire Vapor-X
Storage Samsung 840 Pro 256GB, WD Velociraptor 1TB
Display(s) NEC Multisync LCD 1700V (Display Port Adapter)
Case AeroCool Xpredator Evil Blue Edition
Audio Device(s) Creative Labs Sound Blaster ZxR
Power Supply Seasonic 1250 XM2 Series (XP3)
Mouse Roccat Kone XTD
Keyboard Roccat Ryos MK Pro
Software Windows 7 Pro 64
#77
Joined
Mar 10, 2010
Messages
4,976
Likes
1,547
Location
Manchester uk
System Name Quad GT evo V
Processor FX8350 @ 4.8ghz1.525c NB2.64ghz Ht2.84ghz
Motherboard Gigabyte 990X Gaming
Cooling 360EK extreme 360Tt rad all push/pull, cpu,NB/Vrm blocks all EK
Memory Corsair vengeance 32Gb @1333 cas9
Video Card(s) Rx vega 64 waterblockedEK + Rx580 waterblockedEK
Storage samsung 840(250), WD 1Tb+2Tb +3Tbgrn 1tb hybrid
Display(s) Samsung uea28"850R 4k freesync, samsung 40" 1080p
Case Custom(modded) thermaltake Kandalf
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup
Power Supply corsair 1000Rmx
Mouse CM optane
Keyboard CM optane
Software Win 10 Pro
Benchmark Scores 15.69K best overall sandra so far
#78
I actually thought it had not gone too badly, mayhap because I posted once before this :)

I can't wait for the ins and outs of what's going on to be known in a few months / a year: what the PS3 and "nextbox" will have, etc.

I'm looking forward to a promising Vishera, not an awe-inspiring one, but I'd imagine that with the new clock mesh tech these APUs will OC a fair bit, given a reasonable Vreg, which is only going to be on the FM2 platform, not so much laptops. In that form I can well imagine a modern console-type gaming experience on most PC games, maybe better, definitely better than an Xbox 360 game, and I can see it running well with an OC.

And with L3, more modules/cores and an (important) possible later stepping, Vishera could end up doing well. I only back AMD at any point due to the fact some are OT, given I get 60-80 FPS in any game on ultra (admittedly with hybrid PhysX for the NV-favoured games) with my main rig. An Intel system may well do much better, but my experience isn't as bad as some of you are making out. I don't notice any wait times, and these chips are going to perform better than my main rig does at minimum, or should.
 
Joined
Feb 13, 2012
Messages
358
Likes
61
#79
Yes, that's very true about the clocks. Just imagine: if a quad-core Piledriver can do a 100 W TDP at 4.2 GHz with half of the chip being a GPU clocked at 800 MHz, how far can the quad-core Piledriver cores go without the GPU, or at least how efficient would they be?
From leaks it seems Piledriver is 20% faster than Bulldozer clock-for-clock, so it almost finally matches Phenom II IPC, but of course clocks much higher.
Once again the Phenom/Phenom II story is playing out, with Piledriver being what Bulldozer was meant to be (and then some... or hopefully at least).
 
Joined
May 18, 2010
Messages
3,416
Likes
1,049
System Name My baby
Processor Athlon II X4 620 @ 3.5GHz, 1.45v, NB @ 2700Mhz, HT @ 2700Mhz - 24hr prime95 stable
Motherboard Asus M4A785TD-V EVO
Cooling Sonic Tower Rev 2 with 120mm Akasa attached, Akasa @ Front, Xilence Red Wing 120mm @ Rear
Memory 16 GB G.Skills 1600Mhz
Video Card(s) ATI ASUS Crossfire 5850
Storage Samsung 1Tb EcoGreen hard drive
Display(s) Hanngs-G 19"
Case Antec VSK 2000 Black Tower Case
Audio Device(s) Onkyo TX-SR309 Receiver, 2x Kef Cresta 1, 1x Kef Center 20c
Power Supply OCZ StealthXstream II 600w, 4x12v/18A, 80% efficiency.
Software Windows Vista X64
#80
From leaks it seems Piledriver is 20% faster than Bulldozer clock-for-clock, so it almost finally matches Phenom II IPC, but of course clocks much higher.
OK, stop spreading false information. I know you didn't do it deliberately, but all this false information needs to be nipped in the bud.

Phenom II is not 20% faster than Bulldozer clock for clock. It's ridiculous to think that. Obviously it's application dependent, but on a good day, when an application favours Phenom II's architecture, we are talking about maybe 5% or less, or within the margin of error. Overall, Bulldozer is faster.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
10,395
Likes
5,473
Location
Concord, NH
System Name Kratos
Processor Intel Core i7 3930k @ 4.2Ghz
Motherboard ASUS P9X79 Deluxe
Cooling Zalman CPNS9900MAX 130mm
Memory G.Skill DDR3-2133, 16gb (4x4gb) @ 9-11-10-28-108-1T 1.65v
Video Card(s) MSI AMD Radeon R9 390 GAMING 8GB @ PCI-E 3.0
Storage 2x120Gb SATA3 Corsair Force GT Raid-0, 4x1Tb RAID-5, 1x500GB
Display(s) 1x LG 27UD69P (4k), 2x Dell S2340M (1080p)
Case Antec 1200
Audio Device(s) Onboard Realtek® ALC898 8-Channel High Definition Audio
Power Supply Seasonic 1000-watt 80 PLUS Platinum
Mouse Logitech G602
Keyboard Rosewill RK-9100
Software Ubuntu 17.10
Benchmark Scores Benchmarks aren't everything.
#81
He said IPC, buddy. Just because BD has a lower IPC doesn't mean that it doesn't run faster. Once BD's IPC is back up to where it was with the Phenom II, there will be a lot more performance, because Bulldozer clocks that much higher than the Phenom IIs did.

Also, all of those tasks that do better on the P2 are single-threaded tasks and unoptimized floating-point applications, and even in both of these cases the performance is acceptable.
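
To make the IPC-versus-clock point concrete: single-thread throughput is roughly IPC multiplied by clock speed, so a chip with lower IPC can still come out ahead if it clocks high enough. The sketch below is only an illustration. The IPC and clock figures are made-up assumptions, not measurements of any Phenom II or Bulldozer part, and the function names are hypothetical.

```python
def throughput(ipc: float, clock_ghz: float) -> float:
    """Rough single-thread throughput: instructions per cycle times GHz."""
    return ipc * clock_ghz


def break_even_clock(ref_ipc: float, ref_clock_ghz: float, ipc: float) -> float:
    """Clock (GHz) a lower-IPC chip needs to match the reference chip."""
    return ref_ipc * ref_clock_ghz / ipc


# Assumed, illustrative numbers only: the "old" chip has IPC normalised to 1.0,
# the "new" chip has 15% lower IPC but a higher clock.
old_chip = throughput(ipc=1.00, clock_ghz=3.4)
new_chip = throughput(ipc=0.85, clock_ghz=4.2)

print(f"new/old throughput ratio: {new_chip / old_chip:.2f}")
print(f"clock needed to break even: {break_even_clock(1.00, 3.4, 0.85):.2f} GHz")
```

With these assumed inputs the lower-IPC chip ends up slightly ahead, which is the whole argument: IPC and clock only mean something when taken together.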
 
Joined
May 18, 2010
Messages
3,416
Likes
1,049
#82
Yes, but those single threaded tasks don't do 20% better as sergionography implied.
 

eidairaman1

The Exiled Airman
Joined
Jul 2, 2007
Messages
19,184
Likes
4,793
#83
How about we wait for Trinity to be in the hands of the reviewers and users here? Same with AM3+ Piledriver.
 
Joined
Jan 20, 2010
Messages
868
Likes
160
Location
Toronto, ON. Canada
System Name Gamers PC
Processor AMD Phenom II X4 965 BE @ 3.80 GHz
Motherboard MSI 790FX-GD70 AM3
Cooling Corsair H50 Cooler
Memory Corsair XMS3 4GB (2x2GB) DDR3-1333
Video Card(s) XFX Radeon HD 5770 1GB GDDR5
Storage 2 x WD Caviar Green 1TB SATA300 w/64MB Buffer (RAID 0)
Display(s) Samsung 2494SW 1080p 24" WS LCD HD
Case CM HAF 932 Full Tower Case
Audio Device(s) Creative SB X-FI TITANIUM -PCIE x 1
Power Supply Corsair TX Series CMPSU-650TX (650W)
Software Windows 7 Ultimate 64-bit
#84
On a rather disturbing note, the performance-per-GHz figures of Piledriver are trailing far behind K10 architecture (Llano, A8-3850), let alone competitive architectures from Intel.
Each and every design is different. Piledriver/Bulldozer is designed for higher clock speeds; Llano and K10 are not. It's just like the Athlon 64 of the past, which needed less clock speed to beat the Pentium 4, while the Pentium 4 needed at least an extra 1000 MHz to stay competitive.
 
Joined
Mar 13, 2012
Messages
391
Likes
90
Location
USA
#85
Quite true, both Llano and Trinity are performing at about 2400 integer score/GHz, which is 20% lower than the i5 SB score.
So depending on pricing and the clocks for the lower-end chips, Trinity may be fully competitive thanks to its higher clocks and superior iGPU, especially with IB i3/Pentium not coming out till Q3/Q4 and Trinity coming out late Q1/early Q2.
And if the earlier rumor of power efficiency is true as well, with it being 15% more power efficient than Llano, and given BD OC'd rather decently, I feel the unlocked parts, based on the information currently available to us, look like a great budget part.

Although it is still merely speculation until Wiz gets to do a review.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
10,395
Likes
5,473
Location
Concord, NH
#86
Yes, but those single threaded tasks don't do 20% better as sergionography implied.
At the same clock speed, I bet they did. Can we get a review of a Phenom II and BD at the same clock, HTT speeds, and memory speeds? It would answer this question very quickly.

This statement complete rubbish.
Stop trolling and only post if you have something useful to contribute.
 
Joined
Nov 4, 2005
Messages
9,946
Likes
2,309
System Name MoFo 2
Processor AMD PhenomII 1100T @ 4.2Ghz
Motherboard Asus Crosshair IV
Cooling Swiftec 655 pump, Apogee GT,, MCR360mm Rad, 1/2 loop.
Memory 8GB DDR3-2133 @ 1900 8.9.9.24 1T
Video Card(s) HD7970 1250/1750
Storage Agility 3 SSD 6TB RAID 0 on RAID Card
Display(s) 46" 1080P Toshiba LCD
Case Rosewill R6A34-BK modded (thanks to MKmods)
Audio Device(s) ATI HDMI
Power Supply 750W PC Power & Cooling modded (thanks to MKmods)
Software A lot.
Benchmark Scores Its fast. Enough.
#87
This statement complete rubbish.
Apparently you missed the chart, or lack the understanding to read it.


They directly compare the FP performance per clock, and the A8 series is raping the A10 and bulldozer if the chart is real.

In other words, AMD may have done nothing but tweak the core a bit to conserve energy and increase the speed. All joking aside, this new architecture is the P4 from AMD.

I keep thinking and saying their only saving grace will be GCN added to a quad core and software enhancement to offload the work to the much faster GPU; however, I think they lack the manpower and drive to do it. So I am expecting mediocrity from their next chip after this too. Once they push for it, or pull back to a design tweaked for efficiency, they have a chance to gain the performance edge.

AMD, please, make software to support your hardware. Intel did it for years: programs would see an Intel chip and optimize performance. You can do it too. I would rather have two cores dedicated to serving data to an on-board GPU with enough stream processors and cache to run full tilt than have 8 cores total. Or do it in hardware; surely 10% die area is worth an exponential increase in performance.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
10,395
Likes
5,473
Location
Concord, NH
#88
surely 10% die area is worth an exponential increase in performance.
Bulldozer modules scale almost linearly; hyper-threading does not. I keep telling people that single-threaded applications aren't the future, they're the past. I think people are fussing about things that won't matter in the future, since nothing is optimized for BD. (Applications that use FMA4 on BD actually see sizable floating-point speed improvements.)

AMD, please, make software to support your hardware. Intel did it for years: programs would see an Intel chip and optimize performance. You can do it too. I would rather have two cores dedicated to serving data to an on-board GPU with enough stream processors and cache to run full tilt than have 8 cores total.
Clearly you don't know how the development model works. Why would you prepare software for cutting-edge hardware that the majority of people don't have? People use technologies when it benefits them, and it benefits software companies when people have hardware that can run their software. That means requiring something like FMA puts people whose CPUs don't support it at a loss, which only hurts the consumer and the software developer.
 
Joined
Mar 24, 2011
Messages
2,283
Likes
527
Location
Burlington, VT
Processor Intel i5-2500k
Motherboard MSI P67A-GD65
Cooling Deep Cool Gammax 400
Memory 8GB (4x2GB) G.Skill Ripjaws X DDR3-1600
Video Card(s) Gigabyte GTX 1060 Windforce OC 6GB
Storage Samsung EVO 850 256GB / WD Caviar Black 1TB
Display(s) Acer GD235HZbid 120hz LCD
Case Rosewill Challenger Mid-Tower
Audio Device(s) Onboard
Power Supply Corsair 650W 650-TX
Software Windows 10
#89
Bulldozer modules scale almost linearly; hyper-threading does not.
That's because HTing isn't intended to serve the same function. It's just there so the CPU can use previously unused resources to get some work done instead of idling. Bulldozer modules do scale well, but the problem is shit scaling linearly is still just shit. Plus it's not as though having 1,000 cores is better than just 4 good ones for most people.

I keep telling people that single-threaded applications aren't the future, they're the past. I think people are fussing about things that won't matter in the future, since nothing is optimized for BD. (Applications that use FMA4 on BD actually see sizable floating-point speed improvements.)
I don't think anyone wants single-threaded applications; they are more like an unfortunate reality. This isn't like the Athlon X2 era, when people were saying it didn't matter because only a handful of people even had multi-core CPUs. At this point just about everything comes with at LEAST a dual-core. The issue is that Bulldozer CPUs only get an edge when there are more than 4 threads and you're using a BD CPU with more than 4 cores. Even then, having such low per-core performance usually results in the Intel CPUs winning out.
 
Joined
Mar 13, 2012
Messages
391
Likes
90
Location
USA
#90
Apparently you missed the chart, or lack the understanding to read it.


They directly compare the FP performance per clock, and the A8 series is raping the A10 and bulldozer if the chart is real.

In other words, AMD may have done nothing but tweak the core a bit to conserve energy and increase the speed. All joking aside, this new architecture is the P4 from AMD.

I keep thinking and saying their only saving grace will be GCN added to a quad core and software enhancement to offload the work to the much faster GPU; however, I think they lack the manpower and drive to do it. So I am expecting mediocrity from their next chip after this too. Once they push for it, or pull back to a design tweaked for efficiency, they have a chance to gain the performance edge.

AMD, please, make software to support your hardware. Intel did it for years: programs would see an Intel chip and optimize performance. You can do it too. I would rather have two cores dedicated to serving data to an on-board GPU with enough stream processors and cache to run full tilt than have 8 cores total. Or do it in hardware; surely 10% die area is worth an exponential increase in performance.
Apparently you don't see the point of the BD/PD architecture. The very idea behind it is to sacrifice FP performance by sharing the FP unit between two cores. Integer performance is what matters for the workloads the architecture is geared towards, so that is what is being improved. Note that the integer performance per clock is the same as Llano's, but it clocks 33% higher. That gives the quad-core unlocked Trinity only 10% lower integer performance than the i5-2500K while having a superior iGPU. That means it -should- outperform an SB i3, and we most likely won't see IB i3s until Q3/Q4, so Trinity should hold a very good value spot for up to 6 months, and quite possibly remain competitive with the IB i3s thanks to its unlocked variants and its superior iGPU.
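
As a rough sanity check on those percentages, here is a small sketch. It assumes overall integer score is simply score-per-GHz times clock, and it uses only figures quoted in this thread (about 2400 score/GHz, a 20% per-clock deficit versus the SB i5, and a 10% overall deficit). The variable names are made up for illustration, and no actual chip clocks are assumed.

```python
# Figures as quoted in the thread; everything derived from them is an estimate.
TRINITY_SCORE_PER_GHZ = 2400.0  # integer score per GHz for Llano/Trinity
PER_CLOCK_DEFICIT = 0.20        # Trinity is 20% lower per clock than the SB i5
OVERALL_DEFICIT = 0.10          # claimed overall gap to the i5-2500K

# If Trinity is 20% lower per clock, the SB i5 scores 2400 / 0.8 per GHz.
sb_score_per_ghz = TRINITY_SCORE_PER_GHZ / (1.0 - PER_CLOCK_DEFICIT)

# score = score_per_ghz * clock, so solve for the Trinity/SB clock ratio
# that leaves Trinity only 10% behind overall.
required_clock_ratio = (
    (1.0 - OVERALL_DEFICIT) * sb_score_per_ghz / TRINITY_SCORE_PER_GHZ
)

print(f"implied SB i5 score per GHz: {sb_score_per_ghz:.0f}")
print(f"Trinity must clock {100 * (required_clock_ratio - 1):.1f}% higher than the i5")
```

So under these assumptions the two claims are consistent with each other only if the Trinity part runs noticeably faster clocks than the i5 it is compared against.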

Why do I keep mentioning the iGPU? Because AMD's long-term plan is HSA, and dumping floating-point math onto the iGPU. And HSA functions -should- be available next year, with 22nm Steamroller + GCN (expected to be possibly a 7750 equivalent) on die.

Oh, and earlier possible leaks show Trinity to be more power efficient than Llano as well.



Also @ genocide, 20% lower performance / clock and 10% lower performance / watt = shit? I see it as being less efficient and powerful, but it's not like it is only half as powerful. (And I'm saying that based on the slide, btw, given the A10 is a 100 W part that is really a 95 W CPU + GPU plus a 5 W bridge chip and all.) Bulldozer was something of a fail; Piledriver isn't looking to be quite as bad.

I have to agree on the appearance of a Phenom I / II again here.
 
Joined
Nov 4, 2005
Messages
9,946
Likes
2,309
#91
Bulldozer modules scale almost linearly; hyper-threading does not. I keep telling people that single-threaded applications aren't the future, they're the past. I think people are fussing about things that won't matter in the future, since nothing is optimized for BD. (Applications that use FMA4 on BD actually see sizable floating-point speed improvements.)

Clearly you don't know how the development model works. Why would you prepare software for cutting-edge hardware that the majority of people don't have? People use technologies when it benefits them, and it benefits software companies when people have hardware that can run their software. That means requiring something like FMA puts people whose CPUs don't support it at a loss, which only hurts the consumer and the software developer.
At a linear rate of what? Adding more cores to processors doesn't directly improve performance, as many threaded items have dependencies, so core 0 may still be processing a thread whose result core 1 needs before it can start work. IPC is still extremely important; thinking otherwise is naive.

So first you say the way is multi-core, and now you say they shouldn't prepare for multi-core systems? This makes no sense. If we are moving to a multi-core standard (we are), we need to have hardware/software resources to support it, and if developers aren't going to do it, AMD needs to.
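
The dependency argument above is essentially Amdahl's law: if only a fraction of the work can run in parallel, the serial remainder caps the speedup no matter how many cores are added. A minimal sketch follows. The parallel fractions used are arbitrary assumptions, not measurements of any real application.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup over one core when only part of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Arbitrary example workloads: 50% and 90% parallelisable.
for fraction in (0.5, 0.9):
    for cores in (2, 4, 8):
        speedup = amdahl_speedup(fraction, cores)
        print(f"{fraction:.0%} parallel, {cores} cores: {speedup:.2f}x")
```

Even the 90%-parallel case stays well short of 8x on 8 cores, which is why per-core speed still matters for the serial part.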
 
Joined
Mar 13, 2012
Messages
391
Likes
90
Location
USA
#92
So given that it has 80-90% of the single-core performance while scaling better, I would say there is an advantage. Technically it's single-thread performance per watt that is more important than pure IPC, although they -usually- tend to go hand in hand vs clocks.
 
Joined
May 18, 2010
Messages
3,416
Likes
1,049
#93
At the same clock speed, I bet they did. Can we get a review of a Phenom II and BD at the same clock, HTT speeds, and memory speeds? It would answer this question very quickly.
I've seen many comparison reviews of the two CPUs in question, at the same clock speed, and none showed anything close to a 20% average IPC advantage over Bulldozer.

I'd be happy to read a review which shows that claim. If anyone has external reading material feel free to post it.
 
Joined
Feb 13, 2012
Messages
358
Likes
61
#94
Yes, but those single threaded tasks don't do 20% better as sergionography implied.
They actually do, dude. Just go and look at an FX-4100 review and compare it to a Phenom II 980 BE. You are looking at an FX-8150, which clocks up to 4.2 GHz and has 8 cores; that's why it does better than a typical quad-core Phenom II. But comparing a quad-core Bulldozer to a quad-core Phenom II, it fails miserably.
Only in situations where new instruction sets are supported does Bulldozer hold its ground, but in typical use it's way behind clock-for-clock, and yes, by 20% if not more.
Phenom II does 3 IPC while Bulldozer does 4 IPC shared between 2 cores, and because it has such a long pipeline each instruction takes longer to complete (which isn't bad, because it's kind of designed that way so the resources can feed the second core in the module while the first one is munching on data), but things didn't go so well and the latency is worse than expected.


http://www.legitreviews.com/article/1766/17/

Here's some of the conclusion from Legit Reviews. I wasn't talking out of my ass, just so you know.

"When it comes to performance we were shocked to see the AMD A8-3850 'Llano' processor and the Socket FM1 platform performing better than the AMD FX-4100 'Bulldozer' processor and the Socket AM3+ platform. We quickly found out that the FX-4100 was priced this low as it needed to be. The performance of the FX-4100 wasn't awful, but we didn't expect to see the AMD A6-3650 running at 2.6GHz to beat the AMD FX-4100 running at 3.6GHz in benchmarks like POV-Ray and Cinebench! "
 
Joined
Nov 4, 2005
Messages
9,946
Likes
2,309
#95
So given that it has 80-90% of the single-core performance while scaling better, I would say there is an advantage. Technically it's single-thread performance per watt that is more important than pure IPC, although they -usually- tend to go hand in hand vs clocks.
Try between 50-90% better.

http://www.google.com/url?sa=t&rct=...sg=AFQjCNHF-2kq4OaqEGHMZBN3z0kXh3zM8A&cad=rja


Intel's own research shows performance gains tailing off once additional resources are tied up scheduling and tracking data between cores and resolving result dependencies. I agree that going to two to four cores is a significant increase, as we can offload other threads from the core running our primary worker, or assign different processes to different cores; however, the overhead cost starts degrading performance with more cores.
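
One way to picture that overhead effect is a toy model in which every extra core adds a fixed coordination cost on top of the usual serial/parallel split. This is only an illustrative sketch: the model, the 95% parallel fraction and the 1% per-core overhead are all assumptions, and it is not taken from the Intel research linked above.

```python
def speedup_with_overhead(parallel_fraction: float, cores: int,
                          overhead_per_core: float) -> float:
    """Toy scaling model: serial part + parallel part + per-core coordination cost."""
    serial = 1.0 - parallel_fraction
    parallel = parallel_fraction / cores
    coordination = overhead_per_core * (cores - 1)  # assumed to grow linearly
    return 1.0 / (serial + parallel + coordination)


core_counts = [1, 2, 4, 8, 16, 32]
results = {n: speedup_with_overhead(0.95, n, 0.01) for n in core_counts}
best = max(results, key=results.get)

for n in core_counts:
    print(f"{n:>2} cores: {results[n]:.2f}x")
print(f"best core count in this toy model: {best}")
```

In this model the speedup climbs through the first few cores and then falls again once the coordination term dominates, which matches the shape of the argument even though the exact numbers are invented.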


I was merely asking for a hardware thread handler, and if, like Nvidia's "hot clocks", it can run at 2 or 4 times the core speed, it could easily dispatch and track resources, even handling the offload of work to the GPU cores for faster processing. I understand the unified memory and number of threads/different type of work makes it difficult, but compared to making mediocre processors blazingly fast, what downside is there? If it added 25 W of heat but was only used on enthusiast-grade processors I would still buy it, as would many.
 
Joined
Mar 13, 2012
Messages
391
Likes
90
Location
USA
#96
Umm, BD is not 1/10th the processing power of SB; try 70-80% of the power.
That said, I'm fairly sure many of us would buy higher-clocked chips up to 250 W, and they would still sell among us, given they can still be air cooled and most mid-tower cases support 120 mm tower coolers. Just leave the cooler out of the box, or give an option for a 155-165 mm tall cooler with push-pull fans and a good design, and we're set.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
10,395
Likes
5,473
Location
Concord, NH
#98
That's because HTing isn't intended to serve the same function. It's just there so the CPU can use previously unused resources to get some work done instead of idling. Bulldozer modules do scale well, but the problem is shit scaling linearly is still just shit. Plus it's not as though having 1,000 Cores is better than just 4 good ones for most people.
I feel like all of my posts in this thread were just posted back to me...

I was merely asking for a hardware thread handler, and if, like Nvidia's "hot clocks", it can run at 2 or 4 times the core speed, it could easily dispatch and track resources, even handling the offload of work to the GPU cores for faster processing. I understand the unified memory and number of threads/different type of work makes it difficult, but compared to making mediocre processors blazingly fast, what downside is there? If it added 25 W of heat but was only used on enthusiast-grade processors I would still buy it, as would many.
I think you're confusing how GPUs and CPUs work.

Nvidia can do what it does because GPUs dispatch large workloads and run a calculation on every shader that has data. CPUs don't work like this because you're not bulk processing the same instruction across a ton of data. You have different instructions being run, so what you're describing for a CPU is essentially a pipeline, which CPUs already have, but "dispatching" anything will result in less performance in single-threaded instances.

Do you know the basic 4 operations that almost any general-purpose CPU does? Not to over-simplify how long a pipeline is, but basically you LOAD, DECODE, EXECUTE, and STORE, in that order. At this level there is no parallelism. It's very step by step, in the sense that you can't decode an instruction before you load it, you can't execute an instruction until it has been decoded, and you can't store the result until the instruction has been executed.
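
The four steps can be sketched as a toy interpreter loop. Everything here is hypothetical and only for illustration: the text-based instruction format, the two opcodes and the register names are invented, and a real CPU overlaps these stages across several instructions rather than finishing one before starting the next.

```python
# Hypothetical two-opcode instruction set for illustration only.
OPS = {
    "ADD": lambda x, y: x + y,
    "MUL": lambda x, y: x * y,
}


def run(program, registers):
    """Run each instruction through LOAD, DECODE, EXECUTE, STORE, strictly in order."""
    pc = 0  # program counter
    while pc < len(program):
        raw = program[pc]                                     # LOAD the instruction
        op, dst, src_a, src_b = raw.split()                   # DECODE its fields
        result = OPS[op](registers[src_a], registers[src_b])  # EXECUTE the operation
        registers[dst] = result                               # STORE the result
        pc += 1
    return registers


regs = run(["ADD r2 r0 r1", "MUL r3 r2 r2"], {"r0": 2, "r1": 3})
print(regs)
```

Note that the second instruction reads r2, which the first one writes, so it cannot usefully start its EXECUTE step until the first has stored its result. That is the step-by-step dependency being described.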
 
Joined
Nov 4, 2005
Messages
9,946
Likes
2,309
#99
Yes I am aware, as I am in the process of getting my degree in computer science. C++, Networking, and other classes.

A single thread on a CPU might run the four, but suppose we have a hardware scheduler that reads ahead and prefetches data ("branching"), then performs the decode at twice the rate and programs shaders to do the work, and then they execute it and store it in the contiguous memory pool. What difference does it make whether the CPU transistors do it or, if the same instruction is run 5,000 times in the program, the GPU transistors do it?

Pretty simple actually, GPUs already do 90% of this work to keep up with demand. The hardest part would be resource tracking, but again, if they solve it and the performance increase is only 25% better on average, they win.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
10,395
Likes
5,473
Location
Concord, NH
A single thread on a CPU might run the four, but suppose we have a hardware scheduler that reads ahead and prefetches data ("branching"), then performs the decode at twice the rate and programs shaders to do the work, and then they execute it and store it in the contiguous memory pool. What difference does it make whether the CPU transistors do it or, if the same instruction is run 5,000 times in the program, the GPU transistors do it?
Except you can't process a regular application through a pipeline like a GPU has, because GPU data is all handled the same way, whereas a computer program runs many different instructions. A GPU is given a large set of data and told to do a single task to all of it, so it does it the same way. A CPU is instruction after instruction; there isn't a whole lot that resembles what the GPU can do.

A shader is small because it has a limited number of instructions it can perform and has no control mechanism, no write back. There is no concept of threads in a GPU, it is an array of one or more sets of data that will have the same operation performed on the entire set. A shader is also SIMD, not MIMD as you're describing.

Where a CPU can carry out instructions like "move 10 bytes from memory location A to memory location B," a GPU does something more like "multiply every item in the array by 1.43."
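
The two kinds of work can be put side by side in a small sketch. Plain Python is used purely to show the shape of each operation; it does not run on a GPU, and the function names, memory size and addresses are made up for the example.

```python
def move_bytes(memory: bytearray, src: int, dst: int, count: int) -> None:
    """CPU-style instruction: copy `count` bytes from address src to address dst."""
    memory[dst:dst + count] = memory[src:src + count]


def scale_all(values, factor):
    """GPU-style work: the same single operation applied to every element."""
    return [v * factor for v in values]


# One specific data movement, as a general-purpose CPU instruction might do.
memory = bytearray(32)
memory[0:10] = b"0123456789"
move_bytes(memory, src=0, dst=16, count=10)

# One operation over a whole array; each element is independent of the others,
# which is what lets a GPU spread the work across many shaders.
scaled = scale_all([1.0, 2.0, 10.0], 1.43)

print(bytes(memory[16:26]))
print(scaled)
```

The second case parallelises trivially because no element depends on any other. A typical program is a long chain of instructions like the first case, each different from the last.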

Pretty simple actually, GPUs already do 90% of this work to keep up with demand. The hardest part would be resource tracking, but again, if they solve it and the performance increase is only 25% better on average, they win.
If it is so simple, why hasn't anyone else figured it out? I'm still convinced that you don't quite know what you're talking about.

Yes I am aware, as I am in the process of getting my degree in computer science. C++, Networking, and other classes.
I do have a bachelor's degree in computer science, not to mention I'm employed as a systems admin and a developer.