Monday, July 11th 2011

AMD FX-8130P Processor Benchmarks Surface

Here is a tasty scoop of benchmark results purported to be those of the AMD FX-8130P, the next high-end processor from the green team. The FX-8130P was paired with a Gigabyte 990FXA-UD5 motherboard and 4 GB of dual-channel Kingston HyperX DDR3-2000 memory running at DDR3-1866. A GeForce GTX 580 handled the graphics department. The chip was clocked at 3.20 GHz (16 x 200 MHz). Testing began with benchmarks that aren't very multi-core intensive: in Super Pi 1M the chip clocked in at 19.5 seconds, while in the AIDA64 Cache and Memory benchmark the L1 cache appears extremely fast, and L2, L3, and memory performance are a slight improvement over the previous generation of Phenom II processors.
Moving on to multi-threaded tests, Fritz Chess yielded a speed-up of over 29.5X over the benchmark's baseline, at 14,197 kilonodes per second. The x264 benchmark encoded the first pass at roughly 136 fps and the second pass at roughly 45 fps. The system scored 3045 points in PCMark 7 and P6265 in 3DMark 11 (performance preset). The results suggest this chip will be highly competitive with Intel's LGA1155 Sandy Bridge quad-core chips, but as usual, we ask you to take the data with a pinch of salt.
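As a quick sanity check on the clocks reported above (an illustrative calculation, not part of the source data), the 3.20 GHz figure follows directly from the multiplier and base clock:

```python
# Figures from the article: 16x multiplier on a 200 MHz base clock.
BASE_CLOCK_MHZ = 200
MULTIPLIER = 16

def core_clock_ghz(multiplier, base_mhz):
    """Effective core clock in GHz from multiplier and base clock."""
    return multiplier * base_mhz / 1000

print(core_clock_ghz(MULTIPLIER, BASE_CLOCK_MHZ))  # 3.2
```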
Source: DonanimHaber

317 Comments on AMD FX-8130P Processor Benchmarks Surface

#176
cadaveca
My name is Dave
No, I understand very well. What I do not understand is why it's important to compare 775 DDR2 performance (launched in 2005) to AM3 DDR3 performance (circa 2009) to validate SuperPi numbers, when it's already known that SuperPi (circa 1995) is not dependent on memory performance alone?


I 110% understand the point you are trying to make. I am simply refusing to go down that road, because it adds nothing to the discussion at hand. You simply want to refute my postings and slide in some doubt, but sorry, I'm not going to fall for it. I never claimed SuperPi was only impacted by memory performance.
Posted on Reply
#177
seronx
Crap DaddySo going back to what we see in the screens posted at the beginning of this thread which sparked enthusiasm from some and skepticism from others, we can conclude that:

The AIDA cache and memory benchmark is a disaster for BD, SuperPi the same, and the other benchmarks were done at unknown clocks, therefore we don’t have a true comparison with SB.

We’ll have to wait a little longer to really compare BD and SB.
4.2GHz for the single core apps with modules turned off/gated
3.6GHz is the max turbo core with all cores in use
3.2GHz is the stock clock
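The three clock tiers listed above can be sketched as a tiny model (illustrative only; the "half the cores" threshold is an assumption, since AMD never documented exactly when Turbo Core kicks in on this rumored part):

```python
# Illustrative Turbo Core model for the rumored FX-8130P clock tiers.
# The core-count threshold below is an assumption, not AMD documentation.
STOCK_GHZ = 3.2  # stock clock, Turbo Core inactive

def turbo_clock_ghz(active_cores, total_cores=8):
    """Clock when Turbo Core is active, per the tiers quoted above:
    4.2 GHz with idle modules power-gated, 3.6 GHz with all cores in use."""
    if active_cores <= total_cores // 2:
        return 4.2  # max turbo: unused modules gated off
    return 3.6      # all-core turbo

print(STOCK_GHZ, turbo_clock_ghz(1), turbo_clock_ghz(8))
```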


Today was June 1st
July 31st -> August 31st

AIDA64 is a memory subsystem benchmark
SuperPi is an x87 benchmark and only really stresses L1 <-> L3 memory
the rest are basically media benchmarks
3DMark 07/11 are both gaming-class benchmarks

The reason the engineering sample is not a valid way to show off Zambezi is that it isn't at spec

Zambezi 8130P ES 3.2GHz/3.6GHz/4.2GHz @ 185 W TDP
Zambezi 8130P RS 3.8GHz/4.2GHz/4.8GHz @ 125 W TDP
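Taking the two rumored SKUs above at face value, here is a quick check of how much faster the claimed retail silicon would be (figures from the post; `uplift_pct` is just a helper for the arithmetic):

```python
# Rumored clocks (GHz) and TDP (W) from the post above.
es = {"base": 3.2, "all_core": 3.6, "max_turbo": 4.2, "tdp_w": 185}
rs = {"base": 3.8, "all_core": 4.2, "max_turbo": 4.8, "tdp_w": 125}

def uplift_pct(new, old):
    """Percentage increase of new over old, rounded to one decimal."""
    return round((new / old - 1) * 100, 1)

for key in ("base", "all_core", "max_turbo"):
    print(key, uplift_pct(rs[key], es[key]), "%")
# base ~18.8%, all-core ~16.7%, max turbo ~14.3%: roughly in line with
# the 10% to 30% figure floated later in the thread
```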
Posted on Reply
#178
Crap Daddy
So are we taking for granted what that guy says? We don't have any proof of what you are saying regarding the clocks, so let me take this info with a grain of salt.
Posted on Reply
#179
THANATOS
seronx, registers and memory should be doubled in SB with HT too, the decoder too, almost everything like BD except the integer cluster.
The FPU would be the same even if one integer cluster was removed, don't you think? It would do the same, because if it were just 128b then BD wouldn't be able to work with AVX.

devguy
I know that quote, I saw it some time ago. It comes down to what a "core" is; for most people it's probably an integer cluster, but not for me.
Then every core diagram from AMD is wrong, because they show not just the integer part (what is a "core" for most people) but also the decoder, FPU, L2 cache, prediction, prefetch, and so on, which are not in the integer cluster, so they shouldn't be shown in a core diagram but in a CPU diagram with the L3 cache, HT, and IMC.
Posted on Reply
#180
seronx
Crap DaddySo are we taking for granted what that guy says? We don't have any proof that what you are saying regarding the clocks so let me take this info with a grain of salt.
JF-AMD: "Engineering samples are built for validation and testing" (here)

His comments imply that the engineering samples are lower clocked than the retail versions

There are a lot of things at the moment that can cripple the engineering samples' performance

Looking at this Zambezi you can only nod...and think it only gets better from now on
KRONOSFXseronx, registers and memory should be doubled in SB with HT too, the decoder too, almost everything like BD except the integer cluster.
The FPU would be the same even if one integer cluster was removed, don't you think? It would do the same, because if it were just 128b then BD wouldn't be able to work with AVX.
The FPU will still execute 2x128-bit or 1x256-bit (Add+Multiply, or Add, or Multiply) regardless of how many cores

The FPU isn't really tied to any one core in the Bulldozer module
Posted on Reply
#181
LAN_deRf_HA
cadavecaNo, I understand very well. What I do not understand is why it's important to compare 775 DDR2 performance (launched in 2005) to AM3 DDR3 performance (circa 2009) to validate SuperPi numbers, when it's already known that SuperPi (circa 1995) is not dependent on memory performance alone?


I 110% understand the point you are trying to make. I am simply refusing to go down that road, because it adds nothing to the discussion at hand. You simply want to refute my postings and slide in some doubt, but sorry, I'm not going to fall for it. I never claimed SuperPi was only impacted by memory performance.
It's not. Why is this so hard for you to understand no matter how plainly I state it?!? We're talking about the freakin' cache! It has nothing to do with it being DDR2 or DDR3. Phenom II runs DDR2, you know? And Phenom II with DDR2 has better bandwidth than a top 775 proc also on DDR2 of the same speed. That's why I'm saying AMD has better bandwidth, not because of that DDR3 result. You could see that by looking at the Phenom DDR2 result in the link I gave compared to the Yorkfield shot I posted. I told you it had nothing to do with that, and I told you the small benefit the DDR3 made to the cache and how it made no difference. AMD still wins on the cache front as well.

You're the one that defined the importance of this. You scoffed at the poor Super Pi results and then went on about how much it told you about the memory and, in turn, the gaming performance. Yet given the 775 comparison, it would seem that it didn't really correspond.
Posted on Reply
#182
[H]@RD5TUFF
seronxJF-AMD after I asked what was crippling the Engineer Samples: here

His comments imply that the engineering samples are lower clocked than the retail versions

There are a lot of things at the moment that can cripple the engineering samples' performance

Looking at this Zambezi you can only nod...and think it only gets better from now on
If you want to go off of baseless inference then yeah, it sounds amazing, but using that as the basis of an argument is flawed at best. Not only that, if that's not the case, do you think AMD would admit it? ... NO!

All this speculation and trolling is worth less than the benchies from the eng sample.:nutkick::shadedshu
Posted on Reply
#183
XoR
If SuperPi were in SSE it wouldn't change anything, as the difference here comes from how well the processor can re-order and parallelize such algorithms on its execution units. Intel, as it happens, did their homework tweaking NetBurst, which without out-of-order execution was nothing. AMD didn't seem to care and is paying the price now.

SuperPi is not a good benchmark for evaluating the relative everyday performance of different CPUs, but it is a good benchmark to see whether AMD made any progress on its decoding and real-time optimization units.
Posted on Reply
#184
seronx
[H]@RD5TUFFif you want to go off of baseless infferance then yeah it sounds amazing, but using that as a the basis of an argument is flawed at best, not only that if that's not the case you think AMD would admit it .. . . .. NO!

All this speculation and trolling is worth less than the benchies from the eng sample.:nutkick::shadedshu
JF-AMD: "Performance is based on: the silicon, the microcode in the silicon, the BIOS, the compiler updates, the drivers, the OS optimizations, and performance tuning by engineers."
This doesn't look like a defense.
XoRIf SuperPi were in SSE it wouldn't change anything, as the difference here comes from how well the processor can re-order and parallelize such algorithms on its execution units. Intel, as it happens, did their homework tweaking NetBurst, which without out-of-order execution was nothing. AMD didn't seem to care and is paying the price now.

SuperPi is not a good benchmark for evaluating the relative everyday performance of different CPUs, but it is a good benchmark to see whether AMD made any progress on its decoding and real-time optimization units.
SSE performance is pretty high on AMDs

SSSE3, SSE4.1, SSE4.2, XOP, CVT16, FMA4, and LWP all increase the performance of the FPU's SSE capabilities

Bulldozer is a generational leap

I'm not saying Zambezi this time because I'm talking about the architecture, not the CPU
Posted on Reply
#185
cadaveca
My name is Dave
LAN_deRf_HAYou're the one that defined the importance of this. You scoffed at the poor super pi results and then went on about how it told you so much about the memory and in-turn the gaming performance. Only given the 775 comparison it would seem that it didn't really correspond.
Did Core 2 CPUs not have better IPC than AMD chips? Clearly the performance difference, as I've already stated, is not down to memory performance alone.


I mean really, going by SuperPi times alone there, my SB @ 4.9 GHz would be near 3x faster than the BD in the OP. Do I think my SB is 3x faster?

Uh, no?!?


:laugh:


It's merely one in a long list of examples where memory performance matters. Again, F1 2010 is an example of a game (i.e. real-world) that can be impacted quite heavily by memory performance... is it ONLY impacted by memory performance? NO! Are there ways to overcome that problem? You bet!


So, I still fail to see your point, which is why I called you a troll. It's not just about cache. It's not just about memory bandwidth. It's not just about CPU core frequency. Each and every one is important when it comes to performance, and each has its own implications and impacts on performance.

You, on the other hand, are centering on one aspect of how I have formed my opinion on what's important, while ignoring the rest.

So, now that's all said, what was your point again? Maybe you're right and I fail to understand, so why don't you just spell it out for me, please?
Posted on Reply
#186
devguy
KRONOSFXdevguy
I know that quote, I saw it some time ago. It comes down to what a "core" is; for most people it's probably an integer cluster, but not for me. Then every core diagram from AMD is wrong, because they show not just the integer part (what is a "core" for most people) but also the decoder, FPU, L2 cache, prediction, prefetch, and so on, which are not in the integer cluster, so they shouldn't be shown in a core diagram but in a CPU diagram with the L3 cache, HT, and IMC.
Actually, the opposite is true. Every core diagram from AMD is right, because there is no definition of what components make up an x86 "core". They are able to apply the term as they see fit, and on what basis do you have to disagree with their call? Precedence? Personal preference? Rebelliousness? Arbitrariness? What cannot be shared if you personally would like to consider it a "core"?

To reiterate my example, why is the IMC allowed to be shared without people questioning whether it is a "core" or not? Forcing each "core" to be queued up to communicate with main memory rather than having its own direct link could marginally impact performance. Forcing each "core" in a module to share a branch predictor could marginally impact performance. Why is the first okay, and not the second?
Posted on Reply
#187
THANATOS
devguy
what you quoted was JF saying a core for most people is the integer cluster (ALU, AGU, integer scheduler, and some L1D cache), yet in a core diagram, regardless of whether the architecture is BD, Phenom, or Athlon, they show not just these parts but also the decoder, FPU, dispatch, L2 cache, prefetch, and some other parts. So can you tell me how that is right and not wrong? Based on this I would say these are also parts of a core, and not just the small portion JF mentioned.

I don't know why you are so hung up on the IMC not being dedicated to every single core. What do you say to this: if every core had its own IMC, then in a 4-core CPU every core would have just a 32b bus, instead of a shared IMC where, if not all cores are active, one core can have the full 128b width and not just 1/4 of it. And it's impossible to have 128b for every single core in a 4-core CPU; that would mean 512b-wide memory access. Look at SB-E: it has just 256b memory access, and they had to place memory slots on both sides just so it wouldn't be too complicated or expensive to manufacture.
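The bus-width arithmetic in that argument can be spelled out (a sketch of the numbers quoted above, nothing more):

```python
def per_core_width_bits(total_bus_bits, cores):
    """Width each core would get if one shared memory bus were split
    into dedicated per-core links instead."""
    return total_bus_bits // cores

# Splitting a 128b dual-channel bus into 4 dedicated links:
print(per_core_width_bits(128, 4))   # 32b per core, as the post says

# Giving every core its own full 128b link instead:
print(4 * 128)                       # 512b total, double SB-E's 256b bus
```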
Posted on Reply
#188
[H]@RD5TUFF
seronxThis doesn't look like a defense, lol

I don't want to bring up anything else :p that can make you say my arguments are baseless and/or invalid or simply outright stupid

We know the silicon isn't binned high
3.2 GHz @ 185 W TDP, come on
(I can't explain microcode :confused: is that x86?)
BIOSes plagued a lot of engineering samples with no TC or TC2 (OBR)
Compiler/drivers/OS/performance tuning are usually improved over time
Way to debate context versus substance. My point was and remains that this is all pointless, as you yourself stated engineering samples are different from consumer products.
Posted on Reply
#189
Benetanegia
devguyWhat cannot be shared if you personally would like to consider it a "core"?
This one is easy. The fetch and decode unit. That's the "thinking" part. I have 2 hands, 2 legs, 2 lungs, but only 1 head, and that makes me 1 person, no matter how many pairs of things I have.
To reiterate my example, why is the IMC allowed to be shared without people questioning whether it is a "core" or not? Forcing each "core" to be queued up to communicate with main memory rather than having its own direct link could marginally impact performance. Forcing each "core" in a module to share a branch predictor could marginally impact performance. Why is the first okay, and not the second?
Because a memory controller is what its name implies, a memory controller, which has little to do with what a CPU really is.

All CPU architectures are based on von Neumann's design, which specifies a CPU, main memory, and an I/O module. The three are separate things, whether they are included in the same package or not.

Now CPUs have an integrated memory controller, but that does not make it part of the CPU really; it makes it part of the CPU die. We can say the same about high-level caches, actually: they are on die, but they are NOT the CPU, nor are they part of the CPU.

Or what's next? Will we call every GPU shader processor on a die a core, because it has an ALU? Pff, ridiculous.
Posted on Reply
#190
THANATOS
Benetanegia, you have my thanks :), but instead of "CPU" maybe you could have used "cores" or something, because the memory controller is in a CPU, so it wouldn't be confusing.


P.S. Forget my comment except my thanks :), I am just too sleepy, so I didn't quite grasp the meaning of some words.
Posted on Reply
#191
seronx
[H]@RD5TUFFWay to debate context versus substance, my point was and remains, this is all pointless as you yourself stated engi sample are different versus consumer products.
It is not pointless
Performance will increase from here on, but some people want to know how much, and some of us can help with that

I shot out a number
10% to 30%, very modest to me
As the engineering sample is already good enough for me
BenetanegiaOr what's next? We will call a core to every GPU shader processor on die, because they have an ALU? pff ridiculous.
*cough*Fermi*cough* *cough*16 cores*cough* *cough*512 ALUs*cough*

*cough*Northern Islands*cough* *cough*384 cores*cough* *cough*1536 ALUs*cough*

More or so the AMD GPU than the Nvidia GPU

It's already happening oh noes
Posted on Reply
#192
[H]@RD5TUFF
seronxIt is not pointless
Performance will increase from here on, but some people want to know how much, and some of us can help with that

I shot out a number
10% to 30%, very modest to me
As the engineering sample is already good enough for me
It is pointless, because where are you getting those numbers from? And don't say "I can't tell you"; the first rule of the internet is if you can't prove it, don't post it, because it's wrong. Also, again, AMD says engineering samples are slower than consumer parts, but AMD also inflates its own numbers, just like Intel. I am saying even if it's true, what if the opposite were, and would AMD admit it? ... NO. So I do not know how else to help you understand that, but if you still don't get it, sorry.
Posted on Reply
#193
LAN_deRf_HA
cadavecaDid Core 2 CPUs not have better IPC than AMD chips? Clearly the performance difference, as I've already stated, is not down to memory performance alone.


I mean really, going by SuperPi times alone there, my SB @ 4.9 GHz would be near 3x faster than the BD in the OP. Do I think my SB is 3x faster?

Uh, no?!?


:laugh:


It's merely one in a long list of examples where memory performance matters. Again, F1 2010 is an example of a game (i.e. real-world) that can be impacted quite heavily by memory performance... is it ONLY impacted by memory performance? NO! Are there ways to overcome that problem? You bet!


So, I still fail to see your point, which is why I called you a troll. It's not just about cache. It's not just about memory bandwidth. It's not just about CPU core frequency. Each and every one is important when it comes to performance, and each has its own implications and impacts on performance.

You, on the other hand, are centering on one aspect of how I have formed my opinion on what's important, while ignoring the rest.

So, now that's all said, what was your point again? Maybe you're right and I fail to understand, so why don't you just spell it out for me, please?
Nah, let's not play that game again. Let's talk about your point.

Yes, it's more than memory performance. It's the architecture overall. You act like because Intel's architecture favors Super Pi, it favors all games. It does not. There are games that favor AMD's architecture as well. Because of this you shouldn't be focusing on Super Pi as any sort of performance indicator across platforms. A far better question at this point is just wth your point was supposed to be. You start by saying "Wake me up when AMD can reach these," putting the utmost emphasis on a test that, as it turns out, has no bearing on the overall gaming performance you care so much about. Then you proceed to gradually backtrack and downplay that initial stance as we move through the thread, while expertly misunderstanding what I was saying. Now you get what I mean and you decide to move on to talking about IPC. It's not even about what you're arguing as much as it is about not appearing to be wrong, is it? Arguing with you has been like looking at a funhouse mirror: doesn't matter what the input is, everything you get back is all wonky.

Let me try to explain what's happening here. I feel you have trouble expressing your very rigidly held opinions. A lot of the things you say come off as confusing and poorly defined. These are things I don't respond well to. Then you laugh at people and call them trolls when your confusing statements are challenged. Jackassery is something I don't respond well to either. Both of those together make me very unpleasant. Frankly, I don't think anyone should be expected to be pleasant in the face of that. So, as with the SB overclocking thread, I think I'll just stop visiting this one.
Posted on Reply
#194
seronx
[H]@RD5TUFFIt is pointless, because where are you getting those numbers from? And don't say "I can't tell you"; the first rule of the internet is if you can't prove it, don't post it, because it's wrong. Also, again, AMD says engineering samples are slower than consumer parts, but AMD also inflates its own numbers, just like Intel. I am saying even if it's true, what if the opposite were, and would AMD admit it? ... NO. So I do not know how else to help you understand that, but if you still don't get it, sorry.
Wrong,

The first rule of the internet is

1. Don't annoy someone who has more spare time than you do.

AMD hasn't inflated any numbers

They have only said CMT is a more efficient way of what SMT tries to achieve

SMT = more threads on the die without increasing the amount of cores

CMT = more cores on the die with a 50% die increase from Phenom II

4 x 150 = 600%
6 x 100 = 600%

So, Bulldozer is about the same die size as Thuban, achieving relatively the same performance per core while having more cores
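The die-budget math there can be made explicit (illustrative only; the 150% module size is the poster's claim, not a measured die area):

```python
# Relative die areas, normalized to one Phenom II core = 1.0.
# The 1.5x module figure is the poster's claim, not measured silicon.
MODULE_AREA = 1.5        # one 2-core Bulldozer module
PHENOM_CORE_AREA = 1.0

bulldozer_total = 4 * MODULE_AREA       # 4 modules = 8 cores
thuban_total = 6 * PHENOM_CORE_AREA     # 6-core Thuban

print(bulldozer_total, thuban_total)    # equal area budgets: 8 cores vs 6
```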
Posted on Reply
#195
GenTarkin
This is to cadaveca or anyone else with this mindset:

Why do you rely so heavily on game results / benchmarks to determine your chosen platform?
The hilarious part is, over half the gaming reviews / benchmarks published are pure BS.
Heres how it breaks down in the end:
Intel - sure, you get amazing fps at lower resolutions, and good fps at more normal resolutions with maybe a bit of eye candy turned on. But when it comes down to the actual meat of what matters in a game (min fps), AMD and Intel are VERY, VERY close. Sure, Intel still outreaches AMD in some games in min fps at decent settings... but in the end, Intel vs AMD in gaming is pretty much even when you take into account what's important (min fps).

All these benches showing highest fps, or even avg fps to a lesser extent, are nearly meaningless, because highest fps and avg fps are almost ALWAYS at a decent, playable level, whereas min fps may not always be so playable. So whoever does best when the shit hits the fan is the winner... the problem is, there is hardly a "best" in this case. They are nearly tied in most cases.

The only time it may mean otherwise is if your GPU is so powerful that at any resolution and any hardcore graphics settings it pegs your CPU at 100%, thereby bottlenecking your GPU (especially if the GPU shows usage significantly less than 100%).

So, really, when it comes down to it, AMD vs Intel: both are fast enough to handle nearly any amount of GPU power available today, so stop arguing over it!
Posted on Reply
#196
Benetanegia
seronx*cough*Fermi*cough* *cough*16 cores*cough* *cough*512 ALUs*cough*

*cough*Northern Islands*cough* *cough*384 cores*cough* *cough*1536 ALUs*cough*

More or so the AMD GPU than the Fermi GPU
What's up with all those coughs? You are actually proving my point.

A GPU core is not the same as a CPU core, so it's pointless to make any argument from that. When it comes to functionality, yes, GF100/110 (and not Fermi*) has 16 cores (looking at it from a compute perspective) with 32 ALUs each. In reality each has 2 SIMD ALUs. And this is good, BTW. Why on earth would you say that GF110 has 16 cores and 512 ALUs, when in fact each "core" has two parallel and totally independent execution units (SIMDs)? Why not say that it has 32 "cores" in 16 modules? Because Nvidia chose not to claim that?

And Cayman has 24 "cores", not 384.

* Fermi can have 16, 8, 4... of these so-called "cores" (GF100, GF104, GF106...). I never call them cores anyway. Not even Nvidia calls them cores, as in GPU cores. They call them CUDA cores, and when it comes to CUDA execution they are CUDA cores in many ways, in that each one can take care of 1 CUDA thread.
Posted on Reply
#197
seronx
BenetanegiaAnd Cayman has 24 "cores" not 384.
You've blown my mind; explain. But other than that:

Zambezi is a native 4/6/8-core processor because it has the basic components to be called a 4/6/8-core processor

My point is that most companies base the "core" count on how many ALUs they have or how many executions the ALUs can fart out
Posted on Reply
#198
XoR
seronxSSE performance is pretty high on AMDs
SSE is not magic and it won't solve every performance issue a CPU has. What matters is how the floating-point execution unit works and how well the uOP decoder can feed it, and that combination matters more than whether an instruction is of the x87 variety or SSE.

If AMD had in fact ditched x87 and made it slow in favor of SSE, then the SSE versions of SuperPi floating around the net should show a smaller AMD vs Intel difference. Is it any lower? Or is the SSE performance of Intel CPUs also "pretty high"? :laugh:
seronxSSSE3, SSE4.1, SSE4.2, XOP, CVT16, FMA4, and LWP all increase the performance of the FPU's SSE capabilities

Bulldozer is a generational leap
Lacking such an obvious extension as SSSE3 in a 2011 AMD CPU is quite troubling, and Bulldozer will fix that, which is good for Intel CPUs also :p
Posted on Reply
#199
Benetanegia
seronxYou've blown my mind explain
They are 24 "cores", each composed of a 16-SP-wide SIMD unit. Then each SP on each SIMD has 4 "ALUs".

24 x 16= 384
384 x 4 = 1536
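Spelling out the counting above (figures as stated in the post):

```python
# Cayman's shader hierarchy as described in the post.
SIMD_ENGINES = 24    # the 24 "cores"
SP_PER_SIMD = 16     # each engine is a 16-SP-wide SIMD
ALUS_PER_SP = 4      # VLIW4: 4 ALUs per stream processor

stream_processors = SIMD_ENGINES * SP_PER_SIMD   # 384
alus = stream_processors * ALUS_PER_SP           # 1536

print(stream_processors, alus)
```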
My point is that most companies base the "core" count on how many ALUs they have or how many executions the ALUs can fart out
No, they don't, and if they did, they shouldn't. Every CPU core since the superscalar design was introduced a loooooooooong time ago has had more than 1 ALU per core. So 1 ALU could never be a core.
Posted on Reply
#200
devguy
KRONOSFXI don't know why you are so hung up on the IMC not being dedicated to every single core. What do you say to this: if every core had its own IMC, then in a 4-core CPU every core would have just a 32b bus, instead of a shared IMC where, if not all cores are active, one core can have the full 128b width and not just 1/4 of it. And it's impossible to have 128b for every single core in a 4-core CPU; that would mean 512b-wide memory access. Look at SB-E: it has just 256b memory access, and they had to place memory slots on both sides just so it wouldn't be too complicated or expensive to manufacture.
It's not necessarily that I'm "hung up" on the IMC, nor do I personally believe that every core should have its own IMC. I was simply using it as an example. I think everyone can agree that the Athlon 64 3200+ has a single "core", and that Deneb has four "cores". How many resources were provided per core on the Athlon, that are instead shared on Deneb? Sure, feel free to ask whether or not said resources are actually part of what a CPU really is. However, none of us will have a good answer.
This one is easy. The fetch and decode unit. That's the "thinking" part. I have 2 hands, 2 legs, 2 lungs, but only 1 head and that makes me 1 person. No matter how many pair of things I have.
By all means, you're welcome to feel that is a necessary component of a "core". AMD does not; who's right? Who knows...? However, your analogy is somewhat inapplicable, as a "human" is defined as having the form of a human, and human form is defined as consisting of a head, neck, torso, two arms, and two legs. I'm not aware of any such listing of components for a CPU.
Posted on Reply