Wednesday, January 22nd 2014

AMD Debuts New 12- and 16-Core Opteron 6300 Series Processors

AMD today announced the immediate availability of its new 12- and 16-core AMD Opteron 6300 Series server processors, code named "Warsaw." Designed for enterprise workloads, the new AMD Opteron 6300 Series processors feature the "Piledriver" core and are fully socket and software compatible with the existing AMD Opteron 6300 Series. The power efficiency and cost effectiveness of the new products are ideal for the AMD Open 3.0 Open Compute Platform - the industry's most cost effective Open Compute platform.

Driven by customers' requests, the new AMD Opteron 6338P (12 core) and 6370P (16 core) processors are optimized to handle the heavily virtualized workloads found in enterprise environments, including the more complex compute needs of data analysis, xSQL and traditional databases, at optimal performance per-watt, per-dollar.

"With the continued move to virtualized environments for more efficient server utilization, more and more workloads are limited by memory capacity and I/O bandwidth," said Suresh Gopalakrishnan, corporate vice president and general manager, Server Business Unit, AMD. "The Opteron 6338P and 6370P processors are server CPUs optimized to deliver improved performance per-watt for virtualized private cloud deployments with less power and at lower cost points."

The new AMD Opteron 6338P and 6370P processors are available today through Penguin and Avnet system integrators and have been qualified for servers from Sugon and Supermicro at a starting price of $377 and $598, respectively. More information can be found on AMD's website.

48 Comments on AMD Debuts New 12- and 16-Core Opteron 6300 Series Processors

#1
Breit
by: Aquinus
Might read into APUs again. There are benefits to be had by having HUMA on an APU, which solves the memory copying problem.
True, but the performance of a 7970 on an APU is not going to happen any time soon, I guess...
#2
james888
by: Breit
True, but the performance of a 7970 on an APU is not going to happen any time soon, I guess...
I just used a 7970 since I have a 7970 and knew its DP off hand. The architectural potential is there.

I got curious and looked for what DP an APU can get. This is the only thing I can find on the current A10-7850K, but it's from WCCFtech, so who knows how valid it is. 5800K on left, 7850K on right. Overclocked.
#3
eidairaman1
I've come to notice that all businesses will go with the cheapest parts available, plus most companies, let alone people, don't know who AMD is.

But I say this is really good news for them; now they just need to make the 8-core desktop parts more efficient.
#4
Aquinus
Resident Wat-man
by: eidairaman1
I've come to notice that all businesses will go with the cheapest parts available, plus most companies, let alone people, don't know who AMD is.

But I say this is really good news for them; now they just need to make the 8-core desktop parts more efficient.
Yeah! They need something to compete with Intel's 8 core Atom SoC. 20-watt TDP for an 8-core SoC isn't too shabby. There is a slower variant that offers lower clocks and less power usage but still retains 8 cores as well. I kind of want one.

by: Breit
True, but the performance of a 7970 on an APU is not going to happen any time soon, I guess...
As long as PCI-E is your bus and you have memory that is completely segregated from the CPU, you're going to have that issue. Remember how gimped Intel CPUs were when they used an MCH and how the CPU needed to communicate with the MCH to get anything out of memory. As soon as the memory controller was moved next to the CPU cores, memory access speeds started flying and latency dropped like a rock. The issue is that no software can take advantage of having stream processors and CPU cores both working on the same data. Sharing data between different CPU cores is problematic enough, forget sharing it with an array of SIMD cores.
#5
HumanSmoke
by: NC37
I dunno. In multithreading AMD was beating Intel. Xeons are another story
Well, in this instance since the article concerns Opteron, Xeon would actually be the story worth considering as counterpoint.

by: NC37
but when it comes to price for the performance. That I'd be interested to see.
Undoubtedly, but then AMD are obviously going to make a concession on processor upgrade pricing in order to make ageing C32/G34 platforms at least somewhat palatable.
[Chart source]
#6
Thefumigator
by: Assimilator
And, sadly, the Xeons will still beat the ever living crap out of these.
Except in the price factor, maybe.
#7
FordGT90Concept
"I go fast!1!11!1!"
by: Aquinus
Intel uses unused resources in the CPU to get extra multi-threaded performance. AMD added extra hardware for multi-threaded performance as opposed to using just the extra resources available. A module and a single core with HT each have costs and benefits of their own. With an Intel CPU, the second thread doesn't have nearly as much processing power as the first thread does, whereas with AMD, the second "thread" (or "core," if you will) brings much more tangible gains than an HT thread does.
In the testing I did, disabling HTT really crippled my 920 and as far as software is concerned, performance from the logical cores is expected to match performance from the physical cores. There's really no discernible difference between them.

It may bog down faster than AMD's SMT implementation, but comparing Intel Xeon 6-core processors to AMD Opteron 6-core processors really doesn't show that to be the case either. Put bluntly, there's really no evidence to support the claim that AMD's SMT is any better than Intel's SMT.
#8
Thefumigator
by: FordGT90Concept
In the testing I did, disabling HTT really crippled my 920 and as far as software is concerned, performance from the logical cores is expected to match performance from the physical cores. There's really no discernible difference between them.
HT got better and better since day one. The 920 is a monster of a CPU (lovely, yes), but I really can't believe an HT core matches the performance of the physical core associated with it.
I mean, if you make use of core 0 up to 100%, then core 1 should drop performance -quite- dramatically while core 2 wouldn't be affected at all.
#9
buildzoid
by: Thefumigator
HT got better and better since day one. The 920 is a monster of a CPU (lovely, yes), but I really can't believe an HT core matches the performance of the physical core associated with it.
I mean, if you make use of core 0 up to 100%, then core 1 should drop performance -quite- dramatically while core 2 wouldn't be affected at all.
It doesn't. When I turn HT off on my 3960X, I lose at worst 40% of my multi-threaded performance, but it boosts single-threaded performance a little (1% maybe), as it takes load off of the data management part of the CPU.
#10
Aquinus
Resident Wat-man
by: FordGT90Concept
In the testing I did, disabling HTT really crippled my 920 and as far as software is concerned, performance from the logical cores is expected to match performance from the physical cores. There's really no discernible difference between them.

It may bog down faster than AMD's SMT implementation, but comparing Intel Xeon 6-core processors to AMD Opteron 6-core processors really doesn't show that to be the case either. Put bluntly, there's really no evidence to support the claim that AMD's SMT is any better than Intel's SMT.
A while back I did some testing with my i7 and started disabling cores, with HT both on and off, to see what the performance difference between 4c/4t and 2c/4t would be. In all honesty, the numbers don't agree with you. I can try and find it again, but generally speaking, hyper-threading didn't yield much more than a 30% improvement over a real core.

I'm curious, how did you test the performance of your CPU between disabling/enabling hyper threading?

Edit: Here, I found it.
#11
FordGT90Concept
"I go fast!1!11!1!"
by: Thefumigator
HT got better and better since day one. The 920 is a monster of a CPU (lovely, yes), but I really can't believe an HT core matches the performance of the physical core associated with it.
I mean, if you make use of core 0 up to 100%, then core 1 should drop performance -quite- dramatically while core 2 wouldn't be affected at all.
Because that's not how it works. It functions a lot like virtualization where the physical core is never exposed to the operating system. Instead, there are four physical cores handling eight virtual cores--each physical core is responsible for two virtual cores. The physical cores are designed to work on each virtual core up to 50% of the time. This is why a quad core with SMT behaves very much like a slower eight physical core processor without SMT when handling heavy multithreaded loads.
by: buildzoid
It doesn't. When I turn HT off on my 3960X, I lose at worst 40% of my multi-threaded performance, but it boosts single-threaded performance a little (1% maybe), as it takes load off of the data management part of the CPU.
Yeah, the tests were multithreaded (four or eight threads). I didn't test anything single threaded.

by: Aquinus
A while back I did some testing with my i7 and started disabling cores, with HT both on and off, to see what the performance difference between 4c/4t and 2c/4t would be. In all honesty, the numbers don't agree with you. I can try and find it again, but generally speaking, hyper-threading didn't yield much more than a 30% improvement over a real core.

I'm curious, how did you test the performance of your CPU between disabling/enabling hyper threading?

Edit: Here, I found it.

It looks like they got it fixed which is good (30% is about where it should be). My 920 is three generations older than yours.

My test method was this custom application. Basically, it runs var++ for 1 second on however many threads you specify, using whichever variable type you choose, and reports the value each thread reached. It repeats this 10 times and gives you the total along with the results for every thread. It shows the basic compute power of any given processor.
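The application itself isn't shown in the thread, so the following is only a rough sketch of the method as described: each thread increments a counter until a deadline, and the run is repeated a set number of times. All names here (`count_for`, `run_benchmark`, `total_count`) are made up for illustration. Two caveats: the loop checks the clock on every iteration, so it is not a pure var++, and CPython's global interpreter lock keeps threads from running in parallel, so this shows the structure of the test rather than a usable measurement of core scaling.

```python
import threading
import time


def count_for(duration_s, results, index):
    """Increment a local counter until the deadline, then store the count."""
    deadline = time.perf_counter() + duration_s
    count = 0
    while time.perf_counter() < deadline:
        count += 1
    results[index] = count


def run_benchmark(num_threads, duration_s=1.0, repeats=10):
    """Return one list of per-thread counts for each repeat."""
    all_runs = []
    for _ in range(repeats):
        results = [0] * num_threads
        threads = [
            threading.Thread(target=count_for, args=(duration_s, results, i))
            for i in range(num_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        all_runs.append(results)
    return all_runs


def total_count(all_runs):
    """Sum every thread's count over every repeat."""
    return sum(sum(run) for run in all_runs)
```

Comparing `total_count(run_benchmark(4))` with `total_count(run_benchmark(8))`, once with HT enabled and once with it disabled, would mirror the comparison described above. A real measurement would need a language or runtime whose threads actually execute in parallel.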

Is it even possible to disable SMT on AMD processors? Repeating the test you did with an AMD processor would tell us definitively what sort of difference SMT makes on them.
#12
xenocide
by: Aquinus
I would like everyone to remember what the equivalent Xeon is at that price point. I'm willing to bet that the Opteron is more cost effective, considering a 10 Core Xeon starts at 1600 USD, I think everything needs to be put into perspective. I would rather take two 16c Opterons than a single 10c Xeon, but that's just me.
The biggest problem with that is that the Opteron solutions tend to consume a lot more power. A Xeon (the benchmarks I see are mostly for E5-2660s) will consume about 95W, and a similar Opteron (in terms of performance, the Opteron 6380) consumes 151W under full load. That's a staggering difference. I would say over about a year the difference is made up in terms of cost, but the additional performance of the Xeon is not. People need to stop looking exclusively at initial investment costs and start considering things like heat generation, power consumption, and performance over time given a set cost.
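For a sense of scale, the power gap quoted above can be turned into a yearly electricity bill. This is only a sketch: the 95W and 151W figures come from the comment, while the electricity price (`price_per_kwh`) and the assumption of constant full load all year are illustrative placeholders, not data from the thread. Cooling overhead is left out.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_energy_cost_gap(watts_a, watts_b, price_per_kwh):
    """Yearly cost difference (B minus A) at constant full load.

    price_per_kwh is a hypothetical electricity rate; plug in your own.
    """
    extra_kwh = (watts_b - watts_a) * HOURS_PER_YEAR / 1000
    return extra_kwh * price_per_kwh


# Xeon at 95 W vs. Opteron 6380 at 151 W, at an assumed $0.10 per kWh
gap = annual_energy_cost_gap(95, 151, price_per_kwh=0.10)
print(f"Extra cost per socket per year: ${gap:.2f}")
```

At that assumed rate the 56W gap works out to about 490 kWh, or roughly $49 per socket per year, so whether it offsets a price difference within a year depends heavily on the local rate, the cooling overhead, and the size of the price gap.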
#13
xorbe
And 7-zip is likely memory bound, making it an ideal case for HT.
#14
Thefumigator
by: FordGT90Concept
Because that's not how it works. It functions a lot like virtualization where the physical core is never exposed to the operating system. Instead, there are four physical cores handling eight virtual cores--each physical core is responsible for two virtual cores. The physical cores are designed to work on each virtual core up to 50% of the time. This is why a quad core with SMT behaves very much like a slower eight physical core processor without SMT when handling heavy multithreaded loads.
I disagree... I think you are wrong or I am misunderstanding you. HT will enable virtual cores, we agree on that. Windows will see those as cores; Windows won't care if they are real or virtual, and if a core is there and free, it will assign any task or process to it. Performance, on the other hand, will be impacted as soon as you use the virtual cores... do you mean that core 1 is not the "virtual core" from core 0?

by: FordGT90Concept

Is it even possible to disable SMT on AMD processors? Repeating the test you did with an AMD processor would tell us definitively what sort of difference SMT makes on them.
You can't disable half a "module", but you can still use the affinity option in Windows. On my FX-8320, after installing the Windows patch, processes began being assigned differently across the CPU.
#15
xorbe
by: Thefumigator
You can't disable half a "module"
There was definitely an ASUS motherboard with a BIOS that would allow just that.
#17
buildzoid
by: xenocide
The biggest problem with that is that the Opteron solutions tend to consume a lot more power. A Xeon (the benchmarks I see are mostly for E5-2660s) will consume about 95W, and a similar Opteron (in terms of performance, the Opteron 6380) consumes 151W under full load. That's a staggering difference. I would say over about a year the difference is made up in terms of cost, but the additional performance of the Xeon is not. People need to stop looking exclusively at initial investment costs and start considering things like heat generation, power consumption, and performance over time given a set cost.
You're wrong, and here is why:
A. The Xeon 2697 v2 pulls 130W, not 95W, so two of them will pull almost as much (260W) as three of these Opterons (297W).
B. The Opterons in the news pull 99W, not 151W.
C. The Cinebench R11.5 chart shows performance with perfect HT scaling, so if you're using the server for data management and tasks that use the same part of the CPU over and over again, the Xeons will be 40% slower than what Cinebench shows.
D. The Xeon 2697 v2 costs $2,100 more than one of these new Opterons, a difference so big that the Xeon won't close the price gap any time soon, definitely not in a year or two.
E. The Xeon 2660 costs $700 more than the Opterons in the news and pulls 95W while being barely faster than the old Opterons, which the new ones will either match or beat, so again the price gap won't be closed in less than two years.
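Points D and E can be checked with a quick break-even calculation. The sketch below uses the price gaps and wattages from the list above; the electricity rate `RATE` is an assumption, not a figure from the thread, and the function ignores cooling overhead and any difference in performance.

```python
HOURS_PER_YEAR = 24 * 365


def break_even_years(price_gap, watts_cheap, watts_pricey, price_per_kwh):
    """Years until the pricier CPU's power savings repay its price premium.

    Returns None when the pricier CPU does not draw less power, since the
    gap can then never close on electricity alone.
    """
    watts_saved = watts_cheap - watts_pricey
    if watts_saved <= 0:
        return None
    yearly_savings = watts_saved * HOURS_PER_YEAR / 1000 * price_per_kwh
    return price_gap / yearly_savings


RATE = 0.10  # assumed $/kWh, not a figure from the thread

# Point E: Xeon 2660 ($700 more, 95 W) vs. new Opteron (99 W)
print(break_even_years(700, 99, 95, RATE))
# Point D: Xeon 2697 v2 ($2,100 more, 130 W) vs. new Opteron (99 W)
print(break_even_years(2100, 99, 130, RATE))
```

Under these assumptions the 4W advantage in point E would take roughly 200 years to repay $700, and in point D the pricier Xeon draws more power, so the function returns `None`.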
#18
FordGT90Concept
"I go fast!1!11!1!"
by: Thefumigator
I disagree... I think you are wrong or I am misunderstanding you. HT will enable virtual cores, we agree on that. Windows will see those as cores; Windows won't care if they are real or virtual, and if a core is there and free, it will assign any task or process to it. Performance, on the other hand, will be impacted as soon as you use the virtual cores... do you mean that core 1 is not the "virtual core" from core 0?
Assuming virtual core 0 and 1 are assigned to physical core 0, virtual core 2 and 3 are assigned to physical core 1, and so on and then you run 4 heavy threads on even numbered virtual cores and shift it to odd number virtual cores, the performance for both tests will be more or less equal. The physical core itself prioritizes threads from each virtual core and it tries to give each virtual core about equal processor time. This is why, from the software perspective, virtual core or physical core is moot.
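The layout assumed in that reply can be written down as a tiny mapping, which makes it easier to see why the even and odd placements are equivalent. This is a sketch of the assumed numbering only (virtual cores 0 and 1 on physical core 0, and so on); real operating systems and firmware may enumerate sibling threads differently, and the function names are made up for illustration.

```python
def physical_core(virtual_core, threads_per_core=2):
    """Map a virtual core index to its physical core under the assumed layout."""
    return virtual_core // threads_per_core


def physical_cores_used(virtual_cores):
    """Distinct physical cores touched by a set of virtual cores."""
    return sorted({physical_core(v) for v in virtual_cores})


even_placement = [0, 2, 4, 6]
odd_placement = [1, 3, 5, 7]
paired_placement = [0, 1, 2, 3]

print(physical_cores_used(even_placement))    # [0, 1, 2, 3]
print(physical_cores_used(odd_placement))     # [0, 1, 2, 3]
print(physical_cores_used(paired_placement))  # [0, 1]
```

Both the even and the odd placement put one heavy thread on each of the four physical cores, which is why they perform alike. The paired placement squeezes four threads onto two physical cores, which is the case where sharing a core costs performance.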

by: FX-GMC
Their results show that if you were running an FX 8 core as a quad core, it is better to disable one core per module rather than disabling two whole modules.
Well, yeah...
Normal: 8 ALUs, 4 FPUs
Disable half modules: 4 ALUs, 4 FPUs
Disable two modules: 4 ALUs, 2 FPUs
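The three configurations above follow from a simple rule, sketched here using the same simplified counting as the list: one "ALU" per enabled integer core and one shared FPU per enabled module. The function name and the four-module FX layout are assumptions for illustration.

```python
def active_units(modules_enabled, cores_per_module):
    """Return (ALUs, FPUs) using the thread's simplified counting.

    One ALU per enabled integer core, one shared FPU per enabled module.
    """
    alus = modules_enabled * cores_per_module
    fpus = modules_enabled
    return alus, fpus


print(active_units(4, 2))  # normal: (8, 4)
print(active_units(4, 1))  # one core per module disabled: (4, 4)
print(active_units(2, 2))  # two modules disabled: (4, 2)
```

Both quad-core setups keep four ALUs, but disabling whole modules also halves the FPUs, which fits the result that disabling one core per module is the better way to run the chip as a quad core.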

I just wonder how much of a performance hit it takes in a generic benchmark between the first two scenarios (half of the ALUs disabled). HTT loses about 30-35%. Looking at the URL, WinRAR is the closest to what Aquinus posted, and 8 ALU/4 FPU scores about 47.6% higher than 4 ALU/4 FPU. That's slightly better than what Aquinus got, but again, that's more of a memory benchmark than a compute power benchmark.

Edit: One of the users gave a range of: 33-59%. For HTT, it looks anywhere from 2-33%: http://semiaccurate.com/2012/04/25/does-disabling-hyper-threading-increase-performance/

Another article largely mirrors these results:
http://www.extremetech.com/computing/133121-maximized-performance-comparing-the-effects-of-hyper-threading-software-updates

I guess the moral of the story is that an AMD module struggles to keep up with an Intel core, HTT enabled or not. This is sad.
#19
Aquinus
Resident Wat-man
by: FordGT90Concept
I guess the moral of the story is that an AMD module struggles to keep up with an Intel core, HTT enabled or not. This is sad.
I wouldn't call it sad. Intel's current micro-architecture has evolved a bit since the Core and Core 2. AMD's design just isn't as mature. AMD CPUs keep up. They might not be better, but they're adequate. I wouldn't really call that sad.

What is sad is how AMD doesn't have more low-power CPUs. For example, Intel has an 8c/8t Atom now; it's an SoC and has a 20-watt TDP. I don't mean to contradict you, but performance isn't what's sad about AMD CPUs lately. In all honesty, if AMD CPUs were a bit lighter on power, we probably wouldn't care as much about single-threaded performance being lacking.
#20
cyneater
by: Thefumigator
Except in the price factor, maybe.
That's about the only thing AMD can win at now: the price factor.
#21
FordGT90Concept
"I go fast!1!11!1!"
by: Aquinus
I wouldn't call it sad. Intel's current micro-architecture has evolved a bit since the Core and Core 2. AMD's design just isn't as mature. AMD CPUs keep up. They might not be better, but they're adequate. I wouldn't really call that sad.

What is sad is how AMD doesn't have more low-power CPUs. For example, Intel has an 8c/8t Atom now; it's an SoC and has a 20-watt TDP. I don't mean to contradict you, but performance isn't what's sad about AMD CPUs lately. In all honesty, if AMD CPUs were a bit lighter on power, we probably wouldn't care as much about single-threaded performance being lacking.
But that's my point. Except for price, there's nowhere AMD wins. Intel has higher performance, less heat output, and lower power consumption. When it comes to servers and HPC, where Opterons are found, upfront cost is not a selling point because Xeons save money through lower power and cooling bills. The only situation where AMD wins is if you only have X amount of money to spend right now and AMD is below that threshold while Intel is not.
#22
Breit
Just the right article posted yesterday: http://www.anandtech.com/show/7711/floating-point-peak-performance-of-kaveri-and-other-recent-amd-and-intel-chips
It is no secret that AMD's Bulldozer family cores (Steamroller in Kaveri and Piledriver in Trinity) are no match for recent Intel cores in FP performance due to the shared FP unit in each module. As a comparison point, one core in Haswell has the same floating point performance per cycle as two modules (or four cores) in Steamroller.
That means an AMD CPU needs four times the core count to equal an Intel CPU clock-for-clock in FP performance! That makes this 16-core Opteron exactly as fast as an ordinary Intel quad-core clock-for-clock in FP performance! Didn't expect that... :)
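The arithmetic behind that comparison is short enough to write out. This sketch takes the quoted ratio at face value (one Haswell core matching two modules, or four cores, per cycle) and applies it to the 16-core part. Note that the quote is about Steamroller while these Opterons use Piledriver cores, so carrying the ratio over is the commenter's extrapolation, and the calculation ignores clock speed differences entirely.

```python
# From the quoted claim: 1 Haswell core = 2 modules = 4 AMD cores, per cycle
AMD_CORES_PER_HASWELL_CORE = 4


def haswell_equivalent_cores(amd_cores):
    """Peak per-cycle FP throughput expressed in Haswell cores."""
    return amd_cores / AMD_CORES_PER_HASWELL_CORE


print(haswell_equivalent_cores(16))  # 4.0
print(haswell_equivalent_cores(12))  # 3.0
```

This is a peak per-cycle figure only; real workloads rarely sustain peak FP throughput on either architecture.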
#23
Thefumigator
by: FordGT90Concept

I just wonder how much of a performance hit it takes in a generic benchmark between the first two scenarios (half of the ALUs disabled).
I guess the moral of the story is that an AMD module struggles to keep up with an Intel core, HTT enabled or not. This is sad.
I will test it out with my own benchmark software when I get home.