Wednesday, August 28th 2019

AMD to Cough Up $12.1 Million to Settle "Bulldozer" Core Count Class-Action Lawsuit

AMD has reached a settlement in the Class Action Lawsuit filed against it over alleged false marketing of the core counts of its eight-core FX-series processors based on the "Bulldozer" microarchitecture. Each member of the Class receives a one-time payout of USD $35 per chip, while the company takes a hit of $12.1 million. The lawsuit dates back to 2015, when Tony Dickey, representing himself in the U.S. District Court for the Northern District of California, accused AMD of falsely marketing its FX-series "Bulldozer" processor as having 8 CPU cores. Over the following four years, the case gained traction, and a Class Action was built against AMD this January.

In the months that followed the January set-up of a 12-member Jury to examine the case, lawyers representing the Class and AMD argued over the underlying technology that makes "Bulldozer" a multi-core processor, and eventually discussed what a fair settlement would be for the Class. They settled on a number - $12.1 million, or roughly $35 per chip AMD sold, which they agreed was "fair," and yet significantly less than the "$60 million in premiums" consumers contended they paid for these processors. Sifting through these numbers, it's important to understand what the Class consists of: U.S. consumers who chose to be part of the Class Action, and who bought an 8-core processor based on the "Bulldozer" microarchitecture. It excludes buyers of every other "Bulldozer" derivative (4-core and 6-core parts, APUs, and follow-ups to "Bulldozer" such as "Piledriver," "Excavator," etc.).
Image Credit: Taylor Alger Source: The Register
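
The settlement figures quoted in the article can be sanity-checked with a few lines of arithmetic. The sketch below is only a rough illustration: it assumes the per-chip payout is a plain division of the fund (no legal fees or administration costs deducted), and the function names are made up for this example.

```python
# Back-of-the-envelope check of the settlement figures quoted in the article.
# Assumption: the per-chip payout is a simple division of the fund, with no
# deduction for legal fees or administration costs.

SETTLEMENT_USD = 12_100_000       # total settlement, from the article
PAYOUT_PER_CHIP_USD = 35          # approximate payout per chip, from the article
CLAIMED_PREMIUM_USD = 60_000_000  # premiums consumers contended they paid


def implied_chip_count(fund: int, per_chip: int) -> int:
    """Number of chips the fund covers at the quoted per-chip payout."""
    return fund // per_chip


def settlement_share(fund: int, claimed: int) -> float:
    """Settlement expressed as a fraction of the claimed premium."""
    return fund / claimed


if __name__ == "__main__":
    chips = implied_chip_count(SETTLEMENT_USD, PAYOUT_PER_CHIP_USD)
    share = settlement_share(SETTLEMENT_USD, CLAIMED_PREMIUM_USD)
    print(f"Implied chips covered: {chips:,}")
    print(f"Share of claimed premium: {share:.1%}")
```

Under these assumptions the fund covers a few hundred thousand chips and amounts to about a fifth of the premium the plaintiffs claimed.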

288 Comments on AMD to Cough Up $12.1 Million to Settle "Bulldozer" Core Count Class-Action Lawsuit

#276
Vya Domus
Certainly not when the water is stale.

Bulldozer has 8 cores. You've only speculated about what you think a core is by constantly inventing new definitions and rules outside the subject and context in which this was discussed; that's a big difference. And sources? Don't make me laugh, you don't get to say that when your response to actual material that proved my point was "they lied".
#278
seronx
FordGT90Concept, post: 4107883, member: 60463"
So you see the problem? Bulldozer "execution cores" lack the hardware to decode AMD64 instructions, which is a function of the "core" (aka processor). "Execution cores" as defined in Bulldozer lack the hardware necessary to be considered a "core:" they are merely "execution units." ...and these are the wheels that turn the gears of false advertising.
"More specifically, this invention relates to processors that convert an x86 instructions into RISC-type operations for execution on a RISC-type core."
"The core of the processor is a RISC superscalar processing engine."
"The heart of the AMD-K6 processor is a RISC core known as the enhanced RISC86 microarchitecture."
"The AMD-K5 processor’s superscalar RISC core consists of six execution units: two arithmetic logic units (ALU), two load/store units, one branch unit, and one floating-point unit (FPU). This superscalar core is fully decoupled from the x86 bus through the conversion of variable-length x86 instructions into simple, fixed-length RISC operations (ROPs) that are easier to handle and execute faster. Once the x86 instruction has been converted, a dispatcher issues four ROPs at a time to the superscalar core. The processor’s superscalar core can execute at a peak rate of six ROPs per cycle. The superscalar core supports data forwarding and data bypassing to immediately forward the results of an execution to successive instructions."
"AMD-K6 MMX Processor : High-performance RISC core : Yes / 6-issue (RISC86)"
"The execution engine implements a superscalar, out-of-order, reduced instruction set computing (RISC) architecture"
"The dual instruction decoders translate X86 instructions on-the-fly into corresponding RISC86 Ops. The RISC86 Ops are executed by an instruction core that is essentially a RISC superscalar processing engine."
"From the viewpoint of packing multiple primitive operations into a coarser schedulable unit and performing schedule and execution of macro-ops, the proposed microarchitecture employs a counter approach to recent x86 processor implementations that crack a CISC instruction and convert it into multiple RISC semantics running on RISC-style cores."
"As a coarser-grained approach in the opposite direction, the AMD K7 and the Intel Pentium M have adopted techniques to allow an issue queue entry to accommodate multiple micro-ops as a form of fused operations for certain types of x86 instructions. Original micro-ops are loosely coupled in a fused operation from the scheduler’s perspective; they are scheduled individually according to the readiness of corresponding source operands."
So AMD went from RISC86 ops, which were a poor fit for x86, to macro-ops, which fit x86 better.

"AMD-K7 ™ Processor Architecture => Three Parallel x86 Instruction Decoders => Decoding Pipelines can dispatch 3 MacroOps to Execution Unit Schedulers, Load / Store Queue Unit => Result Busses from Core"
^-- this one is the most intriguing as the only mention of a core is for the LSU slide.


"The AMD Athlon processor microarchitecture is a decoupled decode/execution design approach. In other words, the decoders essentially operate independent of the execution units, and the execution core uses a small number of instructions and simplified circuit design for fast single-cycle execution and fast operating frequencies."

Then, K8 happens... oh dear...
"The AMD64 architecture employs a decoupled decode/execution design approach. In other words, decoders and execution units essentially operate independently; the execution core uses a small number of instructions and simplified circuit design for fast single-cycle execution and fast operating frequencies."
"The AMD Athlon 64 and AMD Opteron processors implement the AMD64 instruction set by means of micro-ops—simple fixed-length operations designed to include direct support for AMD64 instructions and adhere to the high-performance principles of fixed-length encoding, regularized instruction fields, and a large register set. The enhanced microarchitecture enables higher processor core performance and promotes straightforward extensibility for future designs"

That is all dandy, but then it explodes: CPU cores, cores, processor cores, etc. Which isn't the core they originally defined; "This superscalar core is fully decoupled from the x86 bus through the conversion of variable-length x86 instructions into simple, fixed-length RISC operations (ROPs) that are easier to handle and execute faster."

"A processor core for supporting the concurrent execution of mixed integer and floating point operations includes integer functional units utilizing 32-bit operand data and a floating point functional unit utilizing up to 82-bit operand data."
https://patentimages.storage.googleapis.com/ff/c4/e7/7da222a99f9ccb/US5574928-drawings-page-11.png
"FIG. 13 is a schematic diagram of a layout of a mixed floating point/integer processor core"
#279
FordGT90Concept
"I go fast!1!11!1!"
Bravo! TL;DR: that's the story of how the x86 front-end is interpreted into a series of microOPs which are executed by the execution units.

Spoiler: Technical mumbo jumbo
seronx, post: 4107982, member: 86156"
That is all dandy, but then it explodes: CPU cores, cores, processor cores, etc. Which isn't the core they originally defined; "This superscalar core is fully decoupled from the x86 bus through the conversion of variable-length x86 instructions into simple, fixed-length RISC operations (ROPs) that are easier to handle and execute faster."
Exactly why we have two definitions of "core" now and why it is imperative to declare which is being discussed, especially in marketing materials. In this quote (without extra context), it sounds like they're referring to an execution unit as a "superscalar core." P5 was the first superscalar x86 architecture and, if you look at what it had (MMX, FPU, ALU-Y, ALU-U), it's easy to understand why they went with a superscalar approach:

I make it clear what is what here:

You put two execution units in a core, you get two execution units, not two cores. The reason why everyone went with SMT now is because SMT has all of the benefits of more execution units without putting a physical wall between them. Wider execution units are preferable to many execution units...at least when dealing with x86...because the odds of being able to saturate all of the pipelines in the execution unit are better. This means greater efficiency.
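
To put rough numbers on that saturation argument, here is a minimal toy model, not a description of any real chip. It assumes two threads that each have a random number of ready micro-ops per cycle, and compares two narrow execution units (each thread limited to its own pipelines) against one wide unit shared SMT-style. The pipeline counts and the name `simulate` are hypothetical.

```python
import random
from typing import Tuple

PIPELINES_TOTAL = 4  # hypothetical number of execution pipelines on the chip
MAX_READY_OPS = 4    # hypothetical cap on ready micro-ops per thread per cycle


def simulate(cycles: int, seed: int = 0) -> Tuple[int, int]:
    """Return (ops executed with split units, ops executed with one shared unit)."""
    rng = random.Random(seed)
    per_unit = PIPELINES_TOTAL // 2
    split_total = 0
    shared_total = 0
    for _ in range(cycles):
        ready_a = rng.randint(0, MAX_READY_OPS)
        ready_b = rng.randint(0, MAX_READY_OPS)
        # Two narrow units: each thread can only use its own pipelines.
        split_total += min(ready_a, per_unit) + min(ready_b, per_unit)
        # One wide unit shared by both threads (SMT-style).
        shared_total += min(ready_a + ready_b, PIPELINES_TOTAL)
    return split_total, shared_total


if __name__ == "__main__":
    split, shared = simulate(100_000)
    peak = 100_000 * PIPELINES_TOTAL
    print(f"split units utilization: {split / peak:.1%}")
    print(f"shared unit utilization: {shared / peak:.1%}")
```

In this model the shared unit can never do worse than the split one, because a thread with spare work can borrow pipelines the other thread leaves idle. Real hardware has many other constraints (scheduling, cache contention, port types) that this sketch ignores.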

The picture above is in line with the terms AMD settled with: "core" includes fetcher to L1 data cache.

If you disagree, I remind you that Kumar et al. (likely the inspiration for Bulldozer) described the design as a "conjoined-core," intentionally omitting the plural form of core. That is not a mistake and the hyphen makes it clear which definition of the word "core" is used for the purpose of the technical paper.

Going back to K8...that was actually mirrored to become the first dual-core x86 processor...and AMD put it on the box loud and proud:


I actually found the product PDF too, straight off of AMD's servers:
Athlon 64 X2 Dual-Core Product Data Sheet

Looky what it says about "dual-core:"

AMD explicitly *includes* L2 (off to the left) in their definition of "core" which means the definition of "core" goes far beyond just the "execution unit" (left part of the gray area):

The only parts of this image that AMD effectively excluded from the definition of "core" are SRQ, XBar, and IMC.

See what happened there? AMD contradicted themselves with Bulldozer and the motive is clear (over-represent their product).


Spoiler: Intel concurs with AMD Athlon 64 X2 definition of core
As for Intel? Well, they were never so brash as to put "#-core" on a box as AMD did. They usually like to hide it behind model numbers and fine print. Anyway, let's have a look at Pentium D which is two MCM'd Pentium 4s:
https://ark.intel.com/content/www/us/en/ark/products/27512/intel-pentium-d-processor-820-2m-cache-2-80-ghz-800-mhz-fsb.html
When you click the little (?) next to "# of cores" it says: "Cores is a hardware term that describes the number of independent central processing units in a single computing component (die or chip)." Remember how I said Core 2 Quad has four processors? That's basically Intel's words, innit? Intel has been extremely consistent in describing their "cores" as discrete processors networked together since the beginning.
#280
seronx
FordGT90Concept, post: 4108069, member: 60463"
See what happened there? AMD contradicted themselves with Bulldozer and the motive is clear (over-represent their product).
No, AMD contradicted themselves with K8. What K8 defines as a core is a processor. Whereas what Bulldozer defines as a core is a core.

K8 is a dual-processor CPU with each processor having a single core.
Family 15h 70h-7Fh is a single-processor CPU with that one processor having two cores.

Bulldozer uses the older definition, which has priority, rather than K8's, which seems to be a marketing gaffe.

Even Intel, like AMD, called it the core:
"In-Order Front End -> Its job is to supply a high-bandwidth stream of decoded instructions to the out-of-order execution core, which will do the actual completion of the instructions. These IA-32 instruction bytes are then decoded into basic operations called uops (micro-operations) that the execution core is able to execute."

"The P6 microarchitecture is made up of in-order front end, out-of-order core and in-order retirement units.

The front end includes Instruction Fetch, Instruction Decode, Branch Target Buffer, Micro-instruction Sequencer, and Register Address Table units. The out-of-order core is made up several execution units; the units include Floating Point Execution units, Integer Execution units, and Address Generation units. The in-order retirement back end includes the Re-order Buffer and the Register Retirement File units."

"Intel ® Microarchitecture Code Name Sandy Bridge:
-> An in-order issue front end that fetches instructions and decodes them into micro-ops (micro-operations). The front end feeds the next pipeline stages with a continuous stream of micro-ops from the most likely path that the program will execute.
-> An out-of-order, superscalar execution engine that dispatches up to six micro-ops to execution, per cycle. The allocate/rename block reorders micro-ops to "dataflow" order so they can execute as soon as their sources are ready and execution resources are available.
-> An in-order retirement unit that ensures that the results of execution of the micro-ops, including any exceptions they may have encountered, are visible according to the original program order.

The out-of-order core consist of three execution stacks, where each stack encapsulates a certain type of data. The execution core contains the following execution stacks:
• General purpose integer
• SIMD integer and floating-point
• X87"

Intel used AMD's misdefinition in Core Duo, Core Quad, and pretty much everything relating to multi-processors. However, the docs point to the out-of-order portion of the Intel processor as the core.

Multi-core processor => AMD's Family 15h(two cores)/Sun's Rock(four cores)
Multi-processors with single-cores => K8, Core Duo, Core 2 Quad, Athlon/Phenom X4, etc.
Multi-processors with multi-cores => Family 15h Model 00h-0Fh(8 cores)/Sun's rock(16 cores)
#281
FordGT90Concept
"I go fast!1!11!1!"
seronx, post: 4108272, member: 86156"
K8 is a dual-processor CPU with each processor having a single core.
That is not what AMD put on the box. It says "dual-core" not "dual-processor." They confirm the two are one and the same in the product data sheet.

They not only said the same thing on Phenom II but they added "true quad-core design:"

Remember why they did that? Because AMD's marketeers felt the Core 2 Duo "module" wasn't really a "dual-core" because of the shared L2 cache, which AMD processors didn't share until Bulldozer (among many other things), if memory serves.

Hilarious, isn't it? It's almost like AMD kept changing the definition of "core" to suit themselves. Because they did. The only difference is that AMD's moving goal post didn't really mean anything to consumers until Bulldozer because they were pushing this notion that consumers were getting twice the number of cores as they really were. That's an argument that can be taken to court for damages--and they were.


As I pointed out: 2005 established the definition of what a "dual-core processor" was and AMD and Intel were united on that front. Intel has been consistent since; AMD has not: dual-core (Athlon 64 X2) = processor replication -> "true" dual-core (Phenom) = no sharing L2 cache (this was and remains a stupid argument) -> conjoined-core (Bulldozer) = really one core but we're going to sell it as two -> quad-core (Zen) = processor replication. I think it's fairly safe to say that this ridiculous AMD chapter is closed for good.

seronx, post: 4108272, member: 86156"
Even Intel, like AMD, called it the core:
"In-Order Front End -> Its job is to supply a high-bandwidth stream of decoded instructions to the out-of-order execution core, which will do the actual completion of the instructions. These IA-32 instruction bytes are then decoded into basic operations called uops (micro-operations) that the execution core is able to execute."

"The P6 microarchitecture is made up of in-order front end, out-of-order core and in-order retirement units.

The front end includes Instruction Fetch, Instruction Decode, Branch Target Buffer, Micro-instruction Sequencer, and Register Address Table units. The out-of-order core is made up several execution units; the units include Floating Point Execution units, Integer Execution units, and Address Generation units. The in-order retirement back end includes the Re-order Buffer and the Register Retirement File units."

"Intel ® Microarchitecture Code Name Sandy Bridge:
-> An in-order issue front end that fetches instructions and decodes them into micro-ops (micro-operations). The front end feeds the next pipeline stages with a continuous stream of micro-ops from the most likely path that the program will execute.
-> An out-of-order, superscalar execution engine that dispatches up to six micro-ops to execution, per cycle. The allocate/rename block reorders micro-ops to "dataflow" order so they can execute as soon as their sources are ready and execution resources are available.
-> An in-order retirement unit that ensures that the results of execution of the micro-ops, including any exceptions they may have encountered, are visible according to the original program order.

The out-of-order core consist of three execution stacks, where each stack encapsulates a certain type of data. The execution core contains the following execution stacks:
• General purpose integer
• SIMD integer and floating-point
• X87"

Intel used AMD's misdefinition in Core Duo, Core Quad, and pretty much everything relating to multi-processors. However, the docs point to the out-of-order portion of the Intel processor as the core.

Multi-core processor => AMD's Family 15h(two cores)/Sun's Rock(four cores)
Multi-processors with single-cores => K8, Core Duo, Core 2 Quad, Athlon/Phenom X4, etc.
Multi-processors with multi-cores => Family 15h Model 00h-0Fh(8 cores)/Sun's rock(16 cores)
Quotations without citations are plagiarism.

Your Sandy Bridge quote yet again makes my point for me: they quit calling it a "core" because, since 2005, that has meant exclusively a processor; they instead call it a "superscalar execution engine," which eliminates the confusion. Call anything that isn't a complete processor inside of a multi-processor chip a "core" today, and beware of the lawyers.
#282
seronx
FordGT90Concept, post: 4108289, member: 60463"
That is not what AMD put on the box. It says "dual-core" not "dual-processor." They confirm the two are one and the same in the product data sheet.

They not only said the same thing on Phenom II but they added "true quad-core design:"
The definition of a processor includes a core. So, regardless, a processor must have a core, and four processors would always net at minimum four cores.

Phenom X4/Phenom II X6 can be referred to by either the processor count or the core count. It would still be correct. However, it isn't a true quad-core design for the reason you give. It is a monolithic quad-core design, whereas the Core 2 Quad was two monolithic dual-core designs.

"Remember why they did that? Because AMD's marketeers felt the Core 2 Duo "module" wasn't really a "dual-core" because of the shared L2 cache with AMD processors didn't share until Bulldozer (among many other things), if memory serves."
As stated above, it was a jab at Intel for their MCM design.



Intel's Xeon is a true 28-core/28-processor design, while AMD's EPYC is only an 8-core/8-processor design * 4.

Relative to x86/x86-64, the Bulldozer compute unit processor microarchitecture is the first true x86 dual-core processor. Athlon X2, for example, is two single-core processors that are glued together.

K8/10h/12h processor microarchitecture doesn't include two cores; only AMD Bulldozer's compute unit processor microarchitecture contains two cores. Of the two, only Bulldozer can keep the native dual-core architecture. So, if AMD comes out with an octo-core processor microarchitecture, then neither FX nor Ryzen can be said to be natively octo-core.

With AMD's Stoney Ridge, it can be called a native/true/etc. dual-core processor and it will convey the message well. However, AMD for Raven2 would definitely want to market it as a design of two single-core processors glued together through the shared L3 cache. Raven2 isn't natively or truly a dual-core processor, as it is two replicated single-core processors which are glued together via L3.

Raven2 without SMT => Core 0 writes X in an FPU register, Core 1 is dependent on X for an FPU operation. 100s to 1000s of cycles from PRF(core 0) -> L1(core 0) -> L2(core 0) -> L3(shared) -> L2(core 1) -> L1(core 1) -> PRF(core 1) -> execution.
^- non-native dual-core

Stoney with CMT => Core 0 writes X in an FPU register, Core 1 is dependent on X for an FPU operation. A few cycles from renaming -> PRF-tag with core 0 -> PRF-tag with core 1 -> execution.
^- native dual-core
#283
FordGT90Concept
"I go fast!1!11!1!"
You're missing the point. Again. AMD explicitly said that K8 "core" included L1 and L2 caches which includes front end, execution units, L2 cache, floating point units, and everything in between. Intel mirrored that definition since Pentium D.

This thing you like calling a "core" has a dozen names that fundamentally mean the same thing, but it is no longer singularly called a "core"; it is always prefaced with a descriptor (superscalar, out of order, execution, integer, etc.). "Core" by itself, especially in marketing, has meant one thing and one thing alone since 2005: a processor which includes the front end, the various types of units, and sometimes L2+ caches. It's hardware that receives x86 instructions, processes them, and returns the full result.

This feels like a merry-go-round. It's settled. I'm done here.
#284
seronx
FordGT90Concept, post: 4108341, member: 60463"
AMD explicitly said that K8 "core" included L1 and L2 caches which includes front end, execution units, L2 cache, floating point units, and everything in between. Intel mirrored that definition since Pentium D.
The earlier definition has priority or precedence in this case. However, neither definition is technically correct. A core is only a control unit, a datapath, an instruction bus, and a data bus.

The control unit in K7/K8/10h/12h is called the instruction control unit, which is directly interconnected with the 3-wide OoO integer datapath and remotely interconnected with the 3-wide OoO floating point datapath. The instruction bus and data bus are easily confirmed: can instructions flow in, and can data flow out? Boom, it's a core.

K7 core is instruction control + integer datapath
K8 core is instruction control + integer datapath
Family 10h/12h core is instruction control + integer datapath
Family 15h core is instruction control + integer datapath, of which Family 15h's compute unit has TWO.
etc.

L1 cache, L2 cache, FPU, Front-end, etc are not part of the core.
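
The two competing definitions in this thread can be written down as two counting rules. The following is only an illustrative sketch: the `Module` and `Chip` types are hypothetical, and the block counts are the commonly described layouts (an eight-core FX as four modules, each with one shared front end, two integer clusters, and one shared FPU; an Athlon 64 X2 as two fully replicated processors).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One replicated building block of a chip (hypothetical model)."""
    front_ends: int        # fetch/decode blocks
    integer_clusters: int  # instruction control + integer datapath
    fpus: int              # floating-point units


@dataclass(frozen=True)
class Chip:
    name: str
    modules: int
    module: Module


def cores_by_integer_cluster(chip: Chip) -> int:
    """seronx's rule: a core is instruction control plus an integer datapath."""
    return chip.modules * chip.module.integer_clusters


def cores_by_complete_processor(chip: Chip) -> int:
    """FordGT90Concept's rule: a core spans fetcher to L1 data cache,
    so clusters that share one front end count as a single core."""
    return chip.modules * chip.module.front_ends


fx_8 = Chip("eight-core FX (Bulldozer)", modules=4,
            module=Module(front_ends=1, integer_clusters=2, fpus=1))
athlon_x2 = Chip("Athlon 64 X2 (K8)", modules=2,
                 module=Module(front_ends=1, integer_clusters=1, fpus=1))

if __name__ == "__main__":
    for chip in (fx_8, athlon_x2):
        print(chip.name,
              cores_by_integer_cluster(chip),
              cores_by_complete_processor(chip))
```

The two rules agree on the Athlon 64 X2 and disagree only on the FX part, which is exactly where the marketing dispute sits.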
#285
Keviny Oliveira
FordGT90Concept, post: 4107875, member: 60463"
"cores" which share resources in a "conjoined-core." These do not qualify as "processors."
But not make sense, lol, If the processor has 2 cores him is a dual core, even though it has lower performance than a 1-core processor, it is logical that FXs cores are non-logical physical, so the FX 8300 is an octa-core where certain tasks have the same performance as a quad core.
#286
R-T-B
FordGT90Concept, post: 4107901, member: 60463"
Remember how Sun designed a conjoined-core on steroids? Why do you think they never released it?
Probably because Oracle bought them out and released it as one of the many shared FPU sparcs at the time...

Keviny Oliveira, post: 4114365, member: 190162"
But not make sense, lol, If the processor has 2 cores him is a dual core, even though it has lower performance than a 1-core processor, it is logical that FXs cores are non-logical physical, so the FX 8300 is an octa-core where certain tasks have the same performance as a quad core.
Wut
#288
Aquinus
Resident Wat-man
I've read this thread somewhere else... and I'm still not convinced, because they're still the same flawed arguments predicated on "what has been in a core," not "what actually defines a core." On top of that, such a rigid definition of a "core" is only going to tie new designs to an archaic way of looking at things just for the purposes of legal bullshit.

Honestly, I don't think this has anything to do with whether they're cores or not. That's just the legal argument. I think this is really about some rich idiot who feels duped because he didn't understand what he was buying. A normal person isn't going to buy a CPU and then sue because it wasn't good enough; most people are bright enough to look at reviews and make a judgment call themselves. Most people also aren't willing to invest the resources in suing, because it takes money to do that.

I honestly think this entire debate is despicable.

I also think the last group of people who are qualified to make this call are armchair warriors.