
AMD Dragged to Court over Core Count on "Bulldozer"

Joined
Jul 18, 2007
Messages
2,693 (0.44/day)
System Name panda
Processor 6700k
Motherboard sabertooth s
Cooling raystorm block<black ice stealth 240 rad<ek dcc 18w 140 xres
Memory 32gb ripjaw v
Video Card(s) 290x gamer<ntzx g10<antec 920
Storage 950 pro 250gb boot 850 evo pr0n
Display(s) QX2710LED@110hz lg 27ud68p
Case 540 Air
Audio Device(s) nope
Power Supply 750w superflower
Mouse g502
Keyboard shine 3 with grey, black and red caps
Software win 10
Benchmark Scores http://hwbot.org/user/marsey99/
don't stop!

this thread has given me hours of entertainment :D

and some insight tbh :)
 
Joined
Oct 2, 2004
Messages
13,791 (1.93/day)
But again, what matters in the end is performance. AMD opted for this core design. Call them half cores or not true cores all you want, they are cores presented to the system and there are 8 of them. If they don't perform as expected, what the fuck are there 5 trillion review sites for then? Clueless people will get screwed (or shall we say they screw themselves) for not asking the right people or checking reviews. Technically speaking, if a CPU had just 1 core and companies advertised it as such, no one would buy it, even if that single core flat-out destroyed all the multi-core CPUs on the market. Without looking at reviews, you can't possibly tell how well it performs. So, how different is going to the other extreme, 8 cores that supposedly aren't "real" cores?

Intel's HT really can't be called a core on any level, even though I've seen really weird namings of i7 CPUs with HT on the very popular German site Computer Universe. AMD can't just call it a quad core with 6,5 threads. It would confuse the fuck out of users. So they opted for calling cores the way they are presented to the system.

Also, look at the task manager...

[Image: FX8320.jpg]


It's not exactly a tightly kept secret that requires rocket scientists to figure out. 1 processor, 4 cores, 8 logical units. The difference is, those are actually cores, even though the design differs from the one Intel uses. HT on the other hand doesn't look like a core in any way. It's just side logic that tricks the OS into thinking there's another core and lets the CPU stack more computation on the same physical core. It's confusing to casual users, but I wouldn't call it cheating on AMD's end...
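
If you'd rather count it yourself than trust a screenshot, here's a rough sketch that tallies packages, cores and logical processors from Linux-style `/proc/cpuinfo` text. The function name and the sample text are made up for illustration; the field names (`processor`, `physical id`, `core id`) are the ones Linux prints on x86. How a Bulldozer module actually gets reported depends on the OS and kernel version, so treat this as a way to see what the OS claims, not as proof of what the silicon is.

```python
def count_topology(cpuinfo_text):
    """Return (packages, cores, logical processors) from /proc/cpuinfo-style text."""
    packages, cores = set(), set()
    logical = 0
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between processor entries
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            physical_id = value
            packages.add(value)
        elif key == "core id":
            # A core is identified by its package plus its core id.
            cores.add((physical_id, value))
    return len(packages), len(cores), logical


# Hypothetical sample: one package, two logical processors per core id.
SAMPLE = "\n\n".join(
    f"processor\t: {n}\nphysical id\t: 0\ncore id\t: {n // 2}" for n in range(8)
)

print(count_topology(SAMPLE))
```

On a real Linux box you would feed it `open("/proc/cpuinfo").read()` instead of the sample.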
 

Frick

Fishfaced Nincompoop
Joined
Feb 27, 2006
Messages
18,930 (2.85/day)
Location
Piteå
System Name Black MC in Tokyo
Processor Ryzen 5 5600
Motherboard Asrock B450M-HDV
Cooling Be Quiet! Pure Rock 2
Memory 2 x 16GB Kingston Fury 3400mhz
Video Card(s) XFX 6950XT Speedster MERC 319
Storage Kingston A400 240GB | WD Black SN750 2TB |WD Blue 1TB x 2 | Toshiba P300 2TB | Seagate Expansion 8TB
Display(s) Samsung U32J590U 4K + BenQ GL2450HT 1080p
Case Fractal Design Define R4
Audio Device(s) Line6 UX1 + some headphones, Nektar SE61 keyboard
Power Supply Corsair RM850x v3
Mouse Logitech G602
Keyboard Cherry MX Board 1.0 TKL Brown
VR HMD Acer Mixed Reality Headset
Software Windows 10 Pro
Benchmark Scores Rimworld 4K ready!
Intel's HT really can't be called a core on any level, even though I've seen really weird namings of i7 CPUs with HT on the very popular German site Computer Universe. AMD can't just call it a quad core with 6,5 threads. It would confuse the fuck out of users. So they opted for calling cores the way they are presented to the system.

Has nothing to do with the topic, but stores sold the first generation i3/i5/i7 CPUs as CPUs with three, five and seven cores.
 
Joined
Oct 2, 2004
Messages
13,791 (1.93/day)
It has to do with the topic, because what people think of as a 4 core, 8 thread Intel CPU cannot be applied to AMD CPUs. If it says 8 cores, it actually has that many cores. Whether they are as effective as Intel's cores, number vs number, is debatable. And that's why reviews exist. In the end, it doesn't matter whether the number of cores is the same or how effective they are per core or in a multi-core arrangement. You have to see benchmarks in either case.
 

FordGT90Concept

"I go fast!1!11!1!"
Joined
Oct 13, 2008
Messages
26,259 (4.63/day)
Location
IA, USA
System Name BY-2021
Processor AMD Ryzen 7 5800X (65w eco profile)
Motherboard MSI B550 Gaming Plus
Cooling Scythe Mugen (rev 5)
Memory 2 x Kingston HyperX DDR4-3200 32 GiB
Video Card(s) AMD Radeon RX 7900 XT
Storage Samsung 980 Pro, Seagate Exos X20 TB 7200 RPM
Display(s) Nixeus NX-EDG274K (3840x2160@144 DP) + Samsung SyncMaster 906BW (1440x900@60 HDMI-DVI)
Case Coolermaster HAF 932 w/ USB 3.0 5.25" bay + USB 3.2 (A+C) 3.5" bay
Audio Device(s) Realtek ALC1150, Micca OriGen+
Power Supply Enermax Platimax 850w
Mouse Nixeus REVEL-X
Keyboard Tesoro Excalibur
Software Windows 10 Home 64-bit
Benchmark Scores Faster than the tortoise; slower than the hare.

Microsoft would call them cores if they fit the definition of a core.
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
13,147 (2.94/day)
Location
Concord, NH, USA
System Name Apollo
Processor Intel Core i9 9880H
Motherboard Some proprietary Apple thing.
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Logitech G915, GL Clicky
Software MacOS 12.1
L2 is part of the core, huh Ford? I'm pretty sure that Core 2 Duos, having a shared L2, still had individual cores. Might want to work on that diagram a bit instead of posting it incessantly. Just like control logic is part of the core too, huh? Let's stick with facts and less home-made bullshit.
 
Joined
Oct 2, 2004
Messages
13,791 (1.93/day)
The reason why they prefer to use split dedicated caches is to avoid cache thrashing. L3 is so far away it's almost like RAM, so it's not important anymore.
 

FordGT90Concept

"I go fast!1!11!1!"
L2 is part of the core, huh Ford? I'm pretty sure that Core 2 Duos, having a shared L2, still had individual cores. Might want to work on that diagram a bit instead of posting it incessantly. Just like control logic is part of the core too, huh? Let's stick with facts and less home-made bullshit.
A core doesn't share any resources with another core. If an L2 cache is shared between two or more cores, none of the cores can claim it as theirs.

In the case of Bulldozer, the L2 cache is shared between the FPU and the two integer clusters. It is not shared with another core so, as the diagram shows, it is correct. One bulldozer core (containing two integer clusters) includes the L2 cache.


In the case of Core 2 Duo, the L2 cache is shared between two cores so the L2 cache is not part of either core. The two discrete cores (purple background) packaged together with the L2 cache are a module (green square):


Core 2 Quad was created by combining two dual-core modules producing a multi-chip module (MCM) quad-core CPU:



The reason why they prefer to use split dedicated caches is to avoid cache thrashing. L3 is so far away it's almost like RAM, so it's not important anymore.
L3 was added because of the massive performance drop between L2 and RAM. Some processors are getting an L4 cache because of the massive performance drop between L3 and RAM.
 

Aquinus

Resident Wat-man
The reason why they prefer to use split dedicated caches is to avoid cache thrashing. L3 is so far away it's almost like RAM, so it's not important anymore.
Yessir. The hit rates on CPU cache nowadays are nutty high, north of 85-90% in a lot of cases, which probably explains why faster memory doesn't do a whole lot of good.
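
To put rough numbers on that, here's a back-of-the-envelope sketch using the usual average-access-time formula (hit rate times cache latency plus miss rate times RAM latency). All the latencies are invented round figures for illustration, not measurements from any real chip, and the function names are mine.

```python
def average_access_ns(hit_rate, cache_ns, ram_ns):
    """Average latency when hit_rate of accesses are served by the cache."""
    return hit_rate * cache_ns + (1.0 - hit_rate) * ram_ns


def gain_from_faster_ram(hit_rate, cache_ns, slow_ram_ns, fast_ram_ns):
    """Fraction of average access time saved by switching to the faster RAM."""
    slow = average_access_ns(hit_rate, cache_ns, slow_ram_ns)
    fast = average_access_ns(hit_rate, cache_ns, fast_ram_ns)
    return 1.0 - fast / slow


# Assumed figures: 4 ns cache, 80 ns RAM vs 60 ns RAM.
for rate in (0.50, 0.90, 0.99):
    print(rate, round(gain_from_faster_ram(rate, 4.0, 80.0, 60.0), 3))
```

The higher the hit rate, the smaller the share of time spent waiting on RAM, so the same RAM upgrade buys less. Where exactly it stops mattering depends on the real latencies and the workload.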
Some processors are getting an L4 cache because of the massive performance drop between L3 and RAM.
You mean the eDRAM cache? That's strictly for the iGPU if I recall correctly because the only chips that sport it are ones with Iris Pro.
 
Joined
Nov 29, 2011
Messages
5,975 (1.32/day)
Location
Hi! I'm from the Internet
System Name Selene / Yoda
Processor Fx 8350 @ 4.4 / Phenom II x6 1090t @ 3.6
Motherboard Gigabyte 990FXA-UD3 r4.0 / Gigabyte 890XA-UD3
Cooling H100i / Xig Dark Knight
Memory 4x 8gb G.Skill Snipers / 4x 4gb G.Skill Ares
Video Card(s) Gigabyte R9 290x / XfX DD & VisionTek HD6850's C'fired
Storage 256gb ssd, 2x 2tb Wd Blacks & 1x 1tb Wd black / 1x 1tb
Display(s) Dell Ultra Sharp 2408 WFp / Hp w2207
Case Raidmax Vampire / Chieftec Alum. Dragon Blue
Audio Device(s) Onboard Hd Audio / Onboard Hd Audio
Power Supply Corsair TX 850 watt / Corsair TX 750 watt
Mouse Logitech G500s
Keyboard Corsair Strafe
Software Win 10 pro / Win Vista Home prem. 64 bit
Benchmark Scores What are benchmarks anyway?
I think that depends on what version of Windows you are using. Under Win 7, FX-8s show up as 8 CPUs, and under Win 10 they show up as 4 CPUs with 8 threads. I think this was done to help with the performance of AMD processors, but I'm not totally sure on that.
 

Aquinus

Resident Wat-man
I think that depends on what version of Windows you are using. Under Win 7, FX-8s show up as 8 CPUs, and under Win 10 they show up as 4 CPUs with 8 threads. I think this was done to help with the performance of AMD processors, but I'm not totally sure on that.
There is a minor performance hit when using the second core in the module, probably because, as @FordGT90Concept described, the decoder was getting overwhelmed, which is why they added a second one in Steamroller.
 

FordGT90Concept

"I go fast!1!11!1!"
You mean the eDRAM cache? That's strictly for the iGPU if I recall correctly because the only chips that sport it are ones with Iris Pro.
The eDRAM can be used by Iris Pro and the CPU:
http://www.anandtech.com/show/6993/intel-iris-pro-5200-graphics-review-core-i74950hq-tested/3
AnandTech said:
Unlike previous eDRAM implementations in game consoles, Crystalwell is true 4th level cache in the memory hierarchy. It acts as a victim buffer to the L3 cache, meaning anything evicted from L3 cache immediately goes into the L4 cache. Both CPU and GPU requests are cached. The cache can dynamically allocate its partitioning between CPU and GPU use. If you don’t use the GPU at all (e.g. discrete GPU installed), Crystalwell will still work on caching CPU requests. That’s right, Haswell CPUs equipped with Crystalwell effectively have a 128MB L4 cache.
It does not act as a frame buffer for the Iris Pro. Intel hinted that a separate 16-32 MiB ESRAM could be used exclusively for Iris Pro's frame buffer in the future. Skylake-H will likely be getting the same Crystalwell L4 cache as Broadwell. We could see the same Crystalwell cache spring up on even more chips in the future (Kaby Lake, maybe even Cannonlake).
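
For anyone unsure what "victim buffer" means in that quote, here's a toy model: lines kicked out of L3 get parked in L4, and an L4 hit moves the line back into L3. The class name, the LRU replacement and the tiny sizes are all my assumptions to keep it short; it is not how Crystalwell is actually implemented, just the general idea the article describes.

```python
from collections import OrderedDict


class VictimHierarchy:
    """Toy L3 plus victim L4; both LRU, capacities counted in cache lines."""

    def __init__(self, l3_lines, l4_lines):
        self.l3 = OrderedDict()
        self.l4 = OrderedDict()
        self.l3_lines = l3_lines
        self.l4_lines = l4_lines

    def _fill_l3(self, addr):
        self.l3[addr] = True
        if len(self.l3) > self.l3_lines:
            victim, _ = self.l3.popitem(last=False)  # evict the LRU line from L3
            self.l4[victim] = True                   # and park it in L4
            if len(self.l4) > self.l4_lines:
                self.l4.popitem(last=False)          # L4 full: oldest victim is dropped

    def access(self, addr):
        """Return which level served the request: 'L3', 'L4' or 'RAM'."""
        if addr in self.l3:
            self.l3.move_to_end(addr)  # refresh LRU position
            return "L3"
        if addr in self.l4:
            del self.l4[addr]          # line moves back up into L3
            self._fill_l3(addr)
            return "L4"
        self._fill_l3(addr)
        return "RAM"


h = VictimHierarchy(l3_lines=2, l4_lines=4)
print([h.access(a) for a in (1, 2, 3, 1, 3)])
```

Line 1 gets evicted from the two-line L3 when line 3 arrives, so the second access to line 1 is served from L4 instead of going all the way to RAM.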


There is a minor performance hit when using the second core in the module, probably because, as @FordGT90Concept described, the decoder was getting overwhelmed, which is why they added a second one in Steamroller.
Even in Excavator, the prefetch and FPUs are still shared. There's going to be a performance hit from them too. A legitimate dual-core doesn't share those things as demonstrated by the Core 2 Duo and Phenom II block diagrams.


I did some more digging on Core 2 Duo and it appears that neither core can be disabled. Conroe-L (single-core) appears to be a different chip altogether. This makes Core 2 Duo a true module because it has two of everything except L2 and control which makes them inseparable. Bulldozer is not a module because it doesn't have two of everything--it has one of some things. This is why FX-8350 should be considered a quad-core. What was previously understood as a module (complete but inseparable cores) is absent (needs two prefetchers at minimum).
 
Joined
Oct 2, 2004
Messages
13,791 (1.93/day)
I was hoping Skylake would get L4 by default (current i7 6700k for example), but after seeing that it's basically just a smaller i7 5000 series, I didn't bother and opted for more cores with a 5820K instead.
 

cdawall

where the hell are my stars
Joined
Jul 23, 2006
Messages
27,680 (4.27/day)
Location
Houston
System Name All the cores
Processor 2990WX
Motherboard Asrock X399M
Cooling CPU-XSPC RayStorm Neo, 2x240mm+360mm, D5PWM+140mL, GPU-2x360mm, 2xbyski, D4+D5+100mL
Memory 4x16GB G.Skill 3600
Video Card(s) (2) EVGA SC BLACK 1080Ti's
Storage 2x Samsung SM951 512GB, Samsung PM961 512GB
Display(s) Dell UP2414Q 3840X2160@60hz
Case Caselabs Mercury S5+pedestal
Audio Device(s) Fischer HA-02->Fischer FA-002W High edition/FA-003/Jubilate/FA-011 depending on my mood
Power Supply Seasonic Prime 1200w
Mouse Thermaltake Theron, Steam controller
Keyboard Keychron K8
Software W10P
Scaling would show that an FX 8 core has more than 4 cores. Math would say it is physically impossible to say differently.
 

eidairaman1

The Exiled Airman
Joined
Jul 2, 2007
Messages
40,435 (6.58/day)
Location
Republic of Texas (True Patriot)
System Name PCGOD
Processor AMD FX 8350@ 5.0GHz
Motherboard Asus TUF 990FX Sabertooth R2 2901 Bios
Cooling Scythe Ashura, 2×BitFenix 230mm Spectre Pro LED (Blue,Green), 2x BitFenix 140mm Spectre Pro LED
Memory 16 GB Gskill Ripjaws X 2133 (2400 OC, 10-10-12-20-20, 1T, 1.65V)
Video Card(s) AMD Radeon 290 Sapphire Vapor-X
Storage Samsung 840 Pro 256GB, WD Velociraptor 1TB
Display(s) NEC Multisync LCD 1700V (Display Port Adapter)
Case AeroCool Xpredator Evil Blue Edition
Audio Device(s) Creative Labs Sound Blaster ZxR
Power Supply Seasonic 1250 XM2 Series (XP3)
Mouse Roccat Kone XTD
Keyboard Roccat Ryos MK Pro
Software Windows 7 Pro 64
That was Windows XP and XP only has two states: uniprocessor (one thread at a time) and multiprocessor (two or more threads at a time). Multiprocessor could mean two physical sockets with one core each, one socket with two cores, or one physical + one logical processor. It was updated to better handle the three variations.

Bulldozer did the same thing with Vista. Vista (I believe 7 too) called it eight cores because it was incapable of distinguishing them, but that apparently caused problems because updates were released to fix core parking issues. Come Windows 8 and newer, Microsoft updated the operating system to definitively account for sockets, cores, and logical processors, which is where we see 4 cores and 8 logical processors.


CPU-Z doesn't need to schedule threads. Windows does. Microsoft did what they did deliberately so the scheduler best utilizes the processor resources.


Caches have always been tiered. The closer the tier is to the ALUs and FPUs, the faster it is. Caches completely lack logic and there are numerous advantages, and virtually no disadvantages, to sharing caches (the scheduler will allot the cache evenly when the load is even).

There's only a handful of FPUs shared in the computing world outside of Bulldozer (and derivatives) and all of them are set up in a way that resembles a co-processor. That is, it has its own scheduler and all of the cores can queue work to it--effectively its own core. They don't market it as having an extra core though because that would be misleading.
Still on 7 myself.
Yes, I got all the updates plus the core unparker tool. The FX8350 does more than I could imagine.
 

MalakiLab

New Member
Joined
Sep 25, 2016
Messages
11 (0.00/day)
In the case of Core 2 Duo, the L2 cache is shared between two cores so the L2 cache is not part of either core. The two discrete cores (purple background) packaged together with the L2 cache are a module (green square):

Let me show you the Intel Silvermont, C2000, eight cores architecture.
All new Atoms have modules with 2 cores in each, sharing the same L2 cache. Are they liars too?
[Image: 4_17.jpg]



The graphic you made is also completely wrong. It shows that you don't understand OoO, PRF, branch prediction, or resource monitoring. http://www.anandtech.com/show/3922/intels-sandy-bridge-architecture-exposed/3

In short, you don't understand how their microarchitecture works. 95% of the time, the module will work just the same as 2 cores, because both can use the shared resources at the SAME TIME. In most circumstances, it will use both integer cores, and each one will have a 128-bit FMAC with 128-bit integer execution. So they can simultaneously execute most instructions independently without having to wait for their turn like with hyperthreading. Totally different microarchitecture. Things begin to degrade when both floating point pipelines have to be ganged together for a single integer core, to execute a single 256-bit AVX instruction or two symmetrical SSE instructions. Then the entire FPU is taken, leaving no resources for the other integer core. In theory the dispatch controller should give that integer core some instructions that don't need the FPU, by looking in the instruction fetch buffer, and so keep it busy while the other completes the cycles that need the whole FPU. On paper it looks awesome, but it's a very, very complex operation that sadly didn't bring much success. Luckily, those instructions are not used very often. Still, it's a major problem AMD tried to improve in Piledriver, Steamroller and finally Excavator. It was also their way to deal with new instructions and stay competitive.
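
The FPU sharing described above can be sketched as a toy model: two 128-bit FMAC pipes per module, a 128-bit op takes one pipe, a 256-bit AVX op takes both. The function name and the "one FP instruction per thread per cycle, pipes handed out in order" rule are simplifications I made up for illustration; the real scheduler is far more involved, so don't read these as actual throughput figures.

```python
PIPES_PER_MODULE = 2  # two 128-bit FMAC pipes shared by both integer cores
PIPE_BITS = 128


def fp_ops_started(widths):
    """FP instructions one module can start in a cycle (toy model).

    widths: operand width in bits (128 or 256) requested by each active
    thread in the module, at most two entries.
    """
    free = PIPES_PER_MODULE
    started = 0
    for bits in widths:
        need = bits // PIPE_BITS  # 256-bit ops need both pipes
        if need <= free:
            free -= need
            started += 1
    return started


MODULES = 4  # an "8 core" FX has four modules
print("128-bit on all threads:", MODULES * fp_ops_started([128, 128]))
print("256-bit on all threads:", MODULES * fp_ops_started([256, 256]))
```

With 128-bit work both threads in a module get a pipe each; with 256-bit work the first thread grabs both pipes and the second has to wait, which is the blocking case being argued about.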

It's a good technology, but a little too audacious for today's market. Instead of focusing on better IPC, they mostly developed ways to better dispatch instructions. That's why they decided to come back to more traditional microarchitectures and be more competitive IPC-wise. It doesn't change the fact that a module behaves like 2 cores and is in fact 2 cores in a single module. Even Intel agrees with that and is using modules for their Atoms. Maybe we should drag them to court too, no?
 

cdawall

where the hell are my stars
Joined
Jul 23, 2006
Messages
27,680 (4.27/day)
Location
Houston
System Name All the cores
Processor 2990WX
Motherboard Asrock X399M
Cooling CPU-XSPC RayStorm Neo, 2x240mm+360mm, D5PWM+140mL, GPU-2x360mm, 2xbyski, D4+D5+100mL
Memory 4x16GB G.Skill 3600
Video Card(s) (2) EVGA SC BLACK 1080Ti's
Storage 2x Samsung SM951 512GB, Samsung PM961 512GB
Display(s) Dell UP2414Q 3840X2160@60hz
Case Caselabs Mercury S5+pedestal
Audio Device(s) Fischer HA-02->Fischer FA-002W High edition/FA-003/Jubilate/FA-011 depending on my mood
Power Supply Seasonic Prime 1200w
Mouse Thermaltake Theron, Steam controller
Keyboard Keychron K8
Software W10P

FordGT90Concept

"I go fast!1!11!1!"
Let me show you the Intel Silvermont, C2000, eight cores architecture.
All new Atoms have modules with 2 cores in each, sharing the same L2 cache. Are they liars too?
That's an octo-core so Intel is not lying. The compute cores aren't broken up at all--nothing is shared except L2 cache.

A "core" only requires data + instruction cache. Additional caches are added for boosting performance (decreasing the gaps in latency between core and system RAM).

up to 32k = L1
up to 256k = L2
up to 4M = L3
up to 64M = L4 eDRAM in 4950HQ, system RAM otherwise.
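
As a quick lookup, the list above could be written like this. The thresholds are just the round numbers from this post, not the cache sizes of any particular chip, and `tier_for` is a name I made up.

```python
KIB = 1024
MIB = 1024 * KIB


def tier_for(working_set_bytes, has_edram=False):
    """Smallest tier from the list above that can hold the working set."""
    tiers = [(32 * KIB, "L1"), (256 * KIB, "L2"), (4 * MIB, "L3")]
    if has_edram:
        # Only parts with eDRAM get an L4 tier; otherwise it falls through to RAM.
        tiers.append((64 * MIB, "L4"))
    for limit, name in tiers:
        if working_set_bytes <= limit:
            return name
    return "RAM"


print(tier_for(200 * KIB))
print(tier_for(8 * MIB))
print(tier_for(8 * MIB, has_edram=True))
```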

As I specified above, if a quad-core processor has 4 L2 caches, then those L2 caches are part of the core because it is not a shared resource. If the resource is shared (as is the case with Silvermont) then the resource doesn't belong to a core--it's part of the CPU package (like L3, QPI, HyperTransport, memory controller, etc. usually are).


Then the entire FPU is taken, leaving no resources for the other integer core.
This blocking situation is never encountered on Silvermont or Core 2 Duo. If a blocking situation is possible, I'd argue (and have argued) the whole of it is a multithreaded core, not multi-core.

A core can take an instruction and execute the whole of it without sharing any parts with any other processor. Bulldozer and sons, when executing a floating point unit task, do not fit that definition. Silvermont will happily execute eight 256-bit AVX instructions simultaneously across all cores, unlike an FX-8350. It'll do that with ANY instruction because none of the execution hardware is shared.
 

Aquinus

Resident Wat-man
A core can take an instruction and execute the whole of it without sharing any parts with any other processor. Bulldozer and sons, when executing a floating point unit task, do not fit that definition. Silvermont will happily execute eight 256-bit AVX instructions simultaneously across all cores, unlike an FX-8350. It'll do that with ANY instruction because none of the execution hardware is shared.
...but the FPU isn't what did Bulldozer in, it was the reduction in the number of uOps per clock that could be accomplished by either the FPU or the integer cores. Fewer uOps per cycle means that if the bandwidth resources aren't available, full instructions could take more clock cycles to complete which could further harm performance by essentially stalling the pipeline due to these limited resources on each integer core. The net result is relatively garbage performance.

If you look at Intel, all they've been doing is beefing up their cores when it comes to uOp bandwidth.
 
Joined
Feb 8, 2012
Messages
3,013 (0.68/day)
Location
Zagreb, Croatia
System Name Windows 10 64-bit Core i7 6700
Processor Intel Core i7 6700
Motherboard Asus Z170M-PLUS
Cooling Corsair AIO
Memory 2 x 8 GB Kingston DDR4 2666
Video Card(s) Gigabyte NVIDIA GeForce GTX 1060 6GB
Storage Western Digital Caviar Blue 1 TB, Seagate Baracuda 1 TB
Display(s) Dell P2414H
Case Corsair Carbide Air 540
Audio Device(s) Realtek HD Audio
Power Supply Corsair TX v2 650W
Mouse Steelseries Sensei
Keyboard CM Storm Quickfire Pro, Cherry MX Reds
Software MS Windows 10 Pro 64-bit
If you look at Intel, all they've been doing is beefing up their cores when it comes to uOp bandwidth.
Additionally, let's not forget how late AMD is in introducing a uOp cache, only now with Zen, almost 6 years after Intel's Sandy Bridge ... I don't know how much, but the absence of a uOp cache in Bulldozer should also contribute to lower total net uOps/cycle.
 

FordGT90Concept

"I go fast!1!11!1!"
...but the FPU isn't what did Bulldozer in, it was the reduction in the number of uOps per clock that could be accomplished by either the FPU or the integer cores. Fewer uOps per cycle means that if the bandwidth resources aren't available, full instructions could take more clock cycles to complete which could further harm performance by essentially stalling the pipeline due to these limited resources on each integer core. The net result is relatively garbage performance.

If you look at Intel, all they've been doing is beefing up their cores when it comes to uOp bandwidth.
That's irrelevant. What is relevant is that if the FX-8350 had 8 FPUs (one to go with each integer core like a traditional core), its multithreaded FPU performance would be better because there would no longer be any chance for blocking. The lawsuit is about AMD calling it an "8 core" processor when it is an "8 integer core" processor. AMD does not make that distinction on the box or in marketing material. It has misled the public by selling 4 multithreaded cores as 8. It would be akin to Intel calling the i7-6700 an "8 core" processor. It doesn't matter that AMD shored up the simultaneous multithreading in Bulldozer and sons with extra hardware for a performance boost. It's still a quad-core when you throw heavy FPU loads at it and they sold it as an eight-core.
 

Frick

Fishfaced Nincompoop
That's irrelevant. What is relevant is that if the FX-8350 had 8 FPUs (one to go with each integer core like a traditional core), its multithreaded FPU performance would be better because there would no longer be any chance for blocking. The lawsuit is about AMD calling it an "8 core" processor when it is an "8 integer core" processor. AMD does not make that distinction on the box or in marketing material. It has misled the public by selling 4 multithreaded cores as 8. It would be akin to Intel calling the i7-6700 an "8 core" processor. It doesn't matter that AMD shored up the simultaneous multithreading in Bulldozer and sons with extra hardware for a performance boost. It's still a quad-core when you throw heavy FPU loads at it and they sold it as an eight-core.

Is there a universal definition of an x86 core though? They could have handled it better, but I wouldn't say they were lying.
 

FordGT90Concept

"I go fast!1!11!1!"
AMD pretty much established it with Athlon 64 X2 and Intel followed suit with Pentium D: two processors, one die. The only anomaly is Bulldozer and sons.

The only other modern exception, which I believe @Aquinus pointed out earlier, was SPARC processors for databases. In that case, the FPU is practically a separate core unto itself (8:1 ratio) because databases usually don't have to deal with floating-point operations. If the cores encountered floating-point work, they'd farm it out to the floating-point core and wait for a response.
 

newtekie1

Semi-Retired Folder
Joined
Nov 22, 2005
Messages
28,472 (4.23/day)
Location
Indiana, USA
Processor Intel Core i7 10850K@5.2GHz
Motherboard AsRock Z470 Taichi
Cooling Corsair H115i Pro w/ Noctua NF-A14 Fans
Memory 32GB DDR4-3600
Video Card(s) RTX 2070 Super
Storage 500GB SX8200 Pro + 8TB with 1TB SSD Cache
Display(s) Acer Nitro VG280K 4K 28"
Case Fractal Design Define S
Audio Device(s) Onboard is good enough for me
Power Supply eVGA SuperNOVA 1000w G3
Software Windows 10 Pro x64
Is there a universal definition of an x86 core though? They could have handled it better, but I wouldn't say they were lying.

Simply, if it can execute all the instructions in the x86, or in this case x86_64, instruction set, then it is an x86_64 core. You don't need an FPU to execute any of the instructions in the basic x86_64 instruction set, it just helps performance greatly for some of them.

Intel followed suit with Pentium D: two processors, one die.

Yeah, the Pentium D wasn't two processors on one die...oh and the Core 2 Quad wasn't 4 processors on 1 die either.
 

Aquinus

Resident Wat-man
I think that when push comes to shove, the core count isn't really what people are pissed off about. This is all about the lackluster performance of these CPUs and I think that this is just a facade for that. No one ever said 8 cores had to be fast. :laugh:
 