AMD Orochi ''Bulldozer'' Die Holds 16 MB Cache

btarunr · Sep 24, 2010

Documents related to "Orochi", AMD's 8-core processor based on its next-generation Bulldozer architecture, reveal a cache hierarchy that comes as a bit of a surprise. Earlier this month, at a GlobalFoundries-hosted conference, AMD displayed the first die shot of Orochi, which legibly showed key features including the four Bulldozer modules, which hold two cores each, and large L2 caches. On coarse visual inspection, the L2 cache of each module seems to cover 35% of its area. The L3 cache is located along the center of the die. The documents seen by X-bit Labs reveal that each Bulldozer module has its own 2 MB L2 cache shared between its two cores, and that an 8 MB L3 cache is shared between all four modules (8 cores).

This takes the total cache amount of Orochi all the way up to 16 MB (four 2 MB L2 caches plus the 8 MB L3). This hierarchy suggests that AMD wants to give individual cores access to a large amount of faster cache (that's a whopping 2048 KB, compared to 512 KB per core on Phenom and 256 KB per core on Core i7), which facilitates faster inter-core, intra-module communication. Inter-module communication is enhanced by the 8 MB L3 cache. Compared to the current "Istanbul" six-core K10-based die, that's a 77% increase in cache amount for a 33% core count increase, and a 300% increase in the L2 cache a core can access. Orochi is built on a 32 nm GlobalFoundries process; it is sure to have a very high transistor count.
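The percentages above can be checked with a few lines of arithmetic. This is only a sketch: the Orochi figures come from the article, while the "Istanbul" baseline (512 KB of L2 per core plus a 6 MB shared L3) is an assumption that the article does not spell out, and `percent_increase` is a helper name made up for the illustration.

```python
# Orochi figures as quoted in the article (sizes in MB).
OROCHI_MODULES = 4
OROCHI_CORES = 8
OROCHI_L2_PER_MODULE = 2.0
OROCHI_L3 = 8.0

# Assumed "Istanbul" baseline (not stated in the article):
# six cores, 512 KB of L2 per core, 6 MB of shared L3.
ISTANBUL_CORES = 6
ISTANBUL_L2_PER_CORE = 0.5
ISTANBUL_L3 = 6.0


def percent_increase(old, new):
    """Return the increase from old to new, as a percentage of old."""
    return (new - old) / old * 100.0


orochi_total = OROCHI_MODULES * OROCHI_L2_PER_MODULE + OROCHI_L3
istanbul_total = ISTANBUL_CORES * ISTANBUL_L2_PER_CORE + ISTANBUL_L3

cache_increase = percent_increase(istanbul_total, orochi_total)
core_increase = percent_increase(ISTANBUL_CORES, OROCHI_CORES)

# L2 that a single core can reach: the whole 2 MB module cache.
l2_reachable_increase = percent_increase(ISTANBUL_L2_PER_CORE, OROCHI_L2_PER_MODULE)

# L2 per core if the module cache is split evenly between its two cores.
l2_even_share = OROCHI_L2_PER_MODULE / 2
l2_even_share_increase = percent_increase(ISTANBUL_L2_PER_CORE, l2_even_share)

print(f"Orochi total cache:   {orochi_total:.0f} MB")
print(f"Istanbul total cache: {istanbul_total:.0f} MB (assumed)")
print(f"Cache increase:       {cache_increase:.1f}%")
print(f"Core count increase:  {core_increase:.1f}%")
print(f"L2 reachable by one core: +{l2_reachable_increase:.0f}%")
print(f"L2 per core, even split:  +{l2_even_share_increase:.0f}%")
```

Note that the 300% figure only holds when the full 2 MB module cache is counted for each core; with an even split between the two cores of a module, the increase is 100%.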


b82rez · Sep 24, 2010

BL GG Intel Fanboys. AMD is back! :nutkick:

bpgt64 · Sep 24, 2010

I'll believe it's a performance gain when I see the benchmarks. Regardless of which side you take, competition is always good for the consumer.

KainXS · Sep 24, 2010

Wait for benchmarks before you start that; we've been through that before with AMD.

wolf · Sep 24, 2010

b82rez said:
BL GG Intel Fanboys. AMD is back!

silly

cache isn't everything, reviews pretty much are.

ebolamonkey3 · Sep 24, 2010

2011 is shaping up to be quite an interesting year

Completely Bonkers · Sep 24, 2010

I remember the "massive cache" Gallatin P4's over Northwood. Didn't make more than 5% difference clock for clock except in very special circumstances.

So let's wait for benchmarks.

I would have thought there would be better gains by rethinking cache and memory entirely, possibly producing a separate socket for L3 cache just like in the old days. It would be so much cheaper to do it that way, you could easily pack 256MB cache. Yes, the latency would be worse than current on-die L3 cache, but with the space, heat and transistors saved, you could bump up L1 and L2 cache and win back any performance losses. Plus you could build your L3 cache to order.

DaMulta · Sep 24, 2010

That's it????? I wait for the day with 16 cores with 64MB of Cache

dir_d · Sep 24, 2010

Well, it seems Bulldozer is going to be faster when communicating with memory and other cores. I think if AMD just did that to a Phenom II chip it would speed it up significantly. I really can't wait to see Bulldozer in action.

bear jesus · Sep 24, 2010

I would hope more, faster cache could be a good thing, but the main thing I'm interested in is how each module performs. I'm really thinking about getting a high-end Sandy Bridge or Bulldozer to last me a couple of years or so, and that means I want as many and as fast cores as possible, as I would hope that over the next few years more software will use more cores.

Rebelstar · Sep 24, 2010

I'm a total noob in CPU technologies, but I think 16 MB of cache is freaking cool, right?

xaira · Sep 24, 2010

btarunr said:
it is sure to have a very high transistor count.

So does Fermi. I hope AMD has the TDP under control, otherwise Sandy Bridge will kick butt.

bear jesus · Sep 24, 2010

Rebelstar said:
I'm a total noob in CPU technologies, but I think 16 MB of cache is freaking cool, right?

It could be if put to use well, but the cores are really important. Either way, we won't know until the reviews, really.

devguy · Sep 24, 2010

One design win I really commend AMD for is their use of dynamic cache allocation between the "cores" on a module. While many assume the sharing of cache (and other items like the FPU) will hurt single-threaded performance, that really isn't the case. When only one core is active per module, it has complete control over all the resources; thus a single core will have 2 MB of L2 cache at its disposal! Also, when both cores on a module are active, they can inequitably share the resources (i.e. one core with 0.5 MB of L2 and another with 1.5 MB of L2 is possible). Very cool technology.

For Bulldozer, there will be the option to have the OS prefer loading one core per module (like cores 1, 3, 5, 7) rather than just filling them up by modules (1, 2, 3, 4). Both have benefits and faults: the first route has higher performance, but also higher power consumption; the second would be the exact opposite.

As far as the sharing of the FPU, in servers it will make hardly any difference. In the desktop segment, AMD argues that should you be doing something that takes up so much FPU performance to slow down our modules, then you should be doing it on the GPU instead.
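The two core-loading preferences described above can be sketched as two fill orders. This is only an illustration, assuming 4 modules of 2 cores each with cores numbered 1 to 8 so that cores 1 and 2 sit on the first module; the function names are hypothetical and this is not how any actual OS scheduler is implemented.

```python
def packed_order(modules=4, cores_per_module=2):
    """Fill both cores of a module before moving on to the next module."""
    return [
        m * cores_per_module + c + 1
        for m in range(modules)
        for c in range(cores_per_module)
    ]


def spread_order(modules=4, cores_per_module=2):
    """Use one core on every module first, then come back for the second cores."""
    return [
        m * cores_per_module + c + 1
        for c in range(cores_per_module)
        for m in range(modules)
    ]


def active_modules(order, threads, cores_per_module=2):
    """Count how many modules are powered up when `threads` cores are in use."""
    used = order[:threads]
    return len({(core - 1) // cores_per_module for core in used})


if __name__ == "__main__":
    for name, order in (("spread", spread_order()), ("packed", packed_order())):
        print(name, order[:4], "modules in use:", active_modules(order, 4))
```

With four threads, the spread order keeps all four modules busy, so each thread has a module's shared resources to itself, while the packed order uses only two modules and leaves the other two idle, which is the performance versus power trade-off described above.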

cadaveca · Sep 24, 2010

I like this news. I have been saying for a couple of years now that AMD's cache design needed to change, and here they are doing something about it. That makes me even more interested in Bulldozer tech.

bear jesus · Sep 24, 2010

devguy said:
One design win I really commend AMD for is their use of dynamic cache allocation between the "cores" on a module. While many assume the sharing of cache (and other items like the FPU) will hurt single-threaded performance, that really isn't the case. When only one core is active per module, it has complete control over all the resources; thus a single core will have 2 MB of L2 cache at its disposal! Also, when both cores on a module are active, they can inequitably share the resources (i.e. one core with 0.5 MB of L2 and another with 1.5 MB of L2 is possible). Very cool technology.

For Bulldozer, there will be the option to have the OS prefer loading one core per module (like cores 1, 3, 5, 7) rather than just filling them up by modules (1, 2, 3, 4). Both have benefits and faults: the first route has higher performance, but also higher power consumption; the second would be the exact opposite.

As far as the sharing of the FPU, in servers it will make hardly any difference. In the desktop segment, AMD argues that should you be doing something that takes up so much FPU performance to slow down our modules, then you should be doing it on the GPU instead.

I never knew it would be set up like that. It kind of makes me even more sure I want to wait for Bulldozer for my next full upgrade, so that if it is a good CPU at a good price I can go for one, or if not, I can get something from Sandy Bridge a little cheaper (hoping price drops will come over the time waited and, if the consumer is lucky, price drops that come with or after Bulldozer).

cheezburger · Sep 24, 2010

No surprise. They are trying to fix the single-thread performance hit caused by the smaller L1 data/instruction caches. Each core "only" has 8 KB of L1 data, while the instruction cache is shared per module and is just 64 KB "2-way" (it could be less... I think...), which is roughly 40 KB per core compared to Core's 64 KB per core. Big disadvantage. So all they can do is add more L3 cache to increase performance, or at least hope it doesn't drop, without tweaking too much on an existing architecture that has already taped out and is going to be released in 3 months. Intel did the same thing when it realized Northwood's poor L1 cache would drag down performance: it increased the L2 cache from 256 KB to 512 KB. However, Orochi is an 8-module, 16-core processor, so featuring 16 MB of L3 means each core can use up to 1 MB of L3, still way below Nehalem's 2 MB per core. Also, unlike Intel's architecture, AMD's cache is heavily determined by the pipeline stages; a shorter pipeline won't take advantage of a bigger cache. But since Bulldozer will feature 4+ GHz, I doubt there will be at least a 20+ stage pipeline in this processor. But despite all these features, as long as Intel decides to increase Ivy Bridge's L2 cache from 256 KB per core to 512 KB per core, AMD will experience the same horror they faced when Core 2 came out.

HTC · Sep 24, 2010

I wonder how hot these CPUs will get ...

ROad86 · Sep 24, 2010

cheezburger said:
No surprise. They are trying to fix the single-thread performance hit caused by the smaller L1 data/instruction caches. Each core "only" has 8 KB of L1 data, while the instruction cache is shared per module and is just 64 KB "2-way" (it could be less... I think...), which is roughly 40 KB per core compared to Core's 64 KB per core. Big disadvantage. So all they can do is add more L3 cache to increase performance, or at least hope it doesn't drop, without tweaking too much on an existing architecture that has already taped out and is going to be released in 3 months. Intel did the same thing when it realized Northwood's poor L1 cache would drag down performance: it increased the L2 cache from 256 KB to 512 KB. However, Orochi is an 8-module, 16-core processor, so featuring 16 MB of L3 means each core can use up to 1 MB of L3, still way below Nehalem's 2 MB per core. Also, unlike Intel's architecture, AMD's cache is heavily determined by the pipeline stages; a shorter pipeline won't take advantage of a bigger cache. But since Bulldozer will feature 4+ GHz, I doubt there will be at least a 20+ stage pipeline in this processor. But despite all these features, as long as Intel decides to increase Ivy Bridge's L2 cache from 256 KB per core to 512 KB per core, AMD will experience the same horror they faced when Core 2 came out.

First, Orochi is a 4-module, 8-core design. Second, it's not only the size that matters but also how fast the cache is. Third, it is very important how the instruction prediction will work; if the design is good, then you don't need a big L1 cache, which increases cost and die size. And yes, 2 MB per module, or 1 MB per core, is the amount that Bulldozer will have.

mechtech · Sep 24, 2010

I want one, a server version with 8 or 16 GB of ecc ram

I don't know why though since I don't even work 1 core on my 955BE

cadaveca · Sep 24, 2010

HTC said:
I wonder how hot these CPUs will get ...

Very hot... apparently we'll see a clock speed decrease (which I assume is due to the high levels of cache), but IPC will increase. I'm kinda expecting 2.4 GHz or so... maybe lower... for launch chips.

bear jesus · Sep 24, 2010

cadaveca said:
Very hot... apparently we'll see a clock speed decrease (which I assume is due to the high levels of cache), but IPC will increase. I'm kinda expecting 2.4 GHz or so... maybe lower... for launch chips.

Just a good reason for me to get my first real water cooling setup

(assuming i am happy with the reviews of bulldozer)

cadaveca · Sep 24, 2010

I don't know anything about it, really. However, there is mention of the clock speed decrease on the AMD blog site. Now that we have the info on cache size... 1+1=2. Of course, there's lots of time between now and launch. It seems to me they are refining the process, and a few bugs, at this point.

ROad86 · Sep 24, 2010

mechtech said:
I want one, a server version with 8 or 16 GB of ecc ram I don't know why though since I don't even work 1 core on my 955BE

Haha me too!!! :laugh:

bear jesus · Sep 24, 2010

cadaveca said:
I don't know anything about it, really. However, there is mention of the clock speed decrease on the AMD blog site. Now that we have the info on cache size... 1+1=2. Of course, there's lots of time between now and launch. It seems to me they are refining the process, and a few bugs, at this point.

Hmm, I wonder if they will follow Intel's lead (referring to the cooler that comes with the top-end i7s) by using a better cooler for the high-end CPUs if they run hot. It would be nice to see a better cooler than the current ones, as I am not really a fan of them.
