Friday, September 24th 2010

AMD Orochi "Bulldozer" Die Holds 16 MB Cache

Documents related to AMD's "Orochi" 8-core processor, based on its next-generation Bulldozer architecture, reveal a cache hierarchy that comes as a bit of a surprise. Earlier this month, at a GlobalFoundries-hosted conference, AMD displayed the first die shot of Orochi, which legibly showed key features including the four Bulldozer modules, which hold two cores each, and large L2 caches. On coarse visual inspection, the L2 cache of each module seems to cover about 35% of the module's area. The L3 cache is located along the center of the die. The documents seen by X-bit Labs reveal that each Bulldozer module has its own 2 MB L2 cache shared between its two cores, and that an 8 MB L3 cache is shared between all four modules (8 cores).

This takes the total cache amount of Orochi all the way up to 16 MB (four 2 MB L2 caches plus the 8 MB L3). This hierarchy suggests that AMD wants to give individual cores access to a large amount of faster cache (a whopping 2048 KB shared within each module, compared to 512 KB per core on Phenom and 256 KB per core on Core i7), which also facilitates faster inter-core, intra-module communication. Inter-module communication is enhanced by the 8 MB L3 cache. Compared to the current "Istanbul" six-core K10-based die, that's a 77% increase in total cache for a 33% increase in core count, and a 300% increase in the L2 cache available to each core. Orochi is built on a 32 nm GlobalFoundries process, and it is sure to have a very high transistor count.

Source: Xbit Labs
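The percentages above can be checked with a short calculation. This is only a sketch: the Orochi figures are the ones quoted in the article, while the Istanbul figures (six cores, 512 KB of L2 per core, 6 MB of L3) are an assumption not stated here, and all variable names are made up for illustration.

```python
# Figures quoted in the article for Orochi.
ORO_MODULES = 4              # Bulldozer modules on the die
ORO_CORES = 8                # two cores per module
ORO_L2_PER_MODULE_KB = 2048  # L2 shared by the two cores of a module
ORO_L3_KB = 8 * 1024

# Assumed Istanbul figures (not stated in the article).
IST_CORES = 6
IST_L2_PER_CORE_KB = 512
IST_L3_KB = 6 * 1024


def pct_increase(new, old):
    """Percentage increase going from old to new."""
    return (new - old) / old * 100


orochi_total_kb = ORO_MODULES * ORO_L2_PER_MODULE_KB + ORO_L3_KB
istanbul_total_kb = IST_CORES * IST_L2_PER_CORE_KB + IST_L3_KB

total_cache_pct = pct_increase(orochi_total_kb, istanbul_total_kb)
core_count_pct = pct_increase(ORO_CORES, IST_CORES)
# The article's 300% figure counts the whole 2 MB module L2 as visible to each core.
l2_visible_pct = pct_increase(ORO_L2_PER_MODULE_KB, IST_L2_PER_CORE_KB)
# Splitting the module L2 evenly between its two cores gives a lower figure.
l2_even_split_pct = pct_increase(ORO_L2_PER_MODULE_KB / 2, IST_L2_PER_CORE_KB)

print(f"Orochi total cache:   {orochi_total_kb // 1024} MB")
print(f"Istanbul total cache: {istanbul_total_kb // 1024} MB")
print(f"Total cache increase: {total_cache_pct:.1f}%")
print(f"Core count increase:  {core_count_pct:.1f}%")
print(f"L2 visible per core:  {l2_visible_pct:.0f}% increase")
print(f"L2 split per core:    {l2_even_split_pct:.0f}% increase")
```

Under these assumptions the total goes from 9 MB to 16 MB, which is just under 78% and matches the roughly 77% quoted. The 300% figure only holds if the full module L2 is counted for each core; divided evenly between the two cores, the increase would be 100%.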

152 Comments on AMD Orochi "Bulldozer" Die Holds 16 MB Cache

#1
bear jesus
cheezburger said:
Do you believe their source is from a wiki? Second, there are still many people who would rather believe Cayman will have a 256-bit bus with ridiculously high-frequency GDDR5 and 32 ROPs, along with a ridiculous number of old-style 5D shaders... and there are many rumors as well. At first I did believe Cayman was going to be a top-end card with a 512-bit bus and 64 ROPs, along with a mighty Bulldozer... but after seeing the specification of Orochi I'm starting to suspect AMD did not change their mainstream strategy at all, which is what I'm afraid of. Bulldozer's small cores just remind me of the 5D shaders on the old R600... small cores just make Bulldozer not look that mighty anymore...

Also, are you sure that I only post in AMD/ATI threads? Go look at the Fermi thread... I made a lot of criticisms of Fermi over there, and I believe you should take a look... And stop calling me a troll unless you can prove me wrong, but then again, that is a personal offense.
Honestly, I don't believe any unofficial source; I was just commenting on your posts. I have not given the Fermi threads much attention, as I'm waiting to choose what card I want and have read enough of the Fermi reviews to know what I need for now.

I am sorry if it seemed like I was calling you a troll; I was not, and that was not my intention.
#2
LAN_deRf_HA
Am I right in thinking Bulldozer was designed with more server than consumer considerations? I'm concerned this will be like Fermi: not the most efficient chip for the average consumer, because it includes things aimed at the business sector, all in one blanket package.
#3
erocker
LAN_deRf_HA said:
Am I right in thinking Bulldozer was designed with more server than consumer considerations? I'm concerned this will be like Fermi: not the most efficient chip for the average consumer, because it includes things aimed at the business sector, all in one blanket package.
Through reading this thread and getting whatever information I can wrap my head around, I don't see the correlation at all. If anything, when it was mentioned, things like power consumption were lower than the competition so I have no idea where you're getting that idea from. :confused:
#4
Techtu
erocker said:
Through reading this thread and getting whatever information I can wrap my head around, I don't see the correlation at all. If anything, when it was mentioned, things like power consumption were lower than the competition so I have no idea where you're getting that idea from. :confused:
Maybe not in this thread, but here's a little something from one of btarunr's threads...


btarunr said:
Market Segments
As mentioned in the graphic before, AMD's modular design allows it to create different products by simply controlling the number of modules on the die (by whichever method). With this, AMD will have processors ready for most PC and server market segments, all the way from desktop PCs, enthusiast-grade PCs, and notebooks to servers. AMD expects to have a full-fledged lineup in 2011. The first Bulldozer CPUs will be sold to the server market.
The whole thread can be found here

Anyway, maybe hearing something like that has led to the thought that this is going to be based on servers.
#5
erocker
Tech2 said:
Maybe not in this thread, but here's a little something from one of btarunr's threads...




The whole thread can be found here

Anyway, maybe hearing something like that has led to the thought that this is going to be based on servers.
That's the way it's been for years. Opterons have had their "Athlon/Phenom" equivalents, just as Xeons and Intel desktop chips are the same. I don't see a change here.
#6
Techtu
erocker said:
That's the way it's been for years. Opterons have had their "Athlon/Phenom" equivalents, just as Xeons and Intel desktop chips are the same. I don't see a change here.
I know this :toast: ... I actually read about it a short while ago whilst trying to find that thread of btarunr's :)
#7
EastCoasthandle
JF-AMD
I'm curious to know whether Bulldozer solutions would provide any substantial boost in performance over an Intel solution when paired with a 6000 series card. I'm not sure if you can comment on such a thing, though.
#8
bear jesus
EastCoasthandle said:
JF-AMD
I'm curious to know whether Bulldozer solutions would provide any substantial boost in performance over an Intel solution when paired with a 6000 series card. I'm not sure if you can comment on such a thing, though.
I have to ask: in theory, is that not like asking whether there would be a boost using a 5000 series card with a Phenom over an Intel solution? Should it not just depend on which CPU is more powerful, and then mainly in CPU-limited games/situations?
#9
crazyeyesreaper
Chief Broken Rig
Well, it depends. Clock speed wise, Phenom II is more than adequate for games, but with AMD's display drivers after 10.4a, performance tanks like a mofo with a stock NB, so a 4 GHz CPU with a 2000 MHz NB is slower than a stock 3.4 GHz CPU with a 2600 MHz NB. Basically, with 5K cards on a Phenom II with a slow NB, performance scaling can be negative, aka 2 GPUs at 50% = 1 GPU at 99%, so that second card basically does nothing on newer drivers. On 10.4a, though, everything works fine, so no one really knows for certain right now.
#10
HXL492
I hope the bulldozer will come through. The excitement in me is unbearable...
#11
Makaveli
btarunr said:
There's nothing particularly bad with AMD's storage performance with a proper mode (AHCI or RAID) and proper driver (AMD over Microsoft) installed. The RAID controller sucked only till SB600 southbridge (which had a Silicon Image logic that wasn't implemented so well). SB700/SB710/SB750 is on par with ICH10/R, SB850 has no match (SATA 6 Gb/s).
I find this interesting.

Where can I see some reviews and/or benchmarks of this? From what I've seen on the web so far, ICH10/R has superior performance to any of AMD's southbridges.

That also includes faster USB 2.0 speeds on Intel chipsets.
#12
bear jesus
crazyeyesreaper said:
Well, it depends. Clock speed wise, Phenom II is more than adequate for games, but with AMD's display drivers after 10.4a, performance tanks like a mofo with a stock NB, so a 4 GHz CPU with a 2000 MHz NB is slower than a stock 3.4 GHz CPU with a 2600 MHz NB. Basically, with 5K cards on a Phenom II with a slow NB, performance scaling can be negative, aka 2 GPUs at 50% = 1 GPU at 99%, so that second card basically does nothing on newer drivers. On 10.4a, though, everything works fine, so no one really knows for certain right now.
I had no idea about that; I just assumed it was another screw-up by the ATI driver team, as I guess I'm used to hearing about people with CrossFire performance problems after upgrading drivers (one of the big reasons I don't want to go back to a dual-card setup). I guess I'm lucky I didn't intend to CrossFire on this board, as the NB does not seem to like being pushed that far.

I must admit I hope the 6xxx and 7xxx cards go well with Bulldozer, as I'm hoping all of them meet my needs and budget over the next year or so.
#13
Depth
ebolamonkey3 said:
2011 is shaping up to be quite an interesting year :)
We say that at the end of every year :)
#14
Hayder_Master
First really good news from AMD; it makes me feel they may beat Intel this time.
#15
de.das.dude
Pro Indian Modder
They always do beat Intel for me.
#16
pantherx12
Thanks for confirming new socket JF!

Was going to get an AM3 setup for Xmas, but I'll hold on to my dying Intel setup till Bulldozer is out.


Feel free to send me an AM3+ setup for free for testing and review purposes! XD
#17
JF-AMD
AMD Rep (Server)
EastCoasthandle said:
JF-AMD
I'm curious to know whether Bulldozer solutions would provide any substantial boost in performance over an Intel solution when paired with a 6000 series card. I'm not sure if you can comment on such a thing, though.
I can't comment on any client info. I work in servers. I know a lot of the answers but because I am on the forum as an individual and not a company representative, I can't share info on that.

As to whether the bulldozer core was designed as a client or a server core, someone else had it right. We have leveraged the same core for both products in the past and will continue to do so (Istanbul/Thuban was the only recent departure).

There are client features turned off in servers and vice versa.

Typically, because it is used for both, people assume that single-threaded performance and clock speed are not going to be good. I would not worry about either of those.
#18
bear jesus
JF-AMD said:
We have leveraged the same core for both products in the past and will continue to do so (Istanbul/Thuban was the only recent departure).

There are client features turned off in servers and vice versa.

Typically, because it is used for both, people assume that single-threaded performance and clock speed are not going to be good. I would not worry about either of those.
Are Barcelona and Deneb quite close? I thought Thuban and Istanbul were kind of close, but of course with some more differences. I assumed the server and client chips were pretty similar to each other, excluding the socket, dual-channel memory for client versus quad-channel memory for server, one major feature (Turbo Core) not being part of the server chips and, as always, different clock speeds and TDP.

I admit, though, that I really don't know much about the server chips, the differences between them and client chips, or whether there is any difference between chips designed for, say, 1U up to 4U. Is there anything you could say to give a little insight into the differences?
#19
JF-AMD
AMD Rep (Server)
Barcelona was very close to whatever that product was (I don't really know many of their code names).

We used the same die for 1000 through 8000 series. That is why we consolidated the line down to 4000/6000 and removed the 4P price premium. Today you can buy a top end AMD 2P or 4P for the same price, $1386, or you could buy the Intel 2P at $1663 and the 4P at ~$3682. Trust me, their 4P is not 2.5X faster than the 2P.

Plus, by having the same core top to bottom, when you write software and customize it, you aren't dealing with 3 different platforms, only 1. Even our chipsets are identical, top to bottom. Software people love that, network administrators love that, and if you are doing virtualization it makes things so much easier because you can easily move VMs around.

On the Intel side, their 4P is old technology and generally lags the 2P by a year. 3 different platforms, 3 different chipsets, lots of inconsistencies.
#20
bear jesus
JF-AMD said:
Barcelona was very close to whatever that product was (I don't really know many of their code names).

We used the same die for 1000 through 8000 series. That is why we consolidated the line down to 4000/6000 and removed the 4P price premium. Today you can buy a top end AMD 2P or 4P for the same price, $1386, or you could buy the Intel 2P at $1663 and the 4P at ~$3682. Trust me, their 4P is not 2.5X faster than the 2P.

Plus, by having the same core top to bottom, when you write software and customize it, you aren't dealing with 3 different platforms, only 1. Even our chipsets are identical, top to bottom. Software people love that, network administrators love that, and if you are doing virtualization it makes things so much easier because you can easily move VMs around.

On the Intel side, their 4P is old technology and generally lags the 2P by a year. 3 different platforms, 3 different chipsets, lots of inconsistencies.
I had no idea that the chipsets were the same. For some reason I thought that might be one of the changes between them, although I'm not sure where I got that idea from.

I'm intending to learn a lot more about the server side, as I intend to buy my first real home-made server instead of just making client hardware act as a server with missing server features (although I only really know of ECC as one of the major things I'm missing). It is one of the reasons I'm so interested in Bulldozer, as with Interlagos chips I am hoping to build a very powerful server to last me as long as the hardware does. But of course this won't be any time soon, as I am not expecting it to come cheap.
#21
wahdangun
bear jesus said:
I had no idea that the chipsets were the same. For some reason I thought that might be one of the changes between them, although I'm not sure where I got that idea from.

I'm intending to learn a lot more about the server side, as I intend to buy my first real home-made server instead of just making client hardware act as a server with missing server features (although I only really know of ECC as one of the major things I'm missing). It is one of the reasons I'm so interested in Bulldozer, as with Interlagos chips I am hoping to build a very powerful server to last me as long as the hardware does. But of course this won't be any time soon, as I am not expecting it to come cheap.
Hmm, aside from ECC RAM support, I know that server chips have more consistent throughput and can handle intense workloads with stable performance, whereas client chips usually focus on bursts and are not as consistent.
#22
bear jesus
wahdangun said:
Hmm, aside from ECC RAM support, I know that server chips have more consistent throughput and can handle intense workloads with stable performance, whereas client chips usually focus on bursts and are not as consistent.
That sounds a lot more useful, as one of the main uses of my "server" PC (now and in the planned upgrade) is hosting (mainly Steam/Source-based) games, but also hosting files for remote access/downloading. I admit I would also like to try some folding in its spare time or, with enough cores, at the same time.
#23
wahdangun
bear jesus said:
That sounds a lot more useful, as one of the main uses of my "server" PC (now and in the planned upgrade) is hosting (mainly Steam/Source-based) games, but also hosting files for remote access/downloading. I admit I would also like to try some folding in its spare time or, with enough cores, at the same time.
So why don't you just buy a Magny-Cours CPU? It has 12 cores and quad-channel RAM, or even better, use a 4P system for 48 cores, so you won't run out of CPU cores lol
#24
bear jesus
wahdangun said:
So why don't you just buy a Magny-Cours CPU? It has 12 cores and quad-channel RAM, or even better, use a 4P system for 48 cores, so you won't run out of CPU cores lol
Admittedly Magny-Cours would be nice, but I want to know if the difference between it and Interlagos will be worth it, as I would expect higher-speed RAM, better IPC, more cores, better power saving and hopefully other features.

I expect it to be more expensive (even more so with fast, high-density ECC DDR3), but then of course it will all come down to whether the extra cost is worth the extra features and speed, and to be honest I am hoping it will be. Plus, I am in no hurry to spend several thousand £ on a server that won't be fully used for nearly a year :laugh:
#25
demonkevy666
JF-AMD said:
Since 2 cores are sharing the same instructions much of the time, it can actually be 64K, right? Instruction caches are far less random than data caches.
Hint hint......