Tuesday, August 24th 2010

AMD Details Bulldozer Processor Architecture

AMD is finally going to embrace a truly next-generation x86 processor architecture that is built from the ground up. AMD's current architecture, the K10(.5) "Stars", is an evolution of the K8 architecture, but it never matched K8's market success, as it was overshadowed by competing Intel architectures. AMD codenamed its latest design "Bulldozer", and it features an x86 core design that is radically different from anything we've seen from either processor giant. With this design, AMD thinks it can outdo both the HyperThreading and multi-core approaches to parallelism in one shot, as well as "bulldoze" through serial workloads with a broad set of eight integer pipelines per core (compared to 3 on K10, and 4 on Westmere). Two almost-independent blocks of integer processing units share a common floating point unit with two 128-bit FMACs.

AMD is also working on a multi-threading technology of its own to rival Intel's HyperThreading. It exploits Bulldozer's split integer processing backed by a shared floating point design, which AMD believes to be so efficient that each SMT worker thread can be deemed a core in its own right, and could further be backed by competing threads per "core". AMD is working on another micro-architecture codenamed "Bobcat", which is a downscaled implementation of Bulldozer, with which it will take on low-power and high performance-per-Watt segments that extend from all-in-one PCs all the way down to hand-held devices and 8-inch tablets. We will explore the Bulldozer architecture in some detail.

Bulldozer: The Turbo Diesel Engine
In many respects, the Bulldozer architecture is comparable to a diesel engine: lower RPM (clock speeds), higher torque (instructions per clock). When implemented, Bulldozer-based processors could outperform competing processor architectures at much lower clock speeds, thanks to one critical area AMD seems to have finally addressed: instructions per clock (IPC). The 65 nm "Barcelona" and 45 nm "Shanghai" architectures raised IPC only indirectly, by other means (such as backing the cores up with a level-3 cache and upping the uncore/northbridge clock speeds). The 32 nm Bulldozer, in contrast, actually features a broad integer unit with eight integer pipelines split into two portions, each portion having its own scheduler and L1 data cache.
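The diesel analogy comes down to simple arithmetic: instructions per second is IPC multiplied by clock speed. The sketch below illustrates that trade-off. The IPC and clock figures are hypothetical values picked for illustration only; they are not AMD or Intel numbers.

```python
def instructions_per_second(ipc: float, clock_hz: float) -> float:
    """Throughput = instructions per clock x clocks per second."""
    return ipc * clock_hz


# Hypothetical figures, chosen only to illustrate the trade-off.
narrow_fast = instructions_per_second(ipc=1.0, clock_hz=4.0e9)
wide_slow = instructions_per_second(ipc=1.5, clock_hz=3.0e9)

print(f"narrow design at 4.0 GHz: {narrow_fast / 1e9:.1f} billion instructions/s")
print(f"wide design at 3.0 GHz:   {wide_slow / 1e9:.1f} billion instructions/s")
```

With these made-up inputs, the wider design comes out ahead despite running at a lower clock, which is the point the engine comparison is making.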



Parallelism: A Radical Approach?
Back when analysts were pinning high hopes on the Barcelona architecture, those hopes were fueled by early reports suggesting that AMD was using wide 128-bit floating point units, leading analysts to believe that AMD may have conquered its biggest nemesis: floating point performance, and with it pure math-crunching ability. However, that wasn't exactly to be. That's because the processor's overall number crunching abilities were pegged to its floating point performance, ignoring the integer units.



AMD split the eight integer pipelines per core into two blocks, each block having four integer pipelines, an integer scheduler for those, and an L1 data cache. These constitute the lowest level of "dedicated components", dedicated to processor threads. There is a shared floating point unit between the two, with two 128-bit FMACs, arbitrated by a floating point scheduler. The fetch/decode stage, an L2 cache, and the FPU constitute the "shared" components.
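The split between dedicated and shared components can be written down as a small data model. This is only an illustrative sketch of the layout described above: the class and field names (`IntegerBlock`, `BulldozerModule`, and so on) are our own inventions, not AMD terminology, and the model carries no performance meaning.

```python
from dataclasses import dataclass, field


@dataclass
class IntegerBlock:
    """Components dedicated to one thread (names are illustrative)."""
    pipelines: int = 4
    has_scheduler: bool = True
    has_l1_data_cache: bool = True


@dataclass
class BulldozerModule:
    """Two dedicated integer blocks plus the components they share."""
    integer_blocks: list = field(
        default_factory=lambda: [IntegerBlock(), IntegerBlock()]
    )
    fmac_units: int = 2          # inside the shared floating point unit
    fmac_width_bits: int = 128
    shared: tuple = ("fetch/decode", "floating point unit", "L2 cache")

    def integer_pipelines(self) -> int:
        """Total integer pipelines across both dedicated blocks."""
        return sum(block.pipelines for block in self.integer_blocks)


module = BulldozerModule()
print("integer pipelines:", module.integer_pipelines())
print("shared components:", ", ".join(module.shared))
```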



AMD is implementing a simultaneous multithreading (SMT) technology in which each set of "dedicated" components (in this case, an integer unit) deals with a thread of its own while sharing certain components with the other integer unit, effectively making each set of dedicated components a "core" in its own right. This way, the actual core of the Bulldozer die is deemed a "module", a superset of two cores, and the Bulldozer die (chip) features n modules depending on the model.
So with four modules you have a chip with eight cores, at a much smaller die size and transistor count than a hypothetical 32 nm K10 8-core processor. It is unclear whether AMD wants to push SMT further down to the "core" level and run two threads simultaneously over dedicated components, but one thing is for sure: AMD has embraced SMT in some form or another. In all this, the chip-level parallelism is transparent to the operating system. It will only see a fixed number of logical processors, without any special software or driver requirement.
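The module-to-core bookkeeping is easy to sketch. The helper below (`cores_on_die` is a name made up for this example) simply applies the two-cores-per-module rule from the text, and the last line uses Python's standard `os.cpu_count()` to show the kind of single number an operating system exposes to software, whatever the chip looks like underneath.

```python
import os

CORES_PER_MODULE = 2  # per the article: one module is treated as two "cores"


def cores_on_die(modules: int) -> int:
    """Cores a die would present, given its module count."""
    return modules * CORES_PER_MODULE


# Hypothetical module counts; the article only confirms the four-module case.
for modules in (2, 3, 4):
    print(f"{modules} modules -> {cores_on_die(modules)} cores")

# Software just sees a count of logical processors on the machine it runs on.
print("logical processors on this machine:", os.cpu_count())
```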

So in one go, AMD shot up its integer performance. Either a thread makes use of one integer unit with its four pipelines, or deals with both the integer units arbitrated by the fetch/decode, and the shared FPU.

Outside the modules
At the chip level, there's a large L3 cache, a northbridge that integrates the PCI-Express root complex, and an integrated memory controller. Since the northbridge is completely on the chip, the processor no longer needs a HyperTransport link to talk to the rest of the system. It connects to the chipset (which is now relegated to a southbridge, much like Intel's Ibex Peak) using A-Link Express, which, like DMI, is essentially a PCI-Express link. It is important to note that all modules and extra-modular components are present on the same piece of silicon die. Because of this design change, Bulldozer processors will come in totally new packages that are not backwards compatible with older AMD sockets such as AM3 or AM2(+).

Expectations
Not surprisingly, AMD isn't talking about Bulldozer as the next big thing since dual-core processors (something it did with Barcelona). AMD currently has 8-core and 12-core processors codenamed "Magny-Cours", which are multichip modules of Shanghai (4-core) and Istanbul (6-core) dies. AMD expects an 8-core Bulldozer implementation (built with four modules) to have 50% higher performance-per-watt compared to Magny-Cours.
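A 50% gain in performance-per-watt can be read two ways, and a little arithmetic makes both concrete. In the sketch below the baseline performance score and wattage are hypothetical placeholders, not Magny-Cours specifications; only the 50% figure comes from AMD's stated expectation.

```python
PERF_PER_WATT_GAIN = 0.50  # AMD's stated expectation versus Magny-Cours


def perf_at_same_power(baseline_perf: float, gain: float) -> float:
    """Performance if the power budget is held constant."""
    return baseline_perf * (1.0 + gain)


def power_at_same_perf(baseline_power_w: float, gain: float) -> float:
    """Power draw if the performance level is held constant."""
    return baseline_power_w / (1.0 + gain)


# Hypothetical baseline: a score of 100 at 120 W (not real chip figures).
print("same power, new score:", perf_at_same_power(100.0, PERF_PER_WATT_GAIN))
print("same score, new watts:", power_at_same_perf(120.0, PERF_PER_WATT_GAIN))
```

Either reading is compatible with the claim; AMD has not said which mix of higher performance and lower power it is targeting.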



Market Segments
As mentioned in the graphic before, AMD's modular design allows it to create different products by simply controlling the number of modules on the die (by whichever method). With this, AMD will have processors ready for most PC and server market segments, from desktop PCs, enthusiast-grade PCs, and notebooks to servers. AMD expects to have a full-fledged lineup in 2011. The first Bulldozer CPUs will be sold to the server market.


Hotchips 22 Presentation by AMD on the Bobcat Architecture
Below are as-is slides from AMD's Hotchips presentation on the Bobcat architecture.

283 Comments on AMD Details Bulldozer Processor Architecture

#1
Steevo
I'm ready for a new board so who cares, I have been through a 9850, and would like to offload my 940 to my parents and get a X6 if asus will ever pull their heads out of their ass and get a BIOS done.


I didn't expect this board to actually last this long for me, and I could drop in a higher end X4 part and still be happy. I have been building AMD for the last few years now and the fact that I can still get chips to fit older boards for a cheap performance upgrade, and or a board to fit a older chip is amazing from my days as a Intel man.

I have quite a few Intel chips at home from when a board dies, but nothing to do with them and they are now worthless as the boards don't work. I still have a S939 board and chip at home.
#2
Bloodcrazz
i was reading over this and people are like live in a dream world.
$300 amd beating $1000 intel. well if it did beat it wouldn't be $300 rofl.
dont think people remember the Althon FX and unlike intel extreme new one use to come out every few months
#3
crazyeyesreaper
Chief Broken Rig
actually $300 cpu does beat a $1000 cpu on occassion granted its OC vs stock but the major point here is your on an enthusiast site we dont by parts of ridiculous high end spectrums all the time many buy the best bang for buck and oc the shit out of it. example why would you buy a 980x if a 920oced gives u the same performance for 700 less granted that 980 when overclocked will walk away yes but at $700 premium does it make sense to get it no it dosent not for most.

its why AMD is still around today they offer good enough and close enough at a lower price they dont beat intels counterparts very often but they put up a damn good show of it and a 1090T at 4ghz+ is a damn good chip for $300.

And sure we remember the athlon FX but it was out in a time when intel was still shitting out P4s and AMD had the performance crown. having the crown means you can charge more money for your shit especially if you have good marketing :toast:
#4
CDdude55
Crazy 4 TPU!!!
by: crazyeyesreaper
actually $300 cpu does beat a $1000 cpu on occassion granted its OC vs stock but the major point here is your on an enthusiast site we dont by parts of ridiculous high end spectrums all the time many buy the best bang for buck and oc the shit out of it. example why would you buy a 980x if a 920oced gives u the same performance for 700 less granted that 980 when overclocked will walk away yes but at $700 premium does it make sense to get it no it dosent not for most.

its why AMD is still around today they offer good enough and close enough at a lower price they dont beat intels counterparts very often but they put up a damn good show of it and a 1090T at 4ghz+ is a damn good chip for $300.

And sure we remember the athlon FX but it was out in a time when intel was still shitting out P4s and AMD had the performance crown. having the crown means you can charge more money for your shit especially if you have good marketing :toast:
A lot depends on the architecture actually, a stock 980x still kicks the shit out of a 1090T at 4Ghz, and it still kills an overclocked 920. Then again, this was proven only in programs that were actually multithreaded. In the programs that weren't multithreaded, the 920 and 980x generally perform exactly the same, with AMD still slightly lagging behind (but with the better price). AMD is still around today because yes, they provide cheaper chips that give you enough performance most of the time. Intel's recent chips are very powerful, but yes the price is a bit higher, then again, how much higher? I mean a 1090T is $296 on newegg and an i7 930 is actually cheaper, sitting at $290... and you get better performance. Then again, one could debate that the surrounding parts including the motherboard could be more expensive. Then again, if you have spent almost $300 on a CPU, why not get a nice mobo to go along with it?

It depends what you want, AMD definitely offers the best bang for buck, but if you're trying to reap all out performance, that's where Intel shines. In multithreaded games and benchmarks, nothing can touch the 980x (especially when overclocked). The current AMD CPU's will always give you good enough performance. And whether or not that's what you want depends on the person.

If Bulldozer is cheap and can actually beat an i7 this time, then i'll be moving to that platform.
#5
crazyeyesreaper
Chief Broken Rig
well thats my point most apps still dont use more then 2-4 threads Intel has better clock to clock performance its cant be denied but major selling point for amd is the fact a 1090T can be dropped into a previous gen cfx or sli mobo that costs $100 where as that i7 930 is forced to be paired with a x58 and im just going with new prices not what we can find used that and triple channel is higher priced and as the 1156 socket proved triple channel is uneeded for i7. as the 750 and 860 tend to run neck and neck with the 920 in alot of situations and surprisingly the 750 tends to do slightly better in games with the same gpu on occasion as well which is rather interesting

http://www.anandtech.com/bench/Product/146?vs=46

as shown here 1090T vs i7 940 both trade blows off and on and the i7s only dominace comes in Far Cry 2 and let me tell you its a LANDSLIDE in favor of the i7 in that game :roll: but point is if i drop down to the 920 it becomes even more in favor of the 1090 but thats more due to the higher clock rate helping make up for lesser clock to clock performance.

but at the end of the day its what fits the bill for whats needed... with the $50 rebate + bing cash back tiger direct had awhile ago a 1090T could be grabbed for $225 which is a damn good deal but if i had to go intel now id go 1156 due to the fact the 860 tends to perform a tad bit better then the 920 the mobo is cheaper and dual channel ddr3 is more then enough and it will still hang with the 1090T or any amd cpu as we all know

and yes same if bulldozer is revolutionary and performs great ill switch out as well but im holding my breathe ;) call me a skeptic
#6
pantherx12
I'm holding onto my current system until bulldozer as well.

Xeon just about keeps me satisfied, and the 5770 is okay.... for now....
The newest fanciest games I get between 22-35 fps D:
#7
largon
Originally Posted in 1st post
Outside the modules
At the chip-level, there's a large L3 cache, a northbridge that integrates the PCI-Express root complex, and an integrated memory controller. Since the northbridge is completely on the chip, the processor does not need to deal with the rest of the system with a HyperTransport link. It connects to the chipset (which is now relegated to a southbridge, much like Intel's Ibex Peak), using A-Link Express, which like DMI, is essentially a PCI-Express link. It is important to note that all modules and extra-modular components are present on the same piece of silicon die. Because of this design change, Bulldozer processors will come in totally new packages that are not backwards compatible with older AMD sockets such as AM3 or AM2(+).
Where did this text come from? Is it a quote from a slide or a text released by AMD? The bolded, highlighted parts are, controversial, to say the least. I see nothing in the slides on any site that would state there is a on-die PCIe ctrl, DMI or even a new socket.
#8
TheMailMan78
Big Member
by: pantherx12
I'm holding onto my current system until bulldozer as well.

Xeon just about keeps me satisfied, and the 5770 is okay.... for now....
The newest fanciest games I get between 22-35 fps D:
5770=weak. Upgrade that bitch.
#9
pantherx12
by: TheMailMan78
5770=weak. Upgrade that bitch.
No monies right now else I would!

Hopefully getting a job soon at CEX ( Computer shop! WOOOO) so expect my rig to have lots of upgrades if I get the job XD
#10
Wile E
Power User
by: crazyeyesreaper
actually $300 cpu does beat a $1000 cpu on occassion granted its OC vs stock but the major point here is your on an enthusiast site we dont by parts of ridiculous high end spectrums all the time many buy the best bang for buck and oc the shit out of it. example why would you buy a 980x if a 920oced gives u the same performance for 700 less granted that 980 when overclocked will walk away yes but at $700 premium does it make sense to get it no it dosent not for most.

its why AMD is still around today they offer good enough and close enough at a lower price they dont beat intels counterparts very often but they put up a damn good show of it and a 1090T at 4ghz+ is a damn good chip for $300.

And sure we remember the athlon FX but it was out in a time when intel was still shitting out P4s and AMD had the performance crown. having the crown means you can charge more money for your shit especially if you have good marketing :toast:
And if AMD takes the lead again, do you really expect them to give you $300 chips that outperform Intel's $1000 chips? No, they won't. The point he is making is that if they can compete in the high end, they will charge $1000 for those cpus, just like Intel. They've already demonstrated this in the past with the FX series. You will no longer have $300 kick ass chips.

And the $700 was worth every penny to me. ;)

by: crazyeyesreaper
well thats my point most apps still dont use more then 2-4 threads Intel has better clock to clock performance its cant be denied but major selling point for amd is the fact a 1090T can be dropped into a previous gen cfx or sli mobo that costs $100 where as that i7 930 is forced to be paired with a x58 and im just going with new prices not what we can find used that and triple channel is higher priced and as the 1156 socket proved triple channel is uneeded for i7. as the 750 and 860 tend to run neck and neck with the 920 in alot of situations and surprisingly the 750 tends to do slightly better in games with the same gpu on occasion as well which is rather interesting

http://www.anandtech.com/bench/Product/146?vs=46

as shown here 1090T vs i7 940 both trade blows off and on and the i7s only dominace comes in Far Cry 2 and let me tell you its a LANDSLIDE in favor of the i7 in that game :roll: but point is if i drop down to the 920 it becomes even more in favor of the 1090 but thats more due to the higher clock rate helping make up for lesser clock to clock performance.

but at the end of the day its what fits the bill for whats needed... with the $50 rebate + bing cash back tiger direct had awhile ago a 1090T could be grabbed for $225 which is a damn good deal but if i had to go intel now id go 1156 due to the fact the 860 tends to perform a tad bit better then the 920 the mobo is cheaper and dual channel ddr3 is more then enough and it will still hang with the 1090T or any amd cpu as we all know

and yes same if bulldozer is revolutionary and performs great ill switch out as well but im holding my breathe ;) call me a skeptic
I still fail to understand why people use games to test cpu performance. 4 year old cpus still game just fine. It's kind of a pointless test for cpu power.

But, to be honest, I was looking forward to Thuban at first, then I found out it only matches i7 quads clock for clock in the stuff I do. That's when I skipped on a gfx upgrade, and went with the 980X instead. 4870X2 is still plenty for most games, but my QX wasn't doing the trick for me anymore.

I really hope Bulldozer lives up to expectations tho. Competition at the high end will do us some justice. I'll sell my rig and go AMD if they can pull ahead.
#11
Bloodcrazz
by: crazyeyesreaper
im talking about the guy who bitched for nearly half the thread about 890fx... when if he used his brain he would realize that when bulldozer does come out 890fx will be nearly 2 years old
err 890fx came out few months ago (q2/2010) bulldozer is coming first half 2011 im no math king but i'm sure thats 2 years
ROFL
#12
crazyeyesreaper
Chief Broken Rig
bulldozer probably wont hit maintstream aka non server markets untill the end of 2011 after all the leaked benchs above are most likely server chips since thats what there comparing performance wise now im no genius but there were 12 core amd server cpus long before a 6core ever came to the desktop market. by the time Bulldozer comes to the market in force it will be over a year. Still so ppl can still bitch moan and complain if u dont like it to bad those companies still dont give a shit also 32nm bulk has been skipped and were going to 28nm which still isnt in full swing and wont be for some time

then lets not forget the switch to global foundries etc whos to say they wont hit a snag fact is by the time bulldozer is in full swing it will be 2012 in my honest opinion and in that situation its still nearly 2 years for 890fx and honestly im not butt hurt i didnt by an 890fx i paid $110 for a 790fx u get what u pay for and in the tech world u pay the price for the lastest and greatest
#13
Bloodcrazz
by: crazyeyesreaper
bulldozer probably wont hit maintstream aka non server markets untill the end of 2011 after all the leaked benchs above are most likely server chips since thats what there comparing performance wise now im no genius but there were 12 core amd server cpus long before a 6core ever came to the desktop market. by the time Bulldozer comes to the market in force it will be over a year. Still so ppl can still bitch moan and complain if u dont like it to bad those companies still dont give a shit also 32nm bulk has been skipped and were going to 28nm which still isnt in full swing and wont be for some time

then lets not forget the switch to global foundries etc whos to say they wont hit a snag fact is by the time bulldozer is in full swing it will be 2012 in my honest opinion and in that situation its still nearly 2 years for 890fx and honestly im not butt hurt i didnt by an 890fx i paid $110 for a 790fx u get what u pay for and in the tech world u pay the price for the lastest and greatest
where are you reading this. zambezi will be 32nm.
its ati that skipped 32nm going to 28nm.
#14
Super XP
by: Bloodcrazz
where are you reading this. zambezi will be 32nm.
its ati that skipped 32nm going to 28nm.
Sounds like an old forum post with misinformation. I remember reading this somewhere way back.
#15
cadaveca
My name is Dave
by: Super XP
Sounds like an old forum post with misinformation. I remember reading this somewhere way back.
Nah, that's pretty accurate. TSMC, that makes ATI chips, is skipping 32nm, and that's official, so only GloFo, or maybe Chartered, could produce 32nm chips for ATi (unsure of Chartered's capacity or even ability to do 32nm).

While I'd personally love to see TSMC lose ATi's business, I doubt GloFo actually has the capacity to produce 32nm vga chips, without affecting cpu or chipset outputs.
#16
nt300
by: cadaveca
Nah, that's pretty accurate. TSMC, that makes ATI chips, is skipping 32nm, and that's official, so only GloFo, or maybe Chartered, could produce 32nm chips for ATi (unsure of Chartered's capacity or even ability to do 32nm).

While I'd personally love to see TSMC lose ATi's business, I doubt GloFo actually has the capacity to produce 32nm vga chips, without affecting cpu or chipset outputs.
I read GlobalFo also cancel 32nm and already move to 22nm & 20nm.
And heres the link.
http://www.xbitlabs.com/news/other/display/20100401144643_Globalfoundries_Scraps_32nm_Bulk_Fabrication_Process.html
#17
cadaveca
My name is Dave
by: nt300
I read GlobalFo also cancel 32nm and already move to 22nm & 20nm.
And heres the link.
http://www.xbitlabs.com/news/other/display/20100401144643_Globalfoundries_Scraps_32nm_Bulk_Fabrication_Process.html
Thanks, dude, I do remember this, but I'll point out something:
All of our efforts around next-gen graphics and wireless are focused on 28nm with HKMG and we no longer have a 32nm bulk process.
Notice no mention of cpu processes. They are different products.
#18
inferKNOX
by: Bloodcrazz
http://www.tomshardware.com/reviews/bulldozer-bobcat-hot-chips,2724-2.html
Scorpius: Enthusiast desktop platform based on AMD’s Zambezi processor and discrete graphics (AMD, of course, specifies an ATI GPU). The platform requires a quad-core CPU or higher, DDR3 memory, and a revised Socket AM3 interface. Availability is expected in 2011.
toms wrong 2
lol everyone turn on panic rofl but panic was the only one that was right
by: cadaveca
:shadedshu

There is so much conflicting info out there...makes you wonder...:rolleyes:
That is just the same word for word statement as in post #139 which I commented on in post #143, nothing new.
It's a simple case of copy-paste, which doesn't make it any more right or wrong than the rest, nor adds or subtracts from the level of misinformation out there.:p
@Bloodcrazz: trolling is never right, even if he was trying to prove some point.

@Everyone Else: AMD sure is taking long to announce the HD6000s officially if they're coming out as soon as Oct/Nov '10; do you think they're trying to squeeze the announcement as close as they can to the Bulldozer for the sake of the Scorpius platform?
#19
a_ump
idk why but i have a feeling that AMD/ATi are going to blow us the f*ck! away. Idk if it'll be purely in gpu specs, but i've noticed alot of HD 5XXX series issues that have been on-going for sometime esp xfire.

I know ATi have that rep but they did good with HD 4XXX series drivers, so makes me think a good portion of their time is spent on the HD 6XXX series; like mad optimizations, improved loaders for shaders, more accurate CCC overdrive with voltage control, at least 85% xfire scaling. I do look for some of all i've said to happen but i def don't think it all will; that would be too perfect n makes nvidia shit themselves lol
#20
inferKNOX
Yeah, that would be nice.
Just as I spoke though: ATI Radeon HD 6000 Series GPU Codenames Surface
Still unofficially however I take it... :-/
TIME FOR SOME MARKETING AMD! That's where Intel and nVidia trump AMD/ATi almost everytime, and I say almost, not because there's been a time I've seen otherwise, but I just assume they must have marketed better at least once!
#21
wahdangun
by: CDdude55
But the whole point of Physx is so the GPU does the physics processing instead of the CPU. whether or not a CPU can utilize all of it's cores is a matter of software taking advantage of those cores. How is physx holding the CPU back?, even if physx is poorly coded, how would that effect the CPU?. I don't understand how in anyway physx could be holding back a part that it has nothing to do with.

But of course, everything has to be Nvidia's fault right.:shadedshu
the whole point of physix is so we can have more realistic effect and btw its really take some processing power of GPU to process physix effect and the result are we must tone down our in game setting to lower like lower AA/detail and make your hexacore cpu useless (because no games that can use more than 4 thread) so if physix was coded properly we doesn't need to use our precious GPU power to process physix and make the cpu more useful,

so nvdia make it like havoc but optimize it further by supporting more than quad core,
#22
pantherx12
Actually the latest phsyx supports multicore rendering by default, as many cores as you have.

It still uses old instruction sets mind you.

an 8600gt for example has twice the performance of my CPU at the moment.

I'd expect my CPU to be as good as 8600gt if phsyx used newer instruction sets for the cpu.
#23
TheMailMan78
Big Member
The fact these new CPU's will use AM3 sockets makes me a little sad. You cannot push the envelope and maintain current compatibility. Especially on a 2 year old socket.
#24
nt300
by: cadaveca
Thanks, dude, I do remember this, but I'll point out something:

Notice no mention of cpu processes. They are different products.
Yes good point, thanks :)
#25
cadaveca
My name is Dave
by: nt300
Yes good point, thanks :)
I think the real point there is that once again, AMD isn't exactly forthcoming with PRECISE information, ever. Or maybe it's those reporting...I am unsure since everyone in those circles is so "buddy-buddy" at this point.