Friday, September 24th 2010

AMD Orochi "Bulldozer" Die Holds 16 MB Cache

Documents related to "Orochi", AMD's 8-core processor based on its next-generation Bulldozer architecture, reveal a cache hierarchy that comes as a bit of a surprise. Earlier this month, at a GlobalFoundries-hosted conference, AMD displayed the first die shot of Orochi, which legibly showed key features including the four Bulldozer modules, which hold two cores each, and large L2 caches. On coarse visual inspection, the L2 cache of each module seems to cover 35% of its area. The L3 cache is located along the center of the die. The documents seen by X-bit Labs reveal that each Bulldozer module has its own 2 MB L2 cache shared between its two cores, and that an 8 MB L3 cache is shared between all four modules (8 cores).

This takes the total cache amount of Orochi all the way up to 16 MB. This hierarchy suggests that AMD wants to give individual cores access to a large amount of faster cache (that's a whopping 2048 KB, shared within the module, compared to 512 KB per core on Phenom and 256 KB per core on Core i7), which facilitates faster inter-core, intra-module communication. Inter-module communication is enhanced by the 8 MB L3 cache. Compared to the current "Istanbul" six-core K10-based die, that's a 77% increase in cache amount for a 33% increase in core count, and a 300% increase in the L2 cache available to each core. Orochi is built on a 32 nm GlobalFoundries process, and it is sure to have a very high transistor count.

Source: X-bit Labs
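The cache arithmetic above can be checked with a short sketch. The Orochi figures come from the article; the Istanbul figures (512 KB of L2 per core and a 6 MB shared L3) are not stated in the article and are an assumption used here to reproduce the comparison. All variable and function names are illustrative.

```python
# Orochi figures as reported in the article (sizes in KB).
OROCHI_MODULES = 4
OROCHI_CORES_PER_MODULE = 2
OROCHI_L2_PER_MODULE_KB = 2048
OROCHI_L3_KB = 8192

# Assumed Istanbul figures (not given in the article).
ISTANBUL_CORES = 6
ISTANBUL_L2_PER_CORE_KB = 512
ISTANBUL_L3_KB = 6144


def total_cache_kb(l2_blocks, l2_block_kb, l3_kb):
    """Sum of all L2 blocks plus the shared L3, in KB."""
    return l2_blocks * l2_block_kb + l3_kb


def percent_increase(new, old):
    """Relative increase of new over old, in percent."""
    return (new - old) / old * 100


orochi_cores = OROCHI_MODULES * OROCHI_CORES_PER_MODULE
orochi_total_kb = total_cache_kb(OROCHI_MODULES, OROCHI_L2_PER_MODULE_KB, OROCHI_L3_KB)
istanbul_total_kb = total_cache_kb(ISTANBUL_CORES, ISTANBUL_L2_PER_CORE_KB, ISTANBUL_L3_KB)

cache_increase = percent_increase(orochi_total_kb, istanbul_total_kb)
core_increase = percent_increase(orochi_cores, ISTANBUL_CORES)
# L2 reachable by one core: the whole 2 MB module cache vs. a private 512 KB.
l2_increase = percent_increase(OROCHI_L2_PER_MODULE_KB, ISTANBUL_L2_PER_CORE_KB)

print(f"Orochi total cache:   {orochi_total_kb // 1024} MB")
print(f"Istanbul total cache: {istanbul_total_kb // 1024} MB")
print(f"Cache increase: {cache_increase:.1f}%")
print(f"Core increase:  {core_increase:.1f}%")
print(f"L2 per core increase: {l2_increase:.1f}%")
```

Under these assumptions the total-cache increase works out to a little under 78%, which the article gives as 77%. The 300% figure only holds if the full 2 MB module L2 is counted as available to each core; split evenly between the two cores it would be 1 MB per core.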

152 Comments on AMD Orochi "Bulldozer" Die Holds 16 MB Cache

#1
bear jesus
largon said:
Interesting, the fact you commented...
:D

Now that I took another look on the released (heavily pixelated & manipulated) die shot, it might just be Bulldozer's L3 is either just 6MB, or whopping 12MB. That, or the cells in L3 are actually less dense than those in L2.
The one thing i keep thinking about is that i remember articles saying that the cores got photoshopped to hide some things, so wouldn't the die shots be not that informative right now, even for speculation?
#2
de.das.dude
Pro Indian Modder
Gooooooooooooooooooooooooooo amd!!
#3
cheezburger
largon said:
Oh boy... Here we go.

As if they would have had any problems slapping in L1s equally sized to or larger than Hammer's... It's not like this is AMD's first CPU architecture ever, or that adding such an amount would be of any die area concern. And for comparison, Nehalem has 32kB per core, 16kB per thread AND a tiny 256kB L2 - I bet Intel must be struggling with a similar performance hit.
Err... No.
Each Bulldozer module has two set of integer pipelines and both of them have dedicated 16kB L1D. 16+16kB in total per module, 16kB per thread.
Bulldozer's L1I is 64kB, that's been public for some time now. About the bracketed comment; you think it could have been smaller, or you aren't sure what size it is?
If you say so...

And by coincidence, Intel is doing the same. "Obviously" they too must be patching Core m-arch's "poor L1s and L2s" by adding cache levels and continuously increasing their size.

No. Orochi is 4 module, 8 thread core.

Durrr...
Bulldozer does not have a 16MB L3, even reading the thread title should give away the L3 is 8MB. 2MB L2 + 2MB L3 per module, that is. Thus, per module, Orochi has 8× as much L2 vs. Nehalem and equal L3-ratio.

Strange conclusion considering the public, (that includes me and you) don't know Bulldozer's exact pipeline length, yet.

Broken sentence. What are you trying to say?
You do believe it is 20+ stage or you do not?
Also, the clock rates are completely unknown to public.
Oh really? Now one can only wonder why Intel didn't see such a shortcoming of their L2 before taping out Nehalem, Sandy Bridge... They must have missed the fact their chips' L2 had shrunk to a fraction of the size compared to Conroe, Penryn.

PS.
In case you find some parts of my reply sarcastic, it is highly likely you are right.

Abstract for those with the "TL;DR" -syndrome:
Burger, please get your facts straight. The factual errors I've pointed out are public knowledge, go read them. And please do pay attention to writing proper English, often it is impossible to figure out what you're trying to say as many of your sentences are missing words and the words that are there are often misspelled.
that...is the most aggressive post i ever read...

first off intel is not 16kb per thread as you may think, largely a core that can do 2 thread is not necessary divide l1 cache in half as NOT all nehalem processor like i7 that came with hyperthreading...and pretty much you have no idea/no understanding about hyperthreading...hyperthread is pipeline measuring, when hyperthreading enable it will use the unused part of clock cycle/pipeline and simulate a "fake" core during process. which is technically still 32kb l1 data per thread.

the hard fact is bulldozer cores are NOT divide from module, they are the individual core that pair of core are wrap together into each module with a l1 instruction cache in the middle and wiring between two of core. so the instruction cache is uncored for sure(while nehalem's l1/l2 are bulit in each of their core and only left a larger l3 cache outside the core with ringbus connected.) why i said it was 8kb, because it was rumor to be between 8~16kb in early 2009...since most said that it will use far smaller cache than it was on k10 (128kb!!) some site took smaller number but wth? it's still in speculation period and who knows amd might increase their cache to 32k or even 64k l1 data per core? plus even under 16kb, with running 2 threads it will divide cache into 8kbx2 because they don't have hyperthreading like intel had that optimize single core in multi-threading without drop too much of performance...

now before you start hammer me with your ignorance...you have to understand one thing:

UNCORE MEANS ANYTHING THAT IS NOT BUILD INSIDE THE CORE! even they are still in the same die/module it doesn't change the fact these cache are uncore....which i'm not wrong at this point! it makes it look like each core only had 8`16k l1 data while no L1I built in!

before calling me troll you better measure how much you know about miroarchitecture first....



largon said:
You're misinterpreting me. My "2MB L3 per module" is only a way to state a ratio, not actual configuration.


What?
That's just not true. Bulldozer's L1I and L2 are fully integrated parts of the BD module and they run at core freq, and no less.
Bulldozer has 4T L1 latency, same as Nehalem's.
Especially if the one "imagining things" is using incorrect numbers...
What can I say, once again you astound (but not surprise) by posting utter nonsense.
Feeling particularly "blue", perhaps? And by saying that I'm not referring to mood.

But what can you do, a troll is a troll is a troll.
bulldozer has same latency as nehalem? that's news to me...no! amd has long history of bad performance on their cache because of low quality silicon yield during production and bad wiring in die area. what do you think why amd bother go 128k l1 cache design if their cache were so powerful? just so you know in phenom the l1 cache latency can be as high as 10~12 clock per cycle while core 2 is only 2.5clock per cycle. that just not for long ago and what makes you think they are be tweak over night? and with such small cache? it sounded more ridiculous than cayman that only has 32 rops... larger + faster l1 cache means better performance. neither amd will get better result if they keep such garbage design on bulldozer. all they need is just put 128k l1 data in each core and bulldozer will trump nehalem for sure. in IT field TDP means shit! performance is ALWAYS measure by how big die size and how many transistor.it will be stupid because smaller l1 cache = performance loss. also they still had alot of room to put cache in their core because their new core are only 1/2~1/3 of nehalem's single core. if they want to win this they better increase their die size by adding more cache lik 256kb l1D per core +128kL1I, 4mb l2 per module and 32mb share l3 cache and feature 400mm^2


oh about pipeline, you can cehck this:

http://www.amdzone.com/phpbb3/viewtopic.php?f=52&t=137246&sid=1b04a4780b037a6ab2c09efd2ffe3f19&start=350
it was confirm that they are under work on more pipleline and feature higher frequency..don't know if it's true but that will be stupid too

also the l1 cache has confirmed to be 4 cycle rather than previously thought of 3 cycle(worse than p3...). i mean seriously if they can't outperform intel why dont they just increase more cache instead? if 128k can't beat intel's 64k then why not 256 or even 512k on l1 cache?
#4
JF-AMD
AMD Rep (Server)
bear jesus said:
The one thing i keep thinking about is that i remember articles saying that the cores got photoshopped to hide some things, so wouldn't the die shots be not that informative right now, even for speculation?
Everything in the processor was photoshopped, not just the cores.
#5
bear jesus
JF-AMD said:
Everything in the processor was photoshopped, not just the cores.
Thank you for confirming that, i was a little confused as to whether it was certain areas or more. But this just makes me wonder how accurate anyone's speculations on any part of the cpu could be.
#6
Techtu
cheezburger said:
that...is the most aggressive post i ever read...

first off intel is not 16kb per thread as you may think, largely a core that can do 2 thread is not necessary divide l1 cache in half as NOT all nehalem processor like i7 that came with hyperthreading...and pretty much you have no idea/no understanding about hyperthreading...hyperthread is pipeline measuring, when hyperthreading enable it will use the unused part of clock cycle/pipeline and simulate a "fake" core during process. which is technically still 32kb l1 data per thread.

the hard fact is bulldozer cores are NOT divide from module, they are the individual core that pair of core are wrap together into each module with a l1 instruction cache in the middle and wiring between two of core. so the instruction cache is uncored for sure(while nehalem's l1/l2 are bulit in each of their core and only left a larger l3 cache outside the core with ringbus connected.) why i said it was 8kb, because it was rumor to be between 8~16kb in early 2009...since most said that it will use far smaller cache than it was on k10 (128kb!!) some site took smaller number but wth? it's still in speculation period and who knows amd might increase their cache to 32k or even 64k l1 data per core? plus even under 16kb, with running 2 threads it will divide cache into 8kbx2 because they don't have hyperthreading like intel had that optimize single core in multi-threading without drop too much of performance...

now before you start hammer me with your ignorance...you have to understand one thing:

UNCORE MEANS ANYTHING THAT IS NOT BUILD INSIDE THE CORE! even they are still in the same die/module it doesn't change the fact these cache are uncore....which i'm not wrong at this point! it makes it look like each core only had 8`16k l1 data while no L1I built in!

before calling me troll you better measure how much you know about miroarchitecture first....





bulldozer has same latency as nehalem? that's news to me...no! amd has long history of bad performance on their cache because of low quality silicon yield during production and bad wiring in die area. what do you think why amd bother go 128k l1 cache design if their cache were so powerful? just so you know in phenom the l1 cache latency can be as high as 10~12 clock per cycle while core 2 is only 2.5clock per cycle. that just not for long ago and what makes you think they are be tweak over night? and with such small cache? it sounded more ridiculous than cayman that only has 32 rops... larger + faster l1 cache means better performance. neither amd will get better result if they keep such garbage design on bulldozer. all they need is just put 128k l1 data in each core and bulldozer will trump nehalem for sure. in IT field TDP means shit! performance is ALWAYS measure by how big die size and how many transistor.it will be stupid because smaller l1 cache = performance loss. also they still had alot of room to put cache in their core because their new core are only 1/2~1/3 of nehalem's single core. if they want to win this they better increase their die size by adding more cache lik 256kb l1D per core +128kL1I, 4mb l2 per module and 32mb share l3 cache and feature 400mm^2


oh about pipeline, you can cehck this:

http://www.amdzone.com/phpbb3/viewtopic.php?f=52&t=137246&sid=1b04a4780b037a6ab2c09efd2ffe3f19&start=350
it was confirm that they are under work on more pipleline and feature higher frequency..don't know if it's true but that will be stupid too

also the l1 cache has confirmed to be 3 cycle rather than previously thought of 2 cycle. i mean seriously if they can't outperform intel why dont they just increase more cache instead? if 128k can't beat intel's 64k then why not 256 or even 512k on l1 cache?

I'm no expert but are you saying a higher L1 cache is better... I'm sure that isn't the case.

doesn't it go a little something like this...

Smaller L1 cache has better performance over a larger L1 cache.

Larger L2 cache has better performance over a smaller L2 cache.

Same rule as the L2 cache for L3 cache I believe.

EDIT: ignore the above, I realised my mistake shortly after posting.
#7
DigitalUK
JF-AMD said:
Everything in the processor was photoshopped, not just the cores.
lol nice one, JF-AMD is there anything you can tell us that is real about bulldozer that hasn't been speculated on in this thread? obviously without getting yourself in trouble. drooling..:respect:
#8
JF-AMD
AMD Rep (Server)
cheezburger said:
that...is the most aggressive post i ever read...

first off....
I don't even want to try to pick your response apart to pull out the things that you stated (almost as fact) that were a.) pure speculation on the web and b.) wrong.

I would say that you have some things right, but more wrong and your basic understanding of how the processor architecture is built/interacts is off.

I recommend reading the bulldozer blogs, and most importantly the questions at the end (each blog has about 30+ comments where I am answering questions.)

That is probably the best place to start if you truly want to understand the architecture. But, first and foremost, you have to put aside some of the traditional understandings of how things work in the semiconductor world and take into consideration that this is a "new dawn" for processor design and technology. You can't apply old assumptions to a new architecture.
#9
JF-AMD
AMD Rep (Server)
DigitalUK said:
lol nice one, JF-AMD is there anything you can tell us that is real about bulldozer that hasn't been speculated on in this thread? obviously without getting yourself in trouble. drooling..:respect:
Read the bulldozer blogs, especially the "20 questions" blogs. And especially the comments after each blog. I give a LOT of detail there. That should cover everything that is fit to print.
#10
cheezburger
JF-AMD said:
I don't even want to try to pick your response apart to pull out the things that you stated (almost as fact) that were a.) pure speculation on the web and b.) wrong.

I would say that you have some things right, but more wrong and your basic understanding of how the processor architecture is built/interacts is off.

I recommend reading the bulldozer blogs, and most importantly the questions at the end (each blog has about 30+ comments where I am answering questions.)

That is probably the best place to start if you truly want to understand the architecture. But, first and foremost, you have to put aside some of the traditional understandings of how things work in the semiconductor world and take into consideration that this is a "new dawn" for processor design and technology. You can't apply old assumptions to a new architecture.
this is what i got from you..
We get asked that a lot. The key is that a single core that would be able to compete with the throughput of two smaller cores would consume a disproportionate amount of die space and consume more power. Taking Bulldozer and turning each module into one “big core” instead of two cores with some shared resources would net you a disproportionately higher price and disproportionately higher power consumption.

In reality what we are doing is driving efficiency. And don’t worry about the single threaded performance –we have already stated publicly that Bulldozer single threaded performance is expected to be higher than our current core architectures.

so make the point short, you guys make such small individual core inside each module was because of production cost and power consumption? result: amd giving up on high end user...same thing i will be disappoint on hd 6000 if it turns out to be a "mainstream card". along side with"mainstream processor" like this...if such mainstream design can outperform nehalem then i'll never come back to this forum as result of punishment...

however small core strategy is completely wrong. and how much efficiency you can get from a smaller core?
#11
CDdude55
Crazy 4 TPU!!!
JF-AMD said:
Read the bulldozer blogs, especially the "20 questions" blogs. And especially the comments after each blog. I give a LOT of detail there. That should cover everything that is fit to print.
Very nice blog, definitely gonna give it a read.:)
#12
cadaveca
My name is Dave
cheezburger said:
so make the point short, you guys make such small individual core inside each module was because of production cost and power consumption? result: amd giving up on high end user...same thing i will be disappoint on hd 6000 if it turns out to be a "mainstream card". along side with"mainstream processor" like this...if such mainstream design can outperform nehalem then i'll never come back to this forum as result of punishment...
I think the biggest point is that without an actual Bulldozer-based core on your hands, and a supporting motherboard, any claims of greatness, and likewise, any worries, can only be speculation.

I've got a big bone to pick with AMD. A BIG ONE.


But the fact that JF is here says that they really aren't forgetting about the high-end user. Or else he wouldn't be posting.

Like really, many years ago, when the ATI/AMD merger was fresh, AMD quite publicly stated that they were not going to focus on the enthusiast market, and they didn't.

To me, the TWKR chips were AMD waving a big flag..."OK, you guys...we're coming back." (Of course, they never sent me one, so that was a wasted marketing effort, barely see anyone using them, either....)

Since then, Thuban came out. Who do you think those cpus are really targeting?

Give them some time...If I can deal with the crappy vgas and cpu I got now, until new stuff is released, so can anyone else.


CDdude55 said:
Very nice blog, definitely gonna give it a read.:)
:laugh:...I told ya there was some good, official, info on the AMD blog site...
#13
DigitalUK
Thanks JF-AMD, some really good information there, going to take me all evening to go through all that but thanks. it's nice to have some real information from the source.
#14
bear jesus
cadaveca said:
If I can deal with the crappy vgas and cpu I got now, until new stuff is released, so can anyone else.
:laugh::roll::laugh:

really? "Phenom II 965BE @ 3.6ghz/2400NB 300HTT and XFX HD5870 XXX x2 @ 900/1250" is crappy to you? :p

I am still on an old am2+ board with ddr2, and a single 4870..... i'm kind of offended now :p
#15
cadaveca
My name is Dave
:laugh:, well, I'm running triple monitors in Eyefinity. I am unable to get what I consider an acceptable level of performance...I have to sacrifice image quality to do so. I'm sure you've seen at least one of my many posts complaining about my system...it's really unbalanced, to say the least...

That 965BE is probably the worst Phenom2 I've had. I cannot get more than that 3.6ghz stable, due to heat, and when I cool it better, it maxes out @ 3.8ghz. As it is, 3.6ghz requires 1.425, under my current cooling.
Most 965 will do 3.8ghz with far less than my 3.6ghz.

My vgas overheat, my second card is useless except in Eyefinity, and then it only gives a very slight performance boost...like 10%. I spent $450, to get a $45 performance boost.

I'm sure if you were dealing with the issues I am, you would feel the same way. I was much happier with my 955 and 4890's...that seemed like a very well-balanced system, performance-wise.


So, a big part of the performance, or lack thereof, seems to be cpu. So, of course, I'm paying close attention to Bulldozer...I'll probably buy on release.

I guess I'm still an AMD fanboy...I just cannot take my butt to the store, and buy i7. Might fix things for me, but I just can't do it.:shadedshu

:roll:
#16
bear jesus
cadaveca said:
:laugh:, well, I'm running triple monitors in Eyefinity. I am unable to get what I consider an acceptable level of performance...I have to sacrifice image quality to do so. I'm sure you've seen at least one of my many posts complaining about my system...it's really unbalanced, to say the least...

That 965BE is probably the worst Phenom2 I've had. I cannot get more than that 3.6ghz stable, due to heat, and when I cool it better, it maxes out @ 3.8ghz. As it is, 3.6ghz requires 1.425, under my current cooling.

My vgas overheat, my second card is useless except in Eyefinity, and then it only gives a very slight performance boost...like 10%. I spent $450, to get a $45 performance boost.
Ok yes i admit in your situation you are right, and it does kind of suck so i can understand your point now.
Admittedly i'm just running a single 1680x1050 monitor and i was lucky to get a phenom II 965 that will get above 4ghz with enough voltage. apart from a new gpu in a few months i'm more than happy to wait with what i have until bulldozer so i can make my choice of new cpu, motherboard and ram.

*edit* i wish you had got a chip that could get up to about 4ghz or more (i kept trying to get 4.2ghz stable but mine is just not happy with it) as i think you would have been much happier with it.
#17
cadaveca
My name is Dave
Well, you know, I bought the very best AMD had to offer at the time. And I went from a QX9650...down to E8400...then 720BE...and all through that, I was pretty happy with AMD. 720 BE wasn't as fast as E8400 @ 4.0ghz, but it seemed far smoother.

I recently sold a 955 to another member here...it did 4ghz...unfortunately, it wasn't enough, so I moved on to the 965. Of course, the 965 sucked...

I've got a 1090T and 2x GTX480 here too...waiting for Crosshair 4 extreme to build it up. that cpu might do 4ghz...I'm tempted to rip the box open...but maybe I'll just go get another today.

I have to look towards bulldozer. Although i7 might offer a wee bit more performance than I have now, my "high-end" monitor setup seems to require more than i7 can give, even. I need 4.6ghz or so, I think...


Everything about Bulldozer sounds good. In my testing with 965, it seems a big part of my performance issues is not really cpu speed...but the memory controller speed. And that is looking to get a big boost in performance...

AMD has been pretty open with what is physically going to change, but in such a way that the performance that those changes offer doesn't exactly become evident. As JF said, this really seems to be a fundamental shift in cpu design, and I am really hoping that it all falls in line together.
#18
bear jesus
cadaveca said:

AMD has been pretty open with what is physically going to change, but in such a way that the performance that those changes offer doesn't exactly become evident. As JF said, this really seems to be a fundamental shift in cpu design, and I am really hoping that it all falls in line together.
I admit that is one of the reasons i am so willing to wait, i feel there is no point in me going with an i7 with both intel's and amd's next gen coming out relatively soon (considering that i have had most of my setup for several years), i have a lot of hope that bulldozer will be worth the wait.... and if not intel will get my business for the first time in years :roll: although i doubt that as at least for me amd has been giving me the best value for money for years and i don't expect that to change.

Also on another note, i have been reading through JF's blog and have been getting a little too excited from the things i read, i have it bookmarked and will be popping back there very often. I suggest anyone interested in bulldozer takes a look.
#19
cheezburger
Tech2 said:
I'm no expert but are you saying a higher L1 cache is better... I'm sure that isn't the case.

doesn't it go a little something like this...

Smaller L1 cache has better performance over a larger L1 cache.

Larger L2 cache has better performance over a smaller L2 cache.

Same rule as the L2 cache for L3 cache I believe.
what do you think intel/ibm want to go 64k on their l1 cache while they can use the exist 32k on both p6/g4 architecture?? why doubles it? and why amd made such big jump and made world largest l1 cache of 128k when back to k7 era? because L1 is important...if not then why everyone try to enlarge the size? smaller l1 cache can outperform larger l1 cache? yes but depends on clock cycle, latency/missing rate and quality of silicon yield. currently peyrin/wolfdale/nehalem have far lower latency/missing rate and clock cycle than bulldozer's design while it's more than 3 time larger. bulldozer's l1 cache is revealed to be high latency (on amd's exist cache architecture a phenom's 128k l1 cache will be 10~12/cycle 4~5 /cycle on 16kb and 3~4/cycle on 8kb) if they hope it will drop the latency by just reduce the l1 cache will be a stupid idea. of cause unless they want to clock up really high to fix that missing penalty. but it will end up to be another netburst...


cadaveca said:
I think the biggest point is that without an actual Bulldozer-based core on your hands, and a supporting motherboard, any claims of greatness, and likewise, any worries, can only be speculation.

I've got a big bone to pick with AMD. A BIG ONE.


But the fact that JF is here says that they really aren't forgetting about the high-end user. Or else he wouldn't be posting.

Like really, many years ago, when the ATI/AMD merger was fresh, AMD quite publicly stated that they were not going to focus on the enthusiast market, and they didn't.

To me, the TWKR chips were AMD waving a big flag..."OK, you guys...we're coming back." (Of course, they never sent me one, so that was a wasted marketing effort, barely see anyone using them, either....)

Since then, Thuban came out. Who do you think those cpus are really targeting?

Give them some time...If I can deal with the crappy vgas and cpu I got now, until new stuff is released, so can anyone else.




:laugh:...I told ya there was some good, official, info on the AMD blog site...
i think biggest reason is amd actually thought they can tweak performance on cpu just like what they did on gpu....by adding more smaller core on high end product with relative low price..however cpu is nothing like gpu that you can add as much shader to increase performance on r770. but it seem likely bulldozer is a cpu version of r770.......

it was first romor(again rumor rumor rumor!!! don't frame me plz!) that amd going to make the core so small that only 2kb l1 cache per core and use SIMD unit to "chain" core each core, each module will act as SIMD unit and 2 modules will form a SIMD cluster and l1 instruction cache is share by all simd cluster. while the original oroshi will have 8 SIMD cluster(again it was rumor so don't bash or attack me....please!) and each 2 SIMD cluster will share a 256kb l2 cache and all SIMD cluster will share 8mb L3 cache...it was like that in early debut. orochi would end up to be a world first 32 core processor if the concept like this...... if that design was adopted it would be terrible to imagine how bad the single thread performance would be..as that 32 core orochi in fiction timelime was target on quad core nehalem/sandies/ivy

amd did promise they won't go high end design...and they did fulfill that promise completely...hd 5xxx was suppose to be mainstream but due to the fail of fermi which end up become high end line...which it was coincidence...phenom 1/2 was supppose to be mainstream and they did play well on their role...they really need a bigger core design if they want to outpace intel and nv..
#20
JF-AMD
AMD Rep (Server)
OK, let me see if I have all of the facts straight.

You believe that L1 is the make or break feature in an architecture?
You believe the L1 is outside of the core, yet it is inside the core?
You are posting based on rumors and when confronted with the right information from a legitimate source you don't admit you're wrong?
You have ~60 posts, 10 of which are about bulldozer, all are critical of it?
The rest of the posts do a pretty good job of picking on AMD from a graphics front.

OK, I got it now.
#21
cheezburger
JF-AMD said:
OK, let me see if I have all of the facts straight.

You believe that L1 is the make or break feature in an architecture?
You believe the L1 is outside of the core, yet it is inside the core?
You are posting based on rumors and when confronted with the right information from a legitimate source you don't admit you're wrong?
You have ~60 posts, 10 of which are about bulldozer, all are critical of it?
The rest of the posts do a pretty good job of picking on AMD from a graphics front.

OK, I got it now.
i don't use romur as fact but clearly that 16kb l1 data under current design is just too small for today standard. also i did not criticize anything from amd's product line. it just a bit personal complain on amd's current market strategy which only focus on mainstream line. but you know something? amd needs a top end killer cpu/gpu to crush its opponent. since you work in amd you should realize amd had been underdog for 4 straight years and you already know the reason why....

also the l1 instruction cache is 32kb if consider 2 core share it.....
#22
bear jesus
cheezburger said:
i don't use romur as fact but clearly that 16kb l1 data under current design is just too small for today standard. also i did not criticize anything from amd's product line. it just a bit personal complain on amd's current market strategy which only focus on mainstream line. but you know something? amd needs a top end killer cpu/gpu to crush its opponent. since you work in amd you should realize amd had been underdog for 4 straight years and you already know the reason why....
I have to ask, you were so convinced that the new top end amd gpu would have a 512bit memory bus and would be a $600 card yet you call it mainstream, can you point me to some other mainstream cards with a 512bit bus or that cost $600?

I am pretty sure that a lot of your comments are leading people to believe that you are just trolling amd/ati threads.

Really with both bulldozer and the amd 6xxx cards, no one will know anything too specific until closer to the launch date of each.
#23
JF-AMD
AMD Rep (Server)
Since 2 cores are sharing the same instructions much of the time, it can actually be 64K, right? Instruction caches are far less random than data caches.
#24
enaher
Loving the "more is better" part of your blog :rockout:, it gives that famous quote, "real men use real cores", a whole new life. Great blog, JF.
#25
cheezburger
bear jesus said:
I have to ask, you were so convinced that the new top end amd gpu would have a 512bit memory bus and would be a $600 card yet you call it mainstream, can you point me to some other mainstream cards with a 512bit bus or that cost $600?

I am pretty sure that a lot of your comments are leading people to believe that you are just trolling amd/ati threads.

Really with both bulldozer and the amd 6xxx cards, no one will know anything too specific until closer to the launch date of each.
do you believe there source from wiki? second there's still many people rather believe cayman will be a 256bit bus with ridiculously high frequency GDDR5 and 32rops with also ridiculous number of old style 5D shader...also there are many romur as well. first i did believe cayman is going to be a top end card with 512bit bus and 64rops..along with mighty bulldozer....but after seen specification of orochi i starting doubt amd did not change their mainstream strategy at all...which is what i afraid. bulldozer's small core just reminding me of 5D shader on old r600....small core just make bulldozer don't look that mighty anymore...

also are you sure that i only up on amd/ati thread? go see on fermi thread...i made a lot of criticism about fermi/kapler over there and i believe you should take a look.... and stop calling me troll unless you can prove me wrong but then again that is personal offense.