Monday, November 22nd 2010
AMD Cayman, Antilles Specifications Surface
At last, specifications of AMD's elusive Radeon HD 6970 and Radeon HD 6990 graphics accelerators made it to the internet, with slides exposing details such as stream processor count. The Radeon HD 6970 is based on a new 40 nm GPU by AMD, codenamed "Cayman". The dual-GPU accelerator being designed using two Cayman GPUs is codenamed "Antilles", and carries the product name Radeon HD 6990.
Cayman packs 1920 stream processors spread across 30 SIMD engines, indicating the 4D stream processor architecture, and generates single-precision computational power of 3 TFLOPs. It packs 96 TMUs, 128 Z/Stencil ROPs, and 32 color ROPs. Its memory bandwidth of 160 GB/s indicates that it uses a 256-bit wide GDDR5 memory interface. The memory amount, however, seems to have been doubled to 2 GB on the Radeon HD 6970. Antilles uses two of these Cayman GPUs, with a combined computational power of 6 TFLOPs, a total of 3840 stream processors, total memory bandwidth of 307.2 GB/s, 4 GB of total memory, and load and idle board power ratings of 300 W and 30 W, respectively.
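As a rough sanity check on the quoted bandwidth figures, the numbers can be reproduced from the bus width and per-pin data rate. This is a sketch, not from the slides: the memory clocks below are inferred from the bandwidth figures, not confirmed specifications.

```python
def gddr5_bandwidth_gbps(bus_width_bits, memory_clock_mhz):
    """GDDR5 transfers 4 bits per pin per memory clock (quad data rate),
    so effective per-pin rate = 4 x memory clock. Returns GB/s."""
    effective_rate_hz = memory_clock_mhz * 1e6 * 4
    return bus_width_bits / 8 * effective_rate_hz / 1e9

# 160 GB/s on a 256-bit bus implies a 1250 MHz memory clock
# (5 Gbps effective per pin), consistent with GDDR5:
print(gddr5_bandwidth_gbps(256, 1250))      # 160.0

# Antilles' 307.2 GB/s total works out to 153.6 GB/s per GPU,
# i.e. a lower inferred memory clock of 1200 MHz (4.8 Gbps per pin):
print(2 * gddr5_bandwidth_gbps(256, 1200))  # 307.2
```

Interestingly, the Antilles figure implies slightly slower memory per GPU than the single-GPU HD 6970, which would fit the lower board power budget.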
Source:
3DCenter Forum
134 Comments on AMD Cayman, Antilles Specifications Surface
Seems power limiting is the future of this process right now.
Is the first slide really fake, or is one just October and the other November? Oh, delayed, I see.
@bravesoul
the HD5870 was kinda like a dual core gpu.
Either 1920 SP + 120 TMU
or 1536 SP + 96 TMU
TBH the above is probably the reason that different sources have claimed 1920 SP or 1536 SP, depending on the number they chose to believe (1920 SP or 96 TMU).
Number of color and coverage samples can be independently controlled - That sounds interesting.
.edit. Yes, it can be done on monitor too, but it is a nice feature anyway:)
However, back to the subject at hand. I am a little disappointed about the mem bandwidth, however the other improvements might be enough for the 6970 to compete with the 580. Price will indeed be most people's factor when deciding whether or not to get either flagship card.
For me though, I think I will wait it out till the 7 series with my 5850.
i.e. with normal 4xMSAA, where the color and coverage sample counts are the same (4 color & 4 coverage), if a pixel sits between 2 different objects (let's say one is black and the other is white) and 2 out of 4 samples fall in the area that belongs to the black object, the final pixel will be "50% black"; if only one falls in the black object, "25% black", and so on.
The problem is that it is possible for the black object to occupy 40%+ of the "pixel area" while only one color sample (25%) actually falls within it, so the resulting color (25% black) is not accurate. Here is where a higher number of coverage samples comes to the rescue: these samples only take care of approximating how much of the area belongs to object 1 and how much to object 2. Summarizing: color samples determine how many colors/objects there are to choose from, and coverage samples determine how much of those colors is mixed into the final result.
Nvidia has been doing this with its CSAA mode for ages, although with fixed color/coverage ratios, and I think ATI's CFAA mode is the same. Now they are making it possible to choose how many of each developers want to use. A nice addition, but honestly not something that will improve quality a lot, and I don't even think many developers will bother programming their own "mix".
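The color/coverage split described above can be sketched as a toy resolve step. This is a hypothetical illustration of the idea, not actual hardware behavior: each coverage sample records which stored color covers its position, and the final pixel weights each color by its coverage fraction.

```python
def resolve_pixel(color_samples, coverage_samples):
    """color_samples: list of (r, g, b) tuples actually shaded/stored.
    coverage_samples: one index into color_samples per coverage
    position, saying which object's color covers that position.
    The final color weights each stored color by the fraction of
    coverage samples that map to it."""
    n = len(coverage_samples)
    weights = [coverage_samples.count(i) / n for i in range(len(color_samples))]
    return tuple(
        sum(w * c[ch] for w, c in zip(weights, color_samples))
        for ch in range(3)
    )

# 2 color samples (black object, white object), 8 coverage samples:
# 3 of the 8 coverage positions fall on the black object.
black, white = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
pixel = resolve_pixel([black, white], [0, 0, 0, 1, 1, 1, 1, 1])
# pixel is 5/8 white = (0.625, 0.625, 0.625) -- a finer edge gradation
# than plain 4xMSAA, which could only produce multiples of 25%.
```

The point of the extra coverage samples is exactly this: more gradations of edge blending without paying for more shaded color samples.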
I know it's related to Antialiasing. I just think that if i will be able to put "color sample slider" to the Max (or mix them in trial and error way), then it would improve color accuracy of game.
From what you are saying I guess that it's more complicated than that.:(
Thanx for explanation.
I want ! lol
Anyway my reaction:
:confused: Ein? Explain please. What is the relation between the shaders and the polygon power? Theoretical/peak polygon power is doubled because Cayman has twice as many vertex/raster engines compared to the previous gen. What does the shader config have to do with that?
And no, I didn't quote the wrong post. ;) Make of it what you will. :laugh:
Anyway, my post was about the fact that on Evergreen the theoretical/peak polygon power was not used at all; real throughput is much, much lower. Remember that at 1 poly/clock a HD5870 should do 850 million polys/s, or ~14 million polys per frame @60fps. The thing is that it does not do that at all, and even AMD was recommending 16 pixels/poly in order to be optimized. That accounts for less than 0.5 million polys per frame, way below the peak ~14 million; that's like <5% efficiency. If the shaders were the problem, they would not have improved the setup engine; there's a lot of room to increase efficiency from that 5% all the way up to 99%, and looking at the slides it's obvious that Cayman has exactly 2x as much setup power. We are not talking about a bottleneck in the shaders here; it's a very real 2x improvement based on a very specific 2x increase in the setup engine.

It has 14% fewer shaders, but yeah, it's a good point and maybe I exaggerated a bit, although the reason I mistakenly exaggerated is that I was assuming almost perfect efficiency. To answer your question, the explanation of why that happens is easy: AMD's architecture is far from efficient from a utilization POV. Basically it's not the HD6850 that is faster than "it should be", it's the HD6870 that is not as fast as it should be, because it cannot use all its resources as well as the HD6850 does. And this is even more true for the HD5870, which with 1600 SPs "should be" 2x as fast as the HD4890, but isn't.
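The arithmetic in the post above can be redone in a few lines. The clock and the 16-pixel-per-polygon guideline come from the post; the 2560x1600 resolution is an assumption for illustration.

```python
core_clock_hz = 850e6        # HD 5870 core clock, from the post
peak_polys_per_clock = 1     # Evergreen setup engine: 1 triangle/clock
fps = 60

# Theoretical peak polygons per frame at 60 fps:
peak_polys_per_frame = core_clock_hz * peak_polys_per_clock / fps
print(peak_polys_per_frame / 1e6)  # ~14.17 million polys per frame

# AMD's 16-pixel-per-polygon guideline at an assumed 2560x1600:
pixels = 2560 * 1600
practical_polys = pixels / 16
print(practical_polys / 1e6)       # 0.256 million, i.e. under 2% of peak
```

So even at a generous resolution, the recommended workload uses only a small fraction of the setup engine's theoretical throughput, which is the utilization gap the post is describing.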
It was just a comment anyway, and mostly based on the fact that I don't think 6 SIMDs need to be disabled on the first harvested part in order to get good yields. Unless yields are horrible, horrible, horrible. It would mean they are getting almost 6 defects per die, or like 500 per wafer... come on... no way (or does it?).
The 5870 doesn't fit in this example, except to show how a misbalanced GPU arrangement can lead to really big problems... and the 6870 and its higher efficiency serve as the basis. The 5870's setup engine wasn't even good enough to fill the 5870 properly... much of the GPU is idle all the time, even in 3D.
But why was it idle so much? Because only one to three SPs of the 5 in a grouping ever get used.
This inefficiency is what prompts the switch to 4D. But at the same time, scheduling for 5D shaders is far different from 4D, with higher-order math capabilities...
How does the set-up engine affect 4D? Are you serious? What feeds the shaders? Fairy dust and troll hairs?