You're probably right on the schedulers but I'd guess that most of the improvement came from Cayman now having as a base 480 shaders instead of 320.
But that's the point I was making. I don't think Cayman will have 480 stream processors. 4D VLIW processors are as fast as 5D ones while being 10% smaller, AMD says.
Going by that, it's literally impossible for Cayman to have 50% more stream processors, because that would mean a 50% bigger die. Minus the 10% improvement in die area, that would mean it is 40% bigger, which is more than 450 mm^2, and it is said to be < 400 mm^2 (~380 mm^2 seems to be the most common rumor).
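For anyone who wants to check the arithmetic, here is a rough sketch. It assumes the whole die scales with the shader count (the same simplification as the post) and uses the 336 mm^2 Cypress figure quoted in this thread. The function name and the `multiplicative` switch are made up for illustration; the switch just shows that subtracting the 10% or compounding it barely changes the conclusion.

```python
CYPRESS_AREA_MM2 = 336.0  # Cypress die area as quoted in this thread


def scaled_area(base_area, sp_growth, shrink=0.10, multiplicative=True):
    """Scale the whole die by the shader growth, then apply the 4D size saving.

    Assumption: die area is proportional to shader count, which ignores
    the fixed cost of ROPs, memory controller and other non-shader logic.
    """
    if multiplicative:
        # Compound the two effects: 1.5 * 0.9 = 1.35
        factor = (1.0 + sp_growth) * (1.0 - shrink)
    else:
        # Subtract the percentages, as done in the post: 50% - 10% = 40%
        factor = 1.0 + sp_growth - shrink
    return base_area * factor


additive = scaled_area(CYPRESS_AREA_MM2, 0.50, multiplicative=False)
compound = scaled_area(CYPRESS_AREA_MM2, 0.50)
print(f"subtracting the 10%: {additive:.1f} mm^2")
print(f"compounding the 10%: {compound:.1f} mm^2")
```

Either way you land at roughly 454 to 470 mm^2, well above the rumored < 400 mm^2.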
Based on either Barts (224 processors & 255 mm^2) or Cypress (320 SP & 336 mm^2), and once we take the 10% improvement into account, roughly the maximum number of stream processors that a 400 mm^2 Cayman could have is 416. That's 26 SIMD, but I actually think the real Cayman will have 24 SIMD (that'd make it <= 380 mm^2), mainly because it fits the rumors of Cayman having 96 TMU. That's actually a lot, depending on how well the schedulers handle them compared to Cypress. If they do it as well as Barts (I don't think so tho) you have 24 SIMD compared to 14 SIMD on Barts, a 70% improvement. I'm far more realistic than that, so I would expect only a little better scaling than Cypress, and I'd expect a 40% improvement over Barts. I won't believe that scheduling efficiency has improved so much until I see it.
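The same estimate as a rough sketch, under the same assumption that die area scales linearly with processor count. The 16 processors per SIMD is not stated outright but is implied by the numbers above (416 processors = 26 SIMD, 224 processors = 14 SIMD on Barts). Function names are made up for illustration.

```python
import math

UNITS_PER_SIMD = 16   # implied by the thread: 416 / 26 and 224 / 14
SHRINK = 0.10         # AMD's claimed size saving for the 4D design
BUDGET_MM2 = 400.0    # upper bound on Cayman's die area from the rumors

# (processor count, die area in mm^2) as quoted in the post
chips = {"Barts": (224, 255.0), "Cypress": (320, 336.0)}


def max_simds(units, area_mm2, budget=BUDGET_MM2, shrink=SHRINK):
    """Largest whole number of SIMDs that fits in the area budget."""
    area_per_unit = area_mm2 / units * (1.0 - shrink)
    return math.floor(budget / area_per_unit / UNITS_PER_SIMD)


def area_for_simds(simds, units, area_mm2, shrink=SHRINK):
    """Estimated die area for a given SIMD count, scaled from a known chip."""
    return simds * UNITS_PER_SIMD * area_mm2 / units * (1.0 - shrink)


for name, (units, area) in chips.items():
    n = max_simds(units, area)
    print(f"{name}-based: max {n} SIMD ({n * UNITS_PER_SIMD} processors), "
          f"24 SIMD would be ~{area_for_simds(24, units, area):.0f} mm^2")

print(f"24 SIMD vs 14 SIMD on Barts: {24 / 14 - 1:.0%} more")
```

Scaling from Cypress gives the 26 SIMD / 416 figure; scaling from Barts gives 24 SIMD. A 24 SIMD chip comes out somewhere between roughly 363 and 393 mm^2 depending on which chip you scale from.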
That is a really hard estimate indeed. We know that one 4D shader unit is 10% smaller than the old one and we know the die area of the old GPUs, but we don't really know the ratio of I/O system to shaders. Based on your calculations, it looks like you're thinking that the shaders take up only a modest part of the GPU (~66%?), but what if in reality they account for 80-90% of the die area?
For example (I'll start pulling numbers out of thin air because I don't know those ratios either) let's take the 5870. It has 1600 SPs and a die area of 334 mm^2. Let's say that the SPs are 75% of that, that's ~250 mm^2. Add 20% more shaders and you end up at ~300, minus 10% because of the 4D reduction leaves you with 270 mm^2. That plus a beefed up I/O system (the remaining 84 mm^2 made 50% bigger) means 270 + 84*1.5 = 396 mm^2. Of course I think these numbers are complete bull and don't stand behind them. But so are your numbers.
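To see how much the answer depends on the guessed shader share, here is a small sketch of the same calculation with the share as a parameter. All the defaults are the made-up numbers from the post (75% share, 20% more shaders, 10% 4D saving, I/O 50% bigger); the function name is hypothetical.

```python
def cayman_area(die_mm2=334.0, shader_share=0.75, shader_growth=0.20,
                shrink=0.10, io_growth=0.50):
    """Split the die into shaders and everything else, then scale each part.

    Every default here is a guess taken from the post, not a measured value.
    """
    shaders = die_mm2 * shader_share
    io = die_mm2 - shaders
    new_shaders = shaders * (1.0 + shader_growth) * (1.0 - shrink)
    new_io = io * (1.0 + io_growth)
    return new_shaders + new_io


# Sweep the shader share over the range discussed in the thread.
for share in (0.66, 0.75, 0.80, 0.90):
    print(f"shader share {share:.0%}: ~{cayman_area(shader_share=share):.0f} mm^2")
```

With the 75% guess this gives about 396 mm^2, matching the hand calculation (the post rounds 250.5 down to 250). A larger shader share gives a smaller total here only because the I/O part is assumed to grow faster than the shader part.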
I am only trying to make the point that we have yet to see any recent die shots from ATI, so unfortunately we won't really know until the thing is finally released.
That is true, but I'm just making some rough estimates based on AMD's claim and estimating what Cayman cannot be, rather than what it can be. You are portraying an even worse case, and you could be right, but that was not my intention. Certainly the ROPs are twice as wide on Cayman, and I don't know how that could affect their actual size. The memory controller is faster than on Cypress. Remember that the MC on Barts is half the size of the one on Cypress because it's meant for lower-clocked memory, so will the Cayman MC be even bigger than Cypress's? Idk, but everything adds up a little and makes my point even stronger. I don't think Cayman can have 1920 SP, but that's all that I can say.
The slide says >20 SIMD. Translated from marketing talk to plain English, that tells me 2 things. First, they probably have bad yields and they don't know yet if they will have to harvest like Nvidia did with the GTX480. Second, the actual number of SIMDs is not much higher than on Cypress, otherwise they would have said >24 or >28 or whatever, something that sounds far less mediocre than "ok our new shaders are just as fast as old ones and we will have some more of them". Of course it's only my opinion, but remember we are talking about marketing slides: if the chip had 30 SIMDs, or a number in the high 20s, they would mention it.
Also don't forget that this is a "dual engine GPU", so you get 2x the geometry/raster/vertex units (and I'm guessing 2 dispatch units per SIMD tier, for a total of 4?) to feed the SIMDs, so it is in fact more like 15 SIMDs per block of I/O units than 30, which puts it closer to the 6800 series.
Cypress and Barts also have dual engines and dual rasterizers in a sense. Both share a single triangle setup logic, so maybe you can't call them a complete dual engine, but they are mostly there. Maybe the slides are not telling the whole story, but no, from what it looks like there are still only 2 dispatchers.