Monday, November 22nd 2010

Cayman Confirmed To Be Using VLIW4 SP Arrangement, Redesigned ROPs

With the introduction of AMD's Radeon HD 6000 series GPUs, we were led to expect a massive architectural change in the way AMD arranges its unified shaders. That, however, didn't happen with the Radeon HD 6800 series based on the 40 nm "Barts" GPU, which retained the VLIW5 configuration (comprising SIMD units with 4 simple and 1 complex stream processing units). A recent presentation leaked to the internet reveals that the much talked about architectural change was saved for Cayman, the company's upcoming high-end GPU, on which the Radeon HD 6900 series graphics cards will be based.

In the VLIW4 architecture, equipotent stream processing units are arranged in groups of four along with general purpose registers. Although the four have equal capabilities, two of them (occupying issue slots 3 and 4) are also assigned some special functions. AMD appears to be conservative about the benefits of the new SIMD architecture. It claims that VLIW4 delivers computational power similar to that of VLIW5, with a 10% reduction in die area. It also simplifies scheduling.
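
Taken at face value, that claim implies a modest gain in compute per unit of die area. Here is a minimal sketch of the arithmetic, assuming "similar computational power" means equal throughput (normalized to 1.0) and treating the 10% figure as exact. The function name and the normalized units are hypothetical and do not come from AMD's presentation.

```python
# Sketch only: the 10% area saving is AMD's claim as reported above.
# The function name and the normalized units are assumptions for illustration.

def compute_density_gain(relative_compute: float, relative_area: float) -> float:
    """Return compute per unit area of a new design relative to the old one."""
    if relative_area <= 0:
        raise ValueError("relative_area must be positive")
    return relative_compute / relative_area


# VLIW4 vs VLIW5: "similar" compute (assumed equal, 1.0) in 90% of the area.
gain = compute_density_gain(relative_compute=1.0, relative_area=0.90)
print(f"Compute per unit area vs VLIW5: {gain:.3f}x")
```

Under those assumptions the result is roughly 1.11x, which is area AMD could spend on more SIMD units or on a smaller, cheaper die.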

The presentation also provided a glimpse of the overall architecture schematic of Cayman, which reveals a greater level of parallelization compared to Cypress (Radeon HD 5800 series, 5970). While Barts was a step up from the Cypress architecture in assigning an individual dispatch processor to each of the two SIMD engine blocks, Cayman looks to take that a step further with two graphics processing engines (GPEs), each assigned to a SIMD engine block. That effectively means that there are two physical tessellation units on Cayman. Barts, while using a single tessellation unit, improved its efficiency to increase tessellation performance by up to 2x compared to the previous generation (or so AMD claimed). With Cayman having two of these, it could mean a tessellation performance increase of 3-4x compared to the previous generation.
Cayman also features reworked render backends consisting of 128 Z/Stencil ROPs and 32 color ROPs, with up to 2x faster 16-bit integer operations and 2-4x faster 32-bit floating point operations.
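
The 3-4x tessellation estimate above can be reproduced with a rough back-of-the-envelope model. The sketch below assumes each Cayman tessellation unit matches the "up to 2x" per-unit gain claimed for Barts, and that the second unit contributes only a fraction of an ideal doubling. The `scaling_efficiency` parameter and its values are illustrative assumptions, not figures from the leaked presentation.

```python
# Sketch only: per_unit_gain = 2.0 is the "up to 2x" Barts claim reported above.
# scaling_efficiency is a hypothetical knob for how well a second unit scales.

def tessellation_speedup(per_unit_gain: float, units: int,
                         scaling_efficiency: float = 1.0) -> float:
    """Estimate tessellation speedup over the previous generation.

    per_unit_gain:      speedup of a single unit vs the previous generation
    units:              number of physical tessellation units
    scaling_efficiency: fraction (0..1) of an ideal unit that each
                        additional unit actually contributes (assumed)
    """
    if units < 1:
        raise ValueError("units must be at least 1")
    if not 0.0 <= scaling_efficiency <= 1.0:
        raise ValueError("scaling_efficiency must be between 0 and 1")
    effective_units = 1 + (units - 1) * scaling_efficiency
    return per_unit_gain * effective_units


for efficiency in (0.5, 0.75, 1.0):
    estimate = tessellation_speedup(2.0, units=2, scaling_efficiency=efficiency)
    print(f"efficiency {efficiency:.2f}: {estimate:.1f}x")
```

With the second unit scaling at 50% to 100% efficiency, the model spans 3x to 4x, which lines up with the range quoted above. Real-world results would depend on clocks and on how well a given workload keeps both units fed.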


Source: NGOHQ

54 Comments on Cayman Confirmed To Be Using VLIW4 SP Arrangement, Redesigned ROPs

#1
TheMailMan78
Big Member
Weeeee Redesigned ROPs. Can't wait to see benches.
Posted on Reply
#2
Lionheart
Well that answers my complaint about ROPS before :)
Posted on Reply
#3
Bjorn_Of_Iceland
Wow, this new architecture is exciting, hope it has great performance for a good price! Everyone seems to be on the tessellation rush these days.
Posted on Reply
#5
meirb111
From what I remember, whenever anyone said anything about the 6800's
low shader numbers, people said that the 6800 shaders are a new design.
Well, in here it says the 6800 shaders are not a new design.
Posted on Reply
#6
alexsubri
Can't wait to see the 6990s render tessellation in the Heaven 2.1 Benchies! My 5850 took a beating when it was running at max settings :cry:
Posted on Reply
#7
TheMailMan78
Big Member
by: alexsubri
Can't wait to see the 6990s render tessellation in the Heaven 2.1 Benchies! My 5850 took a beating when it was running at max settings :cry:
It's just a bench, man. In-game is all that counts.
Posted on Reply
#8
theoneandonlymrk
Ooooo, they might tempt me away from my wc 5870 yet. VLIW4 and 2x gfx engines with 860 shaders apiece in one chip, nice, and at least some way along the path to the 2000 shaders and four-core gfx chip I proposed/guessed as maybe possible.
Posted on Reply
#9
qubit
Overclocked quantum bit
Looks like it's gonna kick the GTX 580 all around the playground. :rockout:

Looks like we're finally gonna see some head-on competition from the big boys and possibly a price war. Bring it on!
Posted on Reply
#10
_Flare
So we have to think of a 6800's tessellation throughput and simply double it, eh??
6800 = x2 ; 6900 = x4
!! PLUS the decent overall benefit of THE NEW architecture !!

In-game the 6870 is a bit slower than the 5870, does that mean
the 6970 will be a bit slower than the 5970??
Posted on Reply
#11
yogurt_21
by: qubit
Looks like it's gonna kick the GTX 580 all around the playground. :rockout:

Looks like we're finally gonna see some head-on competition from the big boys and possibly a price war. Bring it on!
Well yeah, if "special tasks" equates to complex shader tasks, then we may be seeing 2 complex shaders with 2 moderately complex ones, which would double complex shader performance and increase both minimum frames in all games and overall frames in games such as Metro 2033 and the like.

I know it's layman's terms and is oversimplifying what is a complicated architecture, but if that's true it is going to stomp all over the 580, even if it kept the same or fewer shaders overall than the 5870, so the 1536 shaders/96 TMU number could be correct and this would still be loads faster than the 5870. Though peak frames may be similar, minimum and thus average go up, making for a better gaming experience all around.
Posted on Reply
#12
KainXS
by: _Flare
So we have to think of a 6800's tessellation throughput and simply double it, eh??
6800 = x2 ; 6900 = x4
!! PLUS the decent overall benefit of THE NEW architecture !!

In-game the 6870 is a bit slower than the 5870, does that mean
the 6970 will be a bit slower than the 5970??
No, it's a different architecture and we don't know the clocks yet. It's gonna come down to the clocks this time.
Posted on Reply
#13
Steevo
If they get better utilization of the shaders where two or three would remain unused on games, and two polys per clock, and even a 5% improvement I would guess at least a 15% per clock improvement overall at the same number of shaders, but with the higher number of shaders we could have a 30% improvement in performance easily. So if a improvement of 30% over the 6870...... at higher clocks.
Posted on Reply
#14
HalfAHertz
Depending on the frequency of the core, my prediction is that it will land just under GTX 580 performance levels, but it will do so with a <400mm^2 die and a smaller TDP. Nvidia will still have the fastest single-core card, but it will be sort of a Pyrrhic victory...

AMDs aforementioned TDP limiter will play a big role here tho. It will be a make or break feature.
Posted on Reply
#15
JATownes
I knew it was a good idea to wait for the 6970. I have been sitting on cash waiting for these things to drop. I am guessing $350-$400, I hope. I will take two please. :D
Posted on Reply
#16
mechtech
But will it play CS 1.6??? ;)
Posted on Reply
#17
Vancha
by: mechtech
But will it play CS 1.6??? ;)
Woah now. Let's not get ahead of ourselves.
Posted on Reply
#18
wolf
Performance Enthusiast
equipotent
bear my first born son btarunr.
Posted on Reply
#19
pantherx12
by: meirb111
From what I remember, whenever anyone said anything about the 6800's
low shader numbers, people said that the 6800 shaders are a new design.
Well, in here it says the 6800 shaders are not a new design.
Those were rumours. It is: 5800 is 5D, 6900 is 4D.

I really hope this isn't another 2900 with epic specs and disappointing results :laugh:


Cos I'm excited to see AMD potentially competing and maybe even having the top end spot for once.
Posted on Reply
#20
wolf
Performance Enthusiast
by: pantherx12
Cos I'm excited to see AMD potentially competing and maybe even having the top end spot for once.
they have the possible advantage here since the GTX580 already launched, and they know its performance, they can clock this chip to compete, since clockspeeds don't seem to be finalised yet.
Posted on Reply
#21
TheMailMan78
Big Member
by: wolf
bear my first born son btarunr.
I don't even know WTF is going on here but I lol'd
Posted on Reply
#22
pantherx12
by: wolf
they have the possible advantage here since the GTX580 already launched, and they know its performance, they can clock this chip to compete, since clockspeeds don't seem to be finalised yet.
Aye I did think that was the real reason for delay rather than a component shortage actually.

I mean they could pull an extra 12% performance out of their arse ( for lack of a better phrase) Just from going from 800 to 900 core.

Would the increased triangle per clock mean improved benefits for tessellation with over-clocking? ( I.E is it like overclocking gddr5? you get 4x the powa!, only with this 2x the powa! )
Posted on Reply
#23
n-ster
I hope the 69XX sucks so that I feel less bad about buying a 6870 instead of waiting
Posted on Reply
#24
Fourstaff
by: TheMailMan78
It's just a bench, man.
:shadedshu I expect a formal resignation from TPU now.

When do we see the actual cards again?
Posted on Reply
#25
TheMailMan78
Big Member
by: Fourstaff
:shadedshu I expect a formal resignation from TPU now.

When do we see the actual cards again?
Benches are nice but um, don't we game and fold more than bench?
Posted on Reply