Monday, November 22nd 2010

Cayman Confirmed To Be Using VLIW4 SP Arrangement, Redesigned ROPs

With the introduction of AMD's Radeon HD 6000 series GPUs, we were led to expect a massive architectural change in the way AMD arranges its unified shaders. That, however, didn't happen with the Radeon HD 6800 series based on the 40 nm "Barts" GPU, which retained the VLIW5 configuration (SIMD units comprising 4 simple and 1 complex stream processing units). A recent presentation leaked to the internet reveals that the much-talked-about architectural change was saved for Cayman, the company's upcoming high-end GPU, on which Radeon HD 6900 series graphics cards will be based.

In the VLIW4 architecture, equipotent stream processing units are arranged in groups of four along with general purpose registers. Although the four have equal capabilities, two of them (occupying issue slots 3 and 4) are additionally assigned some special functions. AMD looks to be conservative about the benefits of the new SIMD architecture. It claims that VLIW4 gives computational power similar to VLIW5, with a 10% reduction in die area. It also simplifies scheduling.
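To make the scheduling point concrete, here is a toy Python sketch of how issue-slot utilization can differ between a 5-wide and a 4-wide bundle. It is only an illustration under stated assumptions: the `pack` helper and the `workload` numbers are invented for this example and do not come from AMD's presentation or its shader compiler.

```python
from math import ceil


def pack(independent_ops_per_step, width):
    """Toy model: pack ops into VLIW bundles of the given width.

    Each entry is the number of independent ops available at one
    dependency step; ops from different steps cannot share a bundle.
    Returns (bundle_count, slot_utilization).
    """
    bundles = sum(ceil(k / width) for k in independent_ops_per_step)
    ops = sum(independent_ops_per_step)
    return bundles, ops / (bundles * width)


# Hypothetical shader: independent ops available at each dependency step.
workload = [4, 3, 4, 1, 2, 4, 3, 4]

for width in (5, 4):
    bundles, utilization = pack(workload, width)
    print(f"VLIW{width}: {bundles} bundles, {utilization:.1%} of slots used")
```

With a workload that rarely offers five independent operations at once, both widths need the same number of bundles, so the fifth slot mostly sits idle. That is one plausible reading of why a narrower unit could deliver similar throughput in less die area, though real shader code will behave differently.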
The presentation also provided a glimpse of the overall architecture schematic of Cayman, which reveals a greater level of parallelization compared to Cypress (Radeon HD 5800 series, 5970). While Barts was a step up from the Cypress architecture in assigning individual dispatch processors to each of the two SIMD engine blocks, Cayman looks to take that a step further with two graphics processing engines (GPEs), each assigned to a SIMD engine block. That effectively means that there are two physical tessellation units on Cayman. Barts, while using a single tessellation unit, improved its efficiency to increase tessellation performance by up to 2x compared to the previous generation (or so AMD claimed). With Cayman having two of these, it could mean a tessellation performance increase of 3-4x compared to the previous generation.
Cayman also features reworked render backends consisting of 128 Z/Stencil ROPs, and 32 color ROPs, with up to 2x faster 16-bit integer operations and 2-4x faster 32-bit floating point operations.
Source: NGOHQ

54 Comments on Cayman Confirmed To Be Using VLIW4 SP Arrangement, Redesigned ROPs

#26
Fourstaff
TheMailMan78: Benches are nice but um don't we game and fold more than bench?
Sure you do, but to say "its just a bench" is a bit like saying "why do we upgrade" in TPU, I thought you understand that.
#27
TheMailMan78
Big Member
Fourstaff: Sure you do, but to say "its just a bench" is a bit like saying "why do we upgrade" in TPU, I thought you understand that.
:laugh: I do man. Just a little melancholy today. Sorry.
#28
DRDNA
I care about the benchies! I care mostly about the benchies and then about some games but mostly about the benchies!:toast:
#29
f22a4bandit
Good thing I have a part time job. I can actually save up money for one of these 69xx bad boys! I'm excited to see the performance of these cards, and of course W1zz's reviews!
#30
Unregistered
They also say that they double the tessellation performance. I really want to see that!



Some additional pics:





And new type of AA. This is interesting:
#31
OneCool
Should be fun to see W1z review :)


There better be a review with crossfire too :mad:
#32
HalfAHertz
It would be cool if AMD could use the TDP limiter in a way similar to the one on Bulldozer.

It's basically like an advanced turbo core technology and the principle is pretty straight forward: If you can change frequency fast enough and if your workload is varied, there are certain moments where you can raise frequency for "free" while at the same time still be under the TDP limit.
The only problem here would of course be uneven performance spikes you would get, so that's probably a bad idea for games, where you want steady fps, but a great idea for HPC programs where you aim for maximum efficiency...
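A rough Python sketch of that principle, for illustration only: at each moment, pick the highest clock whose estimated power still fits under the TDP. The linear power model, the `pick_clock` helper, and every number below are assumptions made up for this example, not how AMD's limiter (on Bulldozer or on Cayman) actually works.

```python
def pick_clock(activity, tdp_watts, clocks_mhz, watts_per_mhz_full_load):
    """Return the highest clock whose estimated power stays within the TDP.

    Assumed model: power = activity * clock * watts_per_mhz_full_load,
    where activity is the fraction of the chip that is busy (0.0 to 1.0).
    Falls back to the lowest clock if nothing fits.
    """
    allowed = [
        clock for clock in clocks_mhz
        if activity * clock * watts_per_mhz_full_load <= tdp_watts
    ]
    return max(allowed) if allowed else min(clocks_mhz)


CLOCKS_MHZ = [700, 800, 880, 950]  # hypothetical clock steps
TDP_WATTS = 200.0                  # hypothetical power budget
WATTS_PER_MHZ = 0.25               # hypothetical full-load cost per MHz

# A varied workload: lighter moments get a higher clock "for free".
for activity in (1.0, 0.9, 0.8, 1.2):
    clock = pick_clock(activity, TDP_WATTS, CLOCKS_MHZ, WATTS_PER_MHZ)
    print(f"activity {activity:.1f} -> {clock} MHz")
```

The clock moving up and down with load is exactly the uneven performance described above, which is why this would suit throughput-oriented work better than games.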
#33
bear jesus
n-ster: I hope the 69XX suck so that I feel less worse for buying a 6870 instead of waiting
Just sell it :p

I intend to sell my 6870's and buy a 6970 so i can get back to using a single card/chip, even before knowing the performance of it the 6970 is the only real option for me as i want a single card/chip to run 3 monitors and the fact that a 6870 is almost enough it just needs to be slightly faster with more memory for my needs..
#34
f22a4bandit
I wonder if the 69xx series will improve upon F@H for all of those members that are concerned with that. I know that AMD (formerly ATI) cards lack the power in that application, and it'd be nice to those that do prefer AMD over Nvidia to have a card that can perform on par in F@H. A level playing field gives power to the consumer.
#35
bear jesus
f22a4bandit: I wonder if the 69xx series will improve upon F@H for all of those members that are concerned with that. I know that AMD (formerly ATI) cards lack the power in that application, and it'd be nice to those that do prefer AMD over Nvidia to have a card that can perform on par in F@H. A level playing field gives power to the consumer.
I have almost given up on the idea of ATI/AMD cards being useful for folding until a client comes out that can put any of the cards from the past several generations to use.

I want an nvidia gpu for my htpc just so i can do a useful amount of folding.
#36
alexsubri
So wait, if it's under NDA until 22nd Nov (Here in Jersey its 8:27 PM EST) , but when will we see the benchies? After NDA or they still haven't sent the cards out yet to the reviewers (TPU, HardOCP, etc)
#37
DriedFrogPills
f22a4bandit: I wonder if the 69xx series will improve upon F@H for all of those members that are concerned with that. I know that AMD (formerly ATI) cards lack the power in that application, and it'd be nice to those that do prefer AMD over Nvidia to have a card that can perform on par in F@H. A level playing field gives power to the consumer.
this will only happen when Stanford finally get around to writing an OpenCL client. GPU3 is currently CUDA-based, and GPU2 only uses about 320 SPs on ATI cards
#38
wolf
Performance Enthusiast
TheMailMan78: I don't even know WTF is going on here but I lol'd
he used one of the coolest words I have ever heard (equipotent) so he must be mine forever.
#39
crazyeyesreaper
Not a Moderator
NDA was extended to Dec 13th i thought? or somewhere around that time
#40
alexsubri
crazyeyesreaper: NDA was extended to Dec 13th i thought? or somewhere around that time
They better not delay it again! I can't wait to see these beasts
#41
n-ster
bear jesus: Just sell it :p

I intend to sell my 6870's and buy a 6970 so i can get back to using a single card/chip, even before knowing the performance of it the 6970 is the only real option for me as i want a single card/chip to run 3 monitors and the fact that a 6870 is almost enough it just needs to be slightly faster with more memory for my needs..
I'm receiving my card Friday :p

hopefully 6870s in CFX do respectably vs the price they cost
#42
f22a4bandit
bear jesus: I have almost given up on the idea of ATI/AMD cards being useful for folding until a client comes out that can put any of the cards from the past several generations to use.

I want an nvidia gpu for my htpc just so i can do a useful amount of folding.
DriedFrogPills: this will only happen when Stanford finally get around to writing an OpenCL client. GPU3 is currently CUDA-based, and GPU2 only uses about 320 SPs on ATI cards
Oh okay, didn't know the technical aspects of the client, thanks for the heads up!
#43
bear jesus
n-ster: I'm receiving my card Friday :p

hopefully 6870s in CFX do respectably vs the price they cost
Oh i didn't realize :laugh: for a while i only had one card in use and to be honest i have been surprised at how well a single card runs the games i play at 5040x1050 and for most of the games crossfire is overkill.

Also after a quick google there are some sites showing 6870 crossfire vs 580 reviews, and 6870 crossfire beats the 580 in multiple games; not in every game, but enough to say that 6870 crossfire is around the performance of a 580. Going by prices from Scan (I'm in the UK) it's cheaper, £365 for 6870 crossfire versus £400 for a 580, so I'm pretty sure you will be happy with your purchase.
#44
Unregistered
so this power containment thing means that if we, let's say, use AMD OverDrive to set the TDP to 170 W, then the card will adjust its core clock and SPs to meet the TDP?
#45
Hayder_Master
i hope it's have better performance with high AA too, the 5xxx series fail with 32AA and sometimes even with 16AA
if they success with this old weak points it will be great, but it's still one more thing they should think about it which is physics
#46
bear jesus
hayder.master: i hope it's have better performance with high AA too, the 5xxx series fail with 32AA and sometimes even with 16AA
if they success with this old weak points it will be great, but it's still one more thing they should think about it which is physics
Would not most of the cards run out of memory when using things like x32AA? Also would it not make more sense to be using opencl based physics thus it's not really down to AMD as they already support it and so do nvidia.
#47
Bjorn_Of_Iceland
TheMailMan78: Its just a bench man. In game is all that counts.
You can always play heaven and imagine there are some enemies.. like what i do sometimes :(
#48
Over_Lord
News Editor
TheMailMan78: Weeeee Redesigned ROPs. Can't wait to see benches.
no wonder they are sticking for old 5GT memory if the previous news is correct
#49
Steevo
hayder.master: i hope it's have better performance with high AA too, the 5xxx series fail with 32AA and sometimes even with 16AA
if they success with this old weak points it will be great, but it's still one more thing they should think about it which is physics
It has been shown that physics runs fine on a CPU; even PhysX will run just as fast on a CPU as it does on a GPU when they use current coding. With the current move to multi-core processors, running a dedicated thread for physics is a simple fix, or just adhering to the DX11 standard would be great.

As resolution increases the need for AA decreases, and the form in which AA is performed needs to evolve, no more overdrawing, cause at anything above 1680 it becomes too burdensome to perform with any sort of performance.
#50
bogie
So when is the HD6970 gonna be released?