Monday, November 22nd 2010
Cayman Confirmed To Be Using VLIW4 SP Arrangement, Redesigned ROPs
With the introduction of AMD's Radeon HD 6000 series GPUs, we were led to expect a massive architectural change in the way AMD arranges its unified shaders. That, however, didn't happen with the Radeon HD 6800 series based on the 40 nm "Barts" GPU, which retained the VLIW5 configuration (SIMD units comprising four simple and one complex stream processing unit). A recent presentation leaked to the internet reveals that the much talked about architectural change was saved for Cayman, the company's upcoming high-end GPU, on which the Radeon HD 6900 series graphics cards will be based.
In the VLIW4 architecture, equipotent stream processing units are arranged in groups of four along with general purpose registers. Although the four have equal capabilities, two of them (those occupying issue slots 3 and 4) are assigned some special functions. AMD appears conservative about the benefits of the new SIMD architecture. It claims that VLIW4 delivers computational power similar to VLIW5 with a 10% reduction in die area. It also simplifies scheduling.
The presentation also provided a glimpse of the overall architecture schematic of Cayman, which reveals a greater level of parallelization compared to Cypress (Radeon HD 5800 series, 5970). While Barts was a step up from the Cypress architecture in assigning individual dispatch processors to each of the two SIMD engine blocks, Cayman looks to take that a step further with two graphics processing engines (GPEs), each assigned to an SIMD engine block. That effectively means that there are two physical tessellation units on Cayman. Barts, while using a single tessellation unit, improved its efficiency to increase tessellation performance by up to 2x compared to the previous generation (or so AMD claimed). With Cayman having two of these, it could mean a tessellation performance increase of 3-4x compared to the previous generation.
Cayman also features reworked render backends consisting of 128 Z/Stencil ROPs and 32 color ROPs, with up to 2x faster 16-bit integer operations and 2-4x faster 32-bit floating point operations.
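As a rough sanity check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function names are hypothetical, "similar computational power" is assumed to mean equal power, and each of Cayman's two tessellation units is assumed to match the improved Barts unit, neither of which the leaked presentation confirms.

```python
# Back-of-the-envelope arithmetic for the claims quoted in the article.
# Function names are hypothetical; inputs are the article's figures plus
# the simplifying assumptions noted in the comments.

def compute_per_area(relative_perf: float, area_reduction: float) -> float:
    """Compute density relative to the baseline (1.0 means unchanged)."""
    return relative_perf / (1.0 - area_reduction)

def ideal_tessellation_speedup(units: int, per_unit_gain: float) -> float:
    """Upper bound, assuming throughput scales linearly with unit count."""
    return units * per_unit_gain

# Assumption: "similar" compute is treated as exactly equal (1.0),
# with the claimed 10% smaller die area for the shader arrangement.
density = compute_per_area(relative_perf=1.0, area_reduction=0.10)

# Assumption: two units, each as fast as the Barts unit (up to 2x Cypress).
tess = ideal_tessellation_speedup(units=2, per_unit_gain=2.0)

print(f"Compute per unit area vs. VLIW5: {density:.2f}x")
print(f"Ideal tessellation speedup vs. previous generation: {tess:.1f}x")
```

Under these assumptions the area saving works out to roughly 11% more compute per unit of die area, and the ideal tessellation figure is the top of the 3-4x range mentioned above; real scaling across two units would likely fall somewhat short of linear.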
Source:
NGOHQ
54 Comments on Cayman Confirmed To Be Using VLIW4 SP Arrangement, Redesigned ROPs
All of AMD's new cards use these parts. If there was a real shortage, there'd be no 6780/6850's, either. Oh, and GTX580 too.
Whatever the reason I'm sure AMD has no reason to be 100% honest about it as the more they say the more likely it could be to lead to negative press.