Architecture
The GeForce GTX 1080 is built around the GP104, NVIDIA's second-biggest GPU on the "Pascal" architecture. The biggest is the GP100, which drives the Tesla P100 HPC processor. The GP104 succeeds the GM204 (GTX 980, GTX 970), and despite a smaller die, at 314 mm² compared to the 398 mm² of the GM204, it packs a significantly higher transistor count of 7.2 billion compared to the GM204's 5.2 billion. This is due to NVIDIA's big move from the 28 nm process to 16 nm FinFET.
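To put the die-size and transistor figures in perspective, here is a quick back-of-the-envelope sketch of transistor density for both chips. It uses only the numbers quoted above; the helper name `density_mtr_per_mm2` is our own and purely illustrative.

```python
def density_mtr_per_mm2(transistors_billion, die_area_mm2):
    """Return transistor density in millions of transistors per mm²."""
    return transistors_billion * 1000.0 / die_area_mm2

# Figures quoted in the text above.
gp104_density = density_mtr_per_mm2(7.2, 314.0)  # GP104, 16 nm FinFET
gm204_density = density_mtr_per_mm2(5.2, 398.0)  # GM204

# How much denser the newer chip is, as a simple ratio.
density_ratio = gp104_density / gm204_density

print(f"GP104: {gp104_density:.1f} MTr/mm²")
print(f"GM204: {gm204_density:.1f} MTr/mm²")
print(f"Density ratio: {density_ratio:.2f}x")
```

By this rough measure, the GP104 squeezes in about 1.75 times as many transistors per unit of die area as the GM204.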
With each successive architecture since "Fermi," NVIDIA has been enriching the streaming multiprocessor (SM) by adding more dedicated resources and reducing shared resources within the graphics processing cluster (GPC), which leads to big performance gains. The story continues with "Pascal." Like the GM204 before it, the GP104 features four GPCs, super-specialized subunits of the GPU that share the PCI-Express 3.0 x16 host interface and the 256-bit GDDR5X memory interface through eight controllers. These controllers support both GDDR5X and GDDR5 memory, and the upcoming GeForce GTX 1070 is expected to feature 7 Gbps GDDR5 memory instead of the 10 Gbps GDDR5X on the GTX 1080.
Workload is distributed across the four GPCs by the GigaThread Engine, cushioned by 2 MB of cache. Each GPC holds five streaming multiprocessors (SMs), an increase from the four SMs each GPC held on the GM204. The GPC shares a raster engine between these five SMs. The "Pascal" streaming multiprocessor features a 4th generation PolyMorph Engine, a component for key render setup operations. With "Pascal," the PolyMorph Engine includes specialized hardware for the new Simultaneous MultiProjection feature. Each SM also holds a block of eight TMUs.
Each SM continues to feature 128 CUDA cores. With 20 SMs across its four GPCs, the GP104 hence features a total of 2,560 CUDA cores. Other vital specifications include 160 TMUs and 64 ROPs. NVIDIA claims to have worked on a new GPU internal circuit design and board channel paths to facilitate significantly higher clock speeds than what the GM204 is capable of. The GeForce GTX 1080 ships with a staggering 1607 MHz GPU clock speed and a GPU Boost frequency of 1733 MHz. At its launch event earlier this month, NVIDIA demonstrated a 2.1 GHz GPU overclock on the reference-design board with stock air-cooling, so we know this GPU likes to overclock.
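The unit totals follow directly from the per-GPC and per-SM figures given above. The short sketch below simply multiplies them out and also works out the headroom between the quoted GPU clock and GPU Boost frequency; the constant and variable names are ours, chosen for illustration.

```python
# Unit counts quoted in the text for the GP104 (GTX 1080).
GPCS = 4
SMS_PER_GPC = 5
CUDA_CORES_PER_SM = 128
TMUS_PER_SM = 8

# Clock speeds quoted in the text, in MHz.
GPU_CLOCK_MHZ = 1607
GPU_BOOST_MHZ = 1733

total_sms = GPCS * SMS_PER_GPC
total_cuda_cores = total_sms * CUDA_CORES_PER_SM
total_tmus = total_sms * TMUS_PER_SM

# Percentage by which the Boost frequency exceeds the GPU clock.
boost_uplift_pct = (GPU_BOOST_MHZ / GPU_CLOCK_MHZ - 1.0) * 100.0

print(f"SMs: {total_sms}")
print(f"CUDA cores: {total_cuda_cores}")
print(f"TMUs: {total_tmus}")
print(f"Boost uplift over GPU clock: {boost_uplift_pct:.1f}%")
```

This reproduces the 2,560 CUDA cores and 160 TMUs listed above, and shows the Boost frequency sitting a little under 8 percent above the GPU clock.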
The GeForce GTX 1080 is the first graphics card to use the new GDDR5X memory standard. The interface enables effective data-rates as high as 14 Gbps, and although it has many bare-metal specifications in common with GDDR5, minimizing R&D for its implementation, the memory chip design is improved with higher pin counts to support these higher data-rates. On the GTX 1080, the memory runs at an effective data-rate of 10 Gbps. Over a 256-bit memory interface, this works out to a memory bandwidth of 320 GB/s. NVIDIA has also optimized memory bandwidth usage with more advanced 4th generation lossless Delta Color Compression. The best-case scenario has Delta Color Compression provide an "effective" memory bandwidth uplift of 20 percent, which results in 384 GB/s.
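The bandwidth figures above come from simple arithmetic: bus width in bits times the per-pin data-rate, divided by eight to convert bits to bytes. Here is a minimal sketch of that calculation; the function name `bandwidth_gbs` is hypothetical, and the 20 percent compression uplift is NVIDIA's best-case claim, not a measured value.

```python
def bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8.0

BUS_WIDTH_BITS = 256

# GTX 1080: 10 Gbps GDDR5X.
gtx1080_bw = bandwidth_gbs(BUS_WIDTH_BITS, 10.0)

# Best-case "effective" bandwidth with the claimed 20 percent
# Delta Color Compression uplift.
gtx1080_effective_bw = gtx1080_bw * 1.20

# GTX 1070 as expected at the time of writing: 7 Gbps GDDR5.
gtx1070_expected_bw = bandwidth_gbs(BUS_WIDTH_BITS, 7.0)

print(f"GTX 1080: {gtx1080_bw:.0f} GB/s")
print(f"GTX 1080 effective (best case): {gtx1080_effective_bw:.0f} GB/s")
print(f"GTX 1070 (expected 7 Gbps GDDR5): {gtx1070_expected_bw:.0f} GB/s")
```

The same formula shows why the memory type matters: on an identical 256-bit bus, 7 Gbps GDDR5 would top out at 224 GB/s, against 320 GB/s for 10 Gbps GDDR5X.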
The "Pascal" architecture supports Asynchronous Compute as standardized by Microsoft. It also adds its own variation of the concept, called "Dynamic Load Balancing."