Architecture
NVIDIA's primary design goal with the GM107 is to up the performance-per-watt game. If NVIDIA achieves a significant performance-per-watt gain with "Maxwell" over "Kepler," it can trade that gain for performance on bigger "Maxwell" chips bolstered by the 20 nm process. It's this fact that makes the GM107 academically interesting.
As we mentioned earlier, the GM107 essentially features the same component hierarchy as previous-generation "Kepler" GPUs, but introduces changes to the design of the streaming multiprocessor (SMM), the parallel-processing sub-unit of the GPU. At its outermost ring, the GM107 features the GigaThread Engine, a component that marshals data and instructions between the graphics processing cluster (GPC), the raster operations processors (ROPs), the L2 cache, the memory controllers, the bus interface, and the display I/O.
Several GPCs can typically be routed to a GigaThread Engine, but being a mid-range GPU, the GM107 features just one. This GPC features a raster engine that handles high-level assembly of data and instructions, and five streaming multiprocessors (SMMs), which is where the number crunching takes place. Unlike the streaming multiprocessors (SMXs) of "Kepler" GPUs, which feature a single, unpartitioned group of 192 CUDA cores, the SMM features four groups of 32 cores each, for a total of 128 per SMM. Each of the four groups features dedicated warp schedulers and registers, with a texture cache cushioning transfers between the groups and the TMUs. The four groups share a Polymorph Engine, which features components such as the tessellator, fetch, setup, transform, and stream output.
The GM107 hence features a total of 640 CUDA cores and 40 TMUs. On the GTX 750, one of the five SMMs is disabled, so the card features a total of 512 CUDA cores and 32 TMUs. At a higher level, the chip features 16 color ROPs and a 128-bit wide GDDR5 memory interface. The other big difference is the memory amount: the GTX 750 features only 1 GB.
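The CUDA core counts quoted above follow directly from the SMM layout. As a rough sketch of that arithmetic (the class and field names here are hypothetical and chosen purely for illustration; only the per-unit figures come from the text):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaxwellChip:
    """Illustrative model of the GPC -> SMM -> core-group hierarchy."""
    gpcs: int                  # graphics processing clusters on the die
    smms_per_gpc: int          # streaming multiprocessors per GPC
    disabled_smms: int = 0     # SMMs fused off on a cut-down card
    groups_per_smm: int = 4    # core groups per SMM, as described above
    cores_per_group: int = 32  # CUDA cores per group, as described above

    @property
    def active_smms(self) -> int:
        return self.gpcs * self.smms_per_gpc - self.disabled_smms

    @property
    def cores_per_smm(self) -> int:
        return self.groups_per_smm * self.cores_per_group

    @property
    def cuda_cores(self) -> int:
        return self.active_smms * self.cores_per_smm


# Full GM107: one GPC with five SMMs.
gm107 = MaxwellChip(gpcs=1, smms_per_gpc=5)
# GTX 750: same die with one of the five SMMs disabled.
gtx_750 = MaxwellChip(gpcs=1, smms_per_gpc=5, disabled_smms=1)

print(gm107.cores_per_smm)  # 128 cores per SMM (vs. 192 per "Kepler" SMX)
print(gm107.cuda_cores)     # 640
print(gtx_750.cuda_cores)   # 512
```

Modeling the disabled SMM as a simple subtraction keeps the full chip and the cut-down card on the same footing, which is all this comparison needs.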