Huh. Very interesting, those diagrams. It seems each maker has positioned its tessellator at a different point in the rendering path, although ATI's diagram is quite abstract, whereas the Fermi diagram almost resembles a die shot.
That might translate into programmers having to choose a programming approach that's ideal for one and not the other...ugh...maybe ATI's new gen can overcome that.
Those diagrams might kinda show why tessellation currently comes with such a performance drop, too...it seems part of nVidia's strength here is due to the tessellator's location within the rendering path, and not just the number of units.
Thanks again for the wonderful explanation!