
On NVIDIA's Tile-Based Rendering

Discussion in 'News' started by Raevenlord, Mar 1, 2017.

  1. Raevenlord

    Raevenlord News Editor Staff Member

    Joined:
    Aug 12, 2016
    Messages:
    978 (2.63/day)
    Thanks Received:
    954
    Location:
    Portugal
    Looking back on NVIDIA's GDC presentation, perhaps one of the most interesting aspects touched upon was the implementation of tile-based rendering in NVIDIA's Maxwell and later architectures. This is an adaptation of an approach typically used in mobile graphics, where power efficiency is paramount - and if you'll "member", "Maxwell" was NVIDIA's first graphics architecture publicly touted for its "mobile first" design.

    This approach essentially divides the screen into tiles and then rasterizes the frame on a per-tile basis. 16×16 and 32×32 pixels are the usual tile sizes, but both Maxwell and Pascal can dynamically assess the required tile size for each frame, changing it on the fly according to the complexity of the scene. The aim is to give the processed data a much smaller footprint than that of full-frame rendering - small enough that NVIDIA can keep it in a much smaller pool of memory (essentially, the L2 cache), dynamically filling and flushing that cache until the full frame has been rendered. This means the GPU doesn't have to access larger, slower memory pools as often, which reduces the load on the VRAM subsystem (freeing up memory bandwidth for other tasks) while simultaneously accelerating rendering. At the same time, a tile-based approach lends itself well to the nature of GPUs: tiles are easily parallelized, with the GPU able to tackle many independent tiles simultaneously, depending on the available resources.
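    To make the idea concrete, here is a minimal C++ sketch of the per-tile flow described above (bin, rasterize on chip, flush). It is purely illustrative - the fixed 32×32 tile size, the color/depth formats and the deliberately simplified bounding-box "rasterizer" are assumptions made for the sake of the example, not NVIDIA's actual implementation:

    // Illustrative per-tile rasterization sketch - not NVIDIA's actual implementation.
    // Assumptions: fixed 32x32 tiles, RGBA8 color + float depth per pixel, and a
    // simplified "rasterizer" that just fills a triangle's bounding box with a depth test.
    #include <algorithm>
    #include <cstdint>
    #include <vector>

    constexpr int TILE = 32;                    // tile edge in pixels; real GPUs size this dynamically
    constexpr int WIDTH = 1920, HEIGHT = 1080;

    struct Triangle { float minX, minY, maxX, maxY, z; uint32_t color; };

    struct TileBuffer {                         // on-chip working set for one tile
        uint32_t color[TILE * TILE];
        float    depth[TILE * TILE];
    };

    static bool overlapsTile(const Triangle& t, int tx, int ty) {
        return t.maxX >= tx && t.minX < tx + TILE &&
               t.maxY >= ty && t.minY < ty + TILE;
    }

    // Simplified stand-in for real rasterization: fill the clipped bounding box, depth-tested.
    static void rasterize(const Triangle& t, TileBuffer& tile, int tx, int ty) {
        int x0 = std::max(tx, (int)t.minX), x1 = std::min(tx + TILE, (int)t.maxX + 1);
        int y0 = std::max(ty, (int)t.minY), y1 = std::min(ty + TILE, (int)t.maxY + 1);
        for (int y = y0; y < y1; ++y)
            for (int x = x0; x < x1; ++x) {
                int i = (y - ty) * TILE + (x - tx);
                if (t.z < tile.depth[i]) { tile.depth[i] = t.z; tile.color[i] = t.color; }
            }
    }

    void renderFrame(const std::vector<Triangle>& tris, std::vector<uint32_t>& framebuffer) {
        for (int ty = 0; ty < HEIGHT; ty += TILE)
            for (int tx = 0; tx < WIDTH; tx += TILE) {
                TileBuffer tile;                                   // "fill": reset the on-chip buffers
                for (int i = 0; i < TILE * TILE; ++i) { tile.color[i] = 0; tile.depth[i] = 1.0f; }
                for (const Triangle& t : tris)                     // bin: only triangles touching this tile
                    if (overlapsTile(t, tx, ty))
                        rasterize(t, tile, tx, ty);                // shade/depth-test entirely on chip
                for (int y = 0; y < TILE && ty + y < HEIGHT; ++y)  // flush: write the finished tile once
                    for (int x = 0; x < TILE && tx + x < WIDTH; ++x)
                        framebuffer[(ty + y) * WIDTH + (tx + x)] = tile.color[y * TILE + x];
            }
    }

    int main() {
        std::vector<uint32_t> fb(WIDTH * HEIGHT, 0);
        std::vector<Triangle> tris = { { 100, 100, 400, 300, 0.5f, 0xFF00FF00u } };
        renderFrame(tris, fb);
        return 0;
    }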


    Thanks to NVIDIA's public acknowledgement of the use of tile-based rendering starting with its Maxwell architecture, some design decisions in Maxwell now make much more sense. Below is a screenshot taken from NVIDIA's "5 Things You Should Know About the New Maxwell GPU Architecture". Take a look at the L2 cache size: from Kepler to Maxwell, it increased 8x, from 256 KB to 2048 KB. We can now attribute this gigantic leap to the need for an L2 cache large enough to hold the working set of the tile-based rasterization process, which is what allowed NVIDIA the jump in memory performance and power efficiency that Maxwell achieved over its Kepler predecessor. Incidentally, NVIDIA's GP102 chip (which powers the GTX Titan X and the upcoming, recently announced GTX 1080 Ti) doubles that amount of L2 cache again, to a staggering 4096 KB. Whether Volta will continue scaling up the L2 cache remains to be seen, but I've seen worse bets.

    [Screenshot: NVIDIA's "5 Things You Should Know About the New Maxwell GPU Architecture" - L2 cache size]
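    As a rough sanity check on those numbers - assuming, purely for illustration, 4 bytes of color plus 4 bytes of depth per pixel (real formats, MSAA and compression will change the figures) - a 16×16 tile works out to about 2 KB and a 32×32 tile to about 8 KB, so plenty of in-flight tiles fit comfortably in a 2048 KB L2:

    // Back-of-the-envelope tile footprint vs. L2 size. The 8 bytes/pixel figure
    // (RGBA8 color + float depth) is an assumption for illustration only.
    #include <cstdio>

    int main() {
        const int bytesPerPixel = 4 /* color */ + 4 /* depth */;
        const int tileSizes[]   = { 16, 32 };                      // usual tile edges, in pixels
        const int l2SizesKB[]   = { 256, 2048, 4096 };             // Kepler, Maxwell, GP102

        for (int t : tileSizes) {
            const int tileBytes = t * t * bytesPerPixel;           // 16x16 -> 2 KB, 32x32 -> 8 KB
            for (int l2 : l2SizesKB)
                std::printf("%2dx%-2d tile = %d KB; %4d KB of L2 holds ~%d such tiles\n",
                            t, t, tileBytes / 1024, l2, (l2 * 1024) / tileBytes);
        }
        return 0;
    }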

    An interesting tangent: the Xbox 360's eDRAM and the Xbox One's ESRAM (paired with ATI/AMD-designed GPUs, no less) can serve as a substitute of sorts for the tile-based rasterization process that Maxwell and later NVIDIA GPUs employ.

    Tile-based rendering seems to have been a key part of NVIDIA's secret sauce for achieving the impressive performance-per-watt ratings of its last two architectures, and their approach to this rendering mode is expected to only improve with time. Some differences can already be seen between Maxwell's and Pascal's tile-based rendering, with the former dividing the scene into triangles and the latter breaking it up into squares or vertical rectangles as needed, which means NVIDIA has in fact put some measure of work into the rendering system between these two architectures.

    Perhaps we have already seen some seeds of tile-based rendering in AMD's Vega architecture sneak peek, particularly with regards to its next-generation Pixel Engine: the render back-ends are now clients of the L2 cache, replacing the non-coherent memory access of previous architectures, in which the pixel engine wrote through the memory controller. This could be AMD's way of tackling the same problem, with its new-generation draw-stream binning rasterizer supposedly helping to conserve clock cycles while simultaneously improving on-die cache locality and reducing memory footprint.


    David Kanter, of Real World Tech, has a pretty interesting YouTube video in which he goes into some depth on NVIDIA's tile-based approach, which you can check out if you're interested.



    Source: NVIDIA Devblogs, Real World Tech
     
    Last edited: Mar 1, 2017
    Kanan, erixx and Fluffmeister say thanks.
  2. TheLostSwede

    TheLostSwede

    Joined:
    Nov 11, 2004
    Messages:
    1,792 (0.38/day)
    Thanks Received:
    891
    Location:
    Formosa
    Welcome to 2001 Nvidia...
    At least it's good to see that they're finally catching up.
     
    10 Year Member at TPU
  3. Ferrum Master

    Ferrum Master

    Joined:
    Nov 18, 2010
    Messages:
    3,697 (1.50/day)
    Thanks Received:
    2,159
    Location:
    Rīga, Latvia
    Kyro again.
     
    Crunching for Team TPU
  4. Nokiron

    Nokiron

    Joined:
    Feb 16, 2012
    Messages:
    354 (0.18/day)
    Thanks Received:
    227
    Location:
    Sweden
    If Nvidia is welcomed to 2001 with tile-based rasterization, where does this leave AMD? 1995?
     
    1c3d0g and Fluffmeister say thanks.
  5. sutyi

    sutyi

    Joined:
    Nov 20, 2012
    Messages:
    92 (0.05/day)
    Thanks Received:
    60
    Location:
    Budaörs, Hungary
    PowerVR lawsuit coming in 3... 2...
     
  6. londiste

    Joined:
    Feb 3, 2017
    Messages:
    118 (0.60/day)
    Thanks Received:
    33
    no it isn't.

    the way it is done by nvidia (and amd) is different enough that the original set of patents is unlikely to cover any of this.

    also, a lot of mobile gpus do tiled rendering and there has not really been a wave of lawsuits.
     
  7. Brusfantomet

    Joined:
    Mar 23, 2012
    Messages:
    663 (0.34/day)
    Thanks Received:
    224
    Location:
    Norway
    Lots of mobile GPUs are based on the PowerVR design
     
  8. londiste

    Joined:
    Feb 3, 2017
    Messages:
    118 (0.60/day)
    Thanks Received:
    33
    yup, powervr is significant in mobile space.
    but theirs aren't the only gpus doing tiles - perhaps most notably, arm's mali and qualcomm's adreno both do tile-based rendering.
     
  9. TheLostSwede

    TheLostSwede

    Joined:
    Nov 11, 2004
    Messages:
    1,792 (0.38/day)
    Thanks Received:
    891
    Location:
    Formosa
    PowerVR did tile-based rendering with their Kyro chips around 2001. AMD has, as far as I'm aware, been doing some form of tile-based rendering for quite some time.
     
    Prima.Vera says thanks.
    10 Year Member at TPU
  10. Nokiron

    Nokiron

    Joined:
    Feb 16, 2012
    Messages:
    354 (0.18/day)
    Thanks Received:
    227
    Location:
    Sweden
    I don't think AMD has ever used it in a desktop-product. The Adreno-products did.
     
    1c3d0g says thanks.
  11. Solidstate89

    Solidstate89

    Joined:
    May 29, 2012
    Messages:
    406 (0.21/day)
    Thanks Received:
    135
    You would be wrong. That makes your nVidia comment all the more hilarious, as they are the first manufacturer to implement this on GPU chips outside of the mobile environment.
     
    1c3d0g says thanks.
  12. Steevo

    Steevo

    Joined:
    Nov 4, 2005
    Messages:
    9,817 (2.28/day)
    Thanks Received:
    2,269
    Your special kid.

    "An interesting tangent: the Xbox 360 and Xbox One ESRAM chips (running on AMD-architectured GPUs, no less) can make for a substitute for the tile-based rasterization process that post-Maxwell NVIDIA GPUs employ."
     
    10 Year Member at TPU 10 Million points folded for TPU
  13. ZoneDymo

    ZoneDymo

    Joined:
    Feb 11, 2009
    Messages:
    1,607 (0.52/day)
    Thanks Received:
    565
    So is this more proof that these GPU makers are sitting on a bunch of tech they COULD put in their new GPUs and send us light years ahead, but don't, because feeding it to the public in piecemeal portions means more money?
     
  14. Nokiron

    Nokiron

    Joined:
    Feb 16, 2012
    Messages:
    354 (0.18/day)
    Thanks Received:
    227
    Location:
    Sweden
    That's not proper tile-based rasterization though. And as it says, it's a substitute, which is way slower.
     
    Solidstate89 says thanks.
  15. Solidstate89

    Solidstate89

    Joined:
    May 29, 2012
    Messages:
    406 (0.21/day)
    Thanks Received:
    135
    You're**
     
    Steevo says thanks.
  16. TheLostSwede

    TheLostSwede

    Joined:
    Nov 11, 2004
    Messages:
    1,792 (0.38/day)
    Thanks Received:
    891
    Location:
    Formosa
    Right, I guess they did it for mobile, but never desktop. Interesting. Trident did it (not sure the chip ever went into mass production though), but I guess no-one remembers them any more...
    It also looks like PowerVR did it all the way back in 1996 when they started doing GPUs.

    So welcome to 1996 Nvidia...
     
    10 Year Member at TPU
  17. prtskg

    Joined:
    Feb 25, 2016
    Messages:
    58 (0.11/day)
    Thanks Received:
    27
    Where is the slower part written? Maybe I need to read the article again!
     
  18. Nokiron

    Nokiron

    Joined:
    Feb 16, 2012
    Messages:
    354 (0.18/day)
    Thanks Received:
    227
    Location:
    Sweden
    The ESRAM in the Xbox One is inherently extremely slow compared to the low-level cache found in a desktop GPU. That should really speak for itself.

    They didn't say the last part though, I did. Not a native speaker.
     
    rtwjunkie says thanks.
  19. Fluffmeister

    Fluffmeister

    Joined:
    Dec 22, 2011
    Messages:
    1,964 (0.95/day)
    Thanks Received:
    1,092
    No comment. :D
     
    Steevo says thanks.
  20. efikkan

    Joined:
    Jun 10, 2014
    Messages:
    655 (0.56/day)
    Thanks Received:
    281
    Tiled rendering is one of several techniques that help improve the efficiency of Maxwell/Pascal.
    Generally it gives a few great benefits (rough numbers are sketched below):
    - Tiles are rendered completely, instead of the screen having each pixel rendered partially several times. This saves the data from taking several round trips between the GPU and memory, which saves a lot of memory bandwidth.
    - Lower risk of data hazards (multiple sections needing the same texture), so fewer stalls, improving GPU efficiency.
    - Better cache locality, reducing stalls and again improving GPU efficiency.
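    A back-of-the-envelope illustration of that first point - the overdraw factor and per-pixel byte counts below are assumptions for the example, not measured figures:

    // Rough framebuffer DRAM traffic estimate: immediate-mode vs. tiled.
    // Assumptions: 1080p, 4 bytes color + 4 bytes depth per pixel, average
    // overdraw of 3, and a tiled path that writes each finished pixel once.
    #include <cstdio>

    int main() {
        const double pixels   = 1920.0 * 1080.0;
        const double bpp      = 4 + 4;   // color + depth (assumed)
        const double overdraw = 3.0;     // assumed average fragments per pixel

        // Immediate mode: each overdrawn fragment can touch color + depth in DRAM.
        const double immediateMB = pixels * bpp * overdraw / (1024 * 1024);
        // Tiled: depth test/blend stay on chip; only the final color is written out.
        const double tiledMB     = pixels * 4 / (1024 * 1024);

        std::printf("Immediate-mode framebuffer traffic: ~%.0f MB per frame\n", immediateMB);
        std::printf("Tiled framebuffer traffic:          ~%.0f MB per frame\n", tiledMB);
        return 0;
    }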

    BTW, I recommend watching the YouTube video referenced in the article; its visuals are good, so even the non-programmers among you should be able to get the idea.
     
  21. erixx

    erixx

    Joined:
    Mar 24, 2010
    Messages:
    4,482 (1.66/day)
    Thanks Received:
    883
    Location:
    Dutch in Spain
    Then don't forget the good ole voxels! (Efficient Sparse Voxel Octrees) Novalogic was so futuristic with that.
     
  22. Steevo

    Steevo

    Joined:
    Nov 4, 2005
    Messages:
    9,817 (2.28/day)
    Thanks Received:
    2,269
    Gotta love auto correct on mobile devices.

    The PS4 is a regular chip with caches, almost identical to the Xbox One, just with GDDR instead of ESRAM. They are relatively quick, mostly held back by CPU cores that were not Zen.

    It's been known for a while, at least since August of last year.

    https://www.extremetech.com/gaming/...ets-of-nvidia-maxwell-pascal-power-efficiency


    It's possible that tile-based rendering explains some of the artifacts they produce when running certain effects.
     
    Last edited: Mar 1, 2017
    10 Year Member at TPU 10 Million points folded for TPU
  23. Super XP

    Super XP

    Joined:
    Mar 23, 2005
    Messages:
    2,985 (0.66/day)
    Thanks Received:
    607
    Location:
    Ancient Greece, Acropolis (Time Lord)
    It leaves AMD ahead of the game, in 2018.
     
    10 Year Member at TPU
  24. Kanan

    Kanan

    Joined:
    Aug 22, 2015
    Messages:
    2,731 (3.75/day)
    Thanks Received:
    1,471
    Location:
    Europe
    Don't forget the epic voxel game "Outcast" - it was processed completely on the CPU at a time when 3D graphics were the newest and greatest shit :D
     
    erixx says thanks.
