
NVIDIA "TU102" RT Core and Tensor Core Counts Revealed

btarunr

Editor & Senior Moderator
The GeForce RTX 2080 Ti is indeed based on an ASIC codenamed "TU102." NVIDIA was referring to this 775 mm² chip when talking about the 18.5 billion-transistor count in its keynote. The company also provided a breakdown of its various "cores" and a block diagram. The GPU is still laid out like its predecessors, but each of the 72 streaming multiprocessors (SMs) packs RT cores and Tensor cores in addition to CUDA cores.

The TU102 features six GPCs (graphics processing clusters), each packing 12 SMs. Each SM contains 64 CUDA cores, 8 Tensor cores, and 1 RT core, and each GPC includes six geometry units. The GPU also features 288 TMUs and 96 ROPs. The TU102 supports a 384-bit-wide GDDR6 memory bus running at 14 Gbps. There are also two NVLink channels, which NVIDIA plans to launch later as its next-generation multi-GPU technology.
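
As a quick sanity check, here is how the headline totals fall out of those per-SM counts (a minimal sketch in Python; the 4-TMUs-per-SM figure is derived from the totals above, and the ROP count being tied to the memory partitions rather than the SMs is an assumption):

gpcs, sms_per_gpc = 6, 12
sms = gpcs * sms_per_gpc              # 72 SMs

cuda_cores   = sms * 64               # 4608 CUDA cores
tensor_cores = sms * 8                # 576 Tensor cores
rt_cores     = sms * 1                # 72 RT cores
tmus         = sms * 4                # 288 TMUs (4 per SM, derived)
rops         = 96                     # quoted directly, tied to the memory partitions

# 384-bit GDDR6 at 14 Gbps per pin:
bandwidth_gb_s = 384 / 8 * 14         # 672 GB/s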



View at TechPowerUp Main Site
 
Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA cores
544 Tensor cores
68 RT cores
272 TMUs
88 ROPs
68 SMs

All this is, is the full-chip spec for the die, not the 2080 Ti.
@btarunr
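
For what it's worth, those numbers line up with a TU102 that has 68 of its 72 SMs enabled (the 88 ROPs then follow from one trimmed 32-bit memory partition). A quick sketch, assuming the per-SM counts from the article:

active_sms = 68
print(active_sms * 64)   # 4352 CUDA cores
print(active_sms * 8)    # 544 Tensor cores
print(active_sms * 1)    # 68 RT cores
print(active_sms * 4)    # 272 TMUs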
 
I'm assuming "geometry units" are PolyMorph Engines?
2080 Ti = GTX 570 (in terms of cut-downs).
 
They are going to be milking variants of this chip for a few years, I'm sure.
 
They are going to be milking variants of this chip for a few years, I'm sure.


We can still expect a Titan Series for a hefty price of around $3000.
 
Any info on async compute on the Turing arch?
It should do async really well. Steve from Gamers Nexus said it's very asynchronous in nature, and this graph confirms it. The only game that hits over 1.5x performance is one with async (Wolfenstein; also disregard the last two with HDR, that's more like regained performance that Pascal lost in HDR).

[Image: NVIDIA's RTX 2080 vs. GTX 1080 relative performance chart]
 
They are going to be milking variants of this chip for a few years, I'm sure.

Milk it, bilk it, or squeak it... take your pick and base your ROI on that assumption...

capitalism 101 at its finest :D
 
This graph does not confirm anything.
Bravo, way to take my words out of context :ohwell:

Even if the absolute numbers don't apply, you can see that Turing does something better in one game than it does in others. That's what I meant, but you took four words and made a fuss.


ridicules.
And who is this Ridicules? Sounds like a mentally challenged brother of Hercules.
 
If you look at that graph it tells quite a lot.

1. DLSS was never benched on Pascal.
2. The entire DLSS green bar is non-existent performance on Pascal, re-benched on Pascal. It's like taking the steering wheel of a car and telling you it doesn't drive without an engine, but the complete car (Turing) does. Just like that sad 'Turing = 5x Pascal' statement when it comes to RT performance.
3. Conclusion: take the Shadow of the Tomb Raider bar for a realistic performance scenario of 1080 vs. 2080. Give or take 30-35%. In other words, you're better off upgrading to a 1080 Ti.

You can find more hints and confirmation of a 30-odd-percent jump when you compare clocks and shader counts between the 1080 and 2080 as well.
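
A rough back-of-the-envelope, assuming Founders Edition boost clocks (1733 MHz for the GTX 1080, 1800 MHz for the RTX 2080) and a purely hypothetical per-clock gain for Turing on top:

gtx1080_shaders, gtx1080_mhz = 2560, 1733
rtx2080_shaders, rtx2080_mhz = 2944, 1800

raw_ratio = (rtx2080_shaders * rtx2080_mhz) / (gtx1080_shaders * gtx1080_mhz)
print(round(raw_ratio, 2))          # ~1.19x from shaders x clock alone
print(round(raw_ratio * 1.12, 2))   # ~1.34x with an assumed ~12% per-clock improvement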

Thank me later ;)
 
Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA cores
544 Tensor cores
68 RT cores
272 TMUs
88 ROPs
68 SMs

All this is, is the full-chip spec for the die, not the 2080 Ti.
@btarunr

You can still get a "fully enabled" 2080 Ti by purchasing a Quadro RTX 6000.

You would get:
- no cut-down chip, everything enabled
- GPU RDMA (does it still make sense with NVLink and CPUs like Skylake Xeon / Epyc which have separate SRIO?)
- more VRAM, plus ECC
- some more OpenGL extensions, including custom high-performance ones (does that still make sense with Vulkan + RTX?)
- 4 DisplayPorts (the best feature IMHO, vs. 3 DisplayPorts + kinky HDMI)

My bet is there will be a Turing Titan (or Titan Turing) that will be a good compromise, trading features most people will not need for a more decent price.
 
Nice read on AnandTech.

Nv's arrogance is going to cost them a gen.

 
Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA cores
544 Tensor cores
68 RT cores
272 TMUs
88 ROPs
68 SMs

All this is, is the full-chip spec for the die, not the 2080 Ti.
@btarunr

Indeed. These are TU102 specs, or "Big Turing," from which they cut the Quadro RTX 8000, Quadro RTX 6000, and the RTX 2080 Ti. This also raises the question of whether a couple of $699 RTX 2080 cards with 8 GB each and an ($80) NVLink bridge will murder the Quadro RTX 5000, as they match it for memory but deliver a lot more cores (CUDA, RT, or Tensor, your pick).
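
Taking published specs at face value (RTX 2080: 2944 CUDA / 368 Tensor / 46 RT cores and 8 GB; Quadro RTX 5000: 3072 / 384 / 48 and 16 GB), a rough tally of the two-card idea; note that NVLink doesn't literally pool two GPUs into one, so the sums are a best case:

rtx2080 = {"cuda": 2944, "tensor": 368, "rt": 46, "vram_gb": 8}
quadro_rtx_5000 = {"cuda": 3072, "tensor": 384, "rt": 48, "vram_gb": 16}

# Double the single-card figures for the hypothetical NVLink pair.
two_2080s = {key: 2 * value for key, value in rtx2080.items()}
for key in two_2080s:
    print(key, "| 2x RTX 2080:", two_2080s[key], "| Quadro RTX 5000:", quadro_rtx_5000[key])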
 
It should do async really well. Steve from Gamers Nexus said it's very asynchronous in nature, and this graph confirms it. The only game that hits over 1.5x performance is one with async (Wolfenstein; also disregard the last two with HDR, that's more like regained performance that Pascal lost in HDR).

[Image: NVIDIA's RTX 2080 vs. GTX 1080 relative performance chart]
The graph means nothing when there is no setup reference for it.
I probably missed it... I don't remember NV mentioning async compute...

Oh well, just have to wait a little longer, I guess.
 
If you look at that graph it tells quite a lot.

1. DLSS was never benched on Pascal.
2. The entire DLSS green bar is non-existent performance on Pascal, re-benched on Pascal.

Yeah, that's baffling. Is DLSS going to be available on Pascal? If so, what the hell is the point of all that die space occupied by Tensor cores? I mean, with all of those 110 TFLOPS I would expect something like DLSS to simply not be feasible on Pascal, but it turns out it can do it just fine, albeit slower.
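
That ~110 TFLOPS figure roughly follows from the Tensor core count if you assume each Tensor core does 64 FP16 FMAs (128 FLOPs) per clock at around the 2080 Ti's boost clock; a sketch with assumed numbers:

tensor_cores = 544
flops_per_core_per_clock = 128      # assumed: 64 FMA = 128 FLOPs per Tensor core per clock
boost_ghz = 1.545                   # assumed reference boost clock

tflops_fp16 = tensor_cores * flops_per_core_per_clock * boost_ghz / 1000
print(round(tflops_fp16))           # ~108 TFLOPS FP16, in the ballpark of the ~110 quoted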
 
Full potential of ray tracing unleashed:
[image attachment]
 
Full potential of ray tracing unleashed:
[image attachment]

Impressive! RTX on is photo realistic.

Yeah, we have been here before, big whoop.
 
This graph does not confirm anything.
Not the exact settings used, not the locations in game.
It is extremely vague and should be disregarded.

The "X2 1080" claim in the title is ridicules.

Uh... duh. It does sort of show his point though. It does fine at async.

And who is this Ridicules? Sounds like a mentally challenged brother of Hercules.

And not going to lie, I lol'd

Is DLSS going to be available on Pascal

I imagine in software it will be, yes. It will perform like shit, but it will make you want a Turing.
 
If you look at that graph it tells quite a lot.

1. DLSS was never benched on Pascal.
2. The entire DLSS green bar is non-existent performance on Pascal, re-benched on Pascal. It's like taking the steering wheel of a car and telling you it doesn't drive without an engine, but the complete car (Turing) does. Just like that sad 'Turing = 5x Pascal' statement when it comes to RT performance.
3. Conclusion: take the Shadow of the Tomb Raider bar for a realistic performance scenario of 1080 vs. 2080. Give or take 30-35%. In other words, you're better off upgrading to a 1080 Ti.

You can find more hints and confirmation of a 30-odd-percent jump when you compare clocks and shader counts between the 1080 and 2080 as well.

Thank me later ;)

This exactly. It looks like they theoretically calculated the performance a 1080 would have with DLSS in those games and then compared it to the 2080, showing 2x only because with DLSS enabled the 1080 drops so much that the 2080 ends up twice as fast. Something like that reminds me of when AMD showed their Ryzen vs. an i7 in Dota 2 while streaming with OBS, showing the i7 stuttering like hell only because they cranked up the OBS settings, a cherry-picked scenario.
 
Yeah, that's baffling. Is DLSS going to be available on Pascal? If so, what the hell is the point of all that die space occupied by Tensor cores? I mean, with all of those 110 TFLOPS I would expect something like DLSS to simply not be feasible on Pascal, but it turns out it can do it just fine, albeit slower.

No, DLSS needs Tensor cores to work. You might want to look at NVIDIA NGX, which DLSS is part of:

NVIDIA NGX is a new deep learning powered technology stack bringing AI-based features that accelerate and enhance graphics, photos, imaging and video processing directly into applications. NVIDIA NGX features utilize Tensor Cores to maximize the efficiency of their operation, and require an RTX-capable GPU. The NGX SDK makes it easy for developers to integrate AI features into their application with pre-trained networks.
 