Thursday, August 23rd 2018

NVIDIA "TU102" RT Core and Tensor Core Counts Revealed

The GeForce RTX 2080 Ti is indeed based on an ASIC codenamed "TU102." NVIDIA was referring to this 754 mm² chip when talking about the 18.6 billion-transistor count in its keynote. The company also provided a breakdown of its various "cores," and a block diagram. The GPU is still laid out like its predecessors, but each of the 72 streaming multiprocessors (SMs) packs RT cores and Tensor cores in addition to CUDA cores.

The TU102 features six GPCs (graphics processing clusters), each of which packs 12 SMs. Each SM packs 64 CUDA cores, 8 Tensor cores, and 1 RT core. Each GPC also packs six geometry units. The GPU features 288 TMUs and 96 ROPs in total, and supports a 384-bit wide GDDR6 memory bus running at 14 Gbps. There are also two NVLink channels, which NVIDIA plans to launch later as the basis of its next-generation multi-GPU technology.
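As a quick sanity check, here is a minimal Python sketch of the totals implied by the per-SM breakdown above. The full-die figures follow from the article's numbers; the RTX 2080 Ti cut-down (68 of 72 SMs enabled) matches the counts quoted in the comments below, and the per-SM TMU count of four is inferred from 288 TMUs across 72 SMs.

    # Aggregate TU102 core counts from the per-SM breakdown in the article.
    GPCS = 6
    SMS_PER_GPC = 12
    CUDA_PER_SM = 64
    TENSOR_PER_SM = 8
    RT_PER_SM = 1
    TMUS_PER_SM = 4  # inferred: 288 TMUs / 72 SMs

    def totals(sms: int) -> dict:
        """Core counts for a TU102-based part with `sms` enabled SMs."""
        return {
            "CUDA cores": sms * CUDA_PER_SM,
            "Tensor cores": sms * TENSOR_PER_SM,
            "RT cores": sms * RT_PER_SM,
            "TMUs": sms * TMUS_PER_SM,
        }

    # Full die: 4608 CUDA, 576 Tensor, 72 RT, 288 TMUs
    print(totals(GPCS * SMS_PER_GPC))
    # RTX 2080 Ti (68 SMs): 4352 CUDA, 544 Tensor, 68 RT, 272 TMUs
    print(totals(68))
    # ROPs scale with the memory interface instead: 96 on the full die.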
Source: VideoCardz

65 Comments on NVIDIA "TU102" RT Core and Tensor Core Counts Revealed

#1
T4C Fantasy
CPU & GPU DB Maintainer
Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA
544 Tensor
68 RT
272 TMUs
88 ROPs
68 SMs

All this is is the full-chip spec for the die, not the 2080 Ti.
@btarunr
#2
agent_x007
I'm assuming "geometry units" are PolyMorph Engines?
2080 Ti = GTX 570 (in terms of cut-downs).
#3
dj-electric
T4C Fantasy: Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA
544 Tensor
68 RT
272 TMUs
88 ROPs
68 SMs

All this is is the full-chip spec for the die, not the 2080 Ti.
Does the full uarch have 5120 CUDA cores?
#4
T4C Fantasy
CPU & GPU DB Maintainer
dj-electric: Does the full uarch have 5120 CUDA cores?
4608
#5
natr0n
They are going to be milking variants of this chip for a few years, I'm sure.
#6
DeOdView
Any info on async compute on the Turing arch?
#7
mcraygsx
natr0n: They are going to be milking variants of this chip for a few years, I'm sure.
We can still expect a Titan Series for a hefty price of around $3000.
#9
cucker tarlson
DeOdView: Any info on async compute on the Turing arch?
It should do async really well. Steve from Gamers Nexus said it's very asynchronous in its nature, and this graph confirms it. The only game that hits over 1.5x performance is one with async (Wolfenstein; also disregard the last two with HDR, as that is more like regaining performance that Pascal lost in HDR).

#10
dj-electric
cucker tarlson: ...and this graph confirms it
This graph does not confirm anything.
We don't know the exact settings used, nor the in-game locations.
It is extremely vague and should be disregarded.

The "X2 1080" claim in the title is ridicules.
#11
bonehead123
natr0n: They are going to be milking variants of this chip for a few years, I'm sure.
milk it, bilk it, or squeelk it... take your pick and base your ROI on that assumption...

capitalism 101 at its finest :D
#12
cucker tarlson
dj-electric: This graph does not confirm anything.
Bravo, way to take my words out of context :ohwell:

Even if the total numbers don't apply, you can see that Turing does something better in one game than it does in others. That's what I meant, but you took four words and made a fuss.
dj-electric: ridicules.
And who is that Ridicules? Sounds like a mentally challenged brother of Hercules.
#13
Vayra86
If you look at that graph, it tells quite a lot.

1. DLSS was never benched on Pascal.
2. The entire DLSS green bar represents performance that is non-existent on Pascal. It's like taking the steering wheel of a car and telling you it doesn't drive without an engine, but the complete car (Turing) does. Just like that sad 'Turing = 5X Pascal' statement when it comes to RT performance.
3. Conclusion: take the Shadow of the Tomb Raider bar for a realistic performance scenario of 1080 vs 2080. Give or take 30-35%. In other words, you're better off upgrading to a 1080 Ti.

You can find more hints of a 30-odd-percent jump when you compare clocks and shader counts between the 1080 and 2080 as well; see the rough math below.

Thank me later ;)
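A rough sketch of that clock-and-shader comparison. The shader counts are the official specs and the clocks are Founders Edition boost figures; the FMA-based FLOPS formula is a standard assumption, none of it from this article. Raw FP32 throughput alone yields roughly 19%, so the remainder of any in-game gap would have to come from Turing's architectural changes.

    def fp32_tflops(cuda_cores: int, boost_mhz: int) -> float:
        # 2 FLOPs per core per clock (one fused multiply-add)
        return cuda_cores * boost_mhz * 2 / 1e6

    gtx_1080 = fp32_tflops(2560, 1733)  # ~8.9 TFLOPS
    rtx_2080 = fp32_tflops(2944, 1800)  # ~10.6 TFLOPS
    print(f"Raw FP32 uplift: {rtx_2080 / gtx_1080 - 1:.0%}")  # ~19%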
#14
Avlin
T4C Fantasy: Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA
544 Tensor
68 RT
272 TMUs
88 ROPs
68 SMs

All this is is the full-chip spec for the die, not the 2080 Ti.
@btarunr
You can still get a "fully enabled" 2080 Ti by purchasing a Quadro RTX 6000.

You would get:
- no cut-down chip, everything enabled
- GPU RDMA (does it still make sense with NVLink and CPUs like Skylake Xeon/EPYC, which have separated SRIO?)
- more VRAM, and also ECC
- some more OpenGL extensions, with custom high-performance extensions (does it still make sense with Vulkan + RTX?)
- 4 DisplayPorts (the best feature IMHO, vs. 3 DisplayPorts + kinky HDMI)

My bet is there will be a Turing Titan or a Titan Turing, which will be a good compromise between features most people will not need and a decent price.
#15
Vayra86
Nice read on Anandtech

Nv's arrogance is going to cost them a gen.

#16
LemmingOverlord
T4C Fantasy: Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA
544 Tensor
68 RT
272 TMUs
88 ROPs
68 SMs

All this is is the full-chip spec for the die, not the 2080 Ti.
@btarunr
Indeed. These are the TU102 specs, or "Big Turing," from which they cut the Quadro RTX 8000, Quadro RTX 6000, and the RTX 2080 Ti. This also raises the question of whether a couple of $699 RTX 2080 cards with 8 GB and an ($80) NVLink bridge will murder the Quadro RTX 5000, as they match it for memory but deliver a lot more cores (CUDA, RT, or Tensor, your pick).
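A paper comparison behind that question, as a sketch. The spec figures here are the commonly reported ones and are assumptions on my part, not from the article: the Quadro RTX 5000 is a full TU104, the RTX 2080 a cut-down one.

    quadro_rtx_5000 = {"CUDA": 3072, "Tensor": 384, "RT": 48, "VRAM_GB": 16}
    rtx_2080 = {"CUDA": 2944, "Tensor": 368, "RT": 46, "VRAM_GB": 8}

    # Two 2080s over NVLink: the core counts add up, but the memory only
    # "matches" the Quadro if the workload can split across the two 8 GB pools.
    dual_2080 = {k: 2 * v for k, v in rtx_2080.items()}
    for key in quadro_rtx_5000:
        print(f"{key}: 2x RTX 2080 = {dual_2080[key]} vs Quadro RTX 5000 = {quadro_rtx_5000[key]}")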
#17
DeOdView
cucker tarlson: It should do async really well. Steve from Gamers Nexus said it's very asynchronous in its nature, and this graph confirms it. The only game that hits over 1.5x performance is one with async (Wolfenstein; also disregard the last two with HDR, as that is more like regaining performance that Pascal lost in HDR).

The graph means nothing when there is no setup reference on it.
I probably missed it... I don't remember NV mentioning async compute...

Oh well, we just have to wait a little longer, I guess.
#18
Vya Domus
Vayra86: If you look at that graph, it tells quite a lot.

1. DLSS was never benched on Pascal.
2. The entire DLSS green bar represents performance that is non-existent on Pascal.
Yeah, that's baffling. Is DLSS going to be available on Pascal? If so, what the hell is the point of all that die space occupied by Tensor cores? I mean, with all of those 110 TFLOPS, I would expect something like DLSS to simply not be feasible on Pascal, but it turns out it can do it just fine, albeit slower.
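For reference, a sketch of where that ~110 TFLOPS figure plausibly comes from; the 64-FMA-per-clock Tensor core rate and the 2080 Ti boost clock are assumptions on my part, not from the article.

    TENSOR_CORES = 544             # RTX 2080 Ti: 68 SMs x 8
    FLOPS_PER_CORE_CLOCK = 64 * 2  # 64 FP16 FMAs per clock, 2 FLOPs each
    BOOST_HZ = 1545e6              # assumed reference boost clock

    tensor_tflops = TENSOR_CORES * FLOPS_PER_CORE_CLOCK * BOOST_HZ / 1e12
    print(f"{tensor_tflops:.0f} TFLOPS FP16")  # ~108, in line with the ~110 quoted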
#19
krykry
Full potential of raytracing unleashed:
#20
Durvelle27
krykry: Full potential of raytracing unleashed:
:roll:
#21
Fluffmeister
krykry: Full potential of raytracing unleashed:
Impressive! RTX on is photorealistic.

Yeah, we have been here before, big whoop.
#22
R-T-B
dj-electric: This graph does not confirm anything.
We don't know the exact settings used, nor the in-game locations.
It is extremely vague and should be disregarded.

The "X2 1080" claim in the title is ridicules.
Uh... duh. It does sort of show his point though. It does fine at async.
cucker tarlson: And who is that Ridicules? Sounds like a mentally challenged brother of Hercules.
Not going to lie, I lol'd.
Vya Domus: Is DLSS going to be available on Pascal?
I imagine in software it will be, yes. It will perform like shit, but it will make you want a Turing.
#23
oxidized
Vayra86: If you look at that graph, it tells quite a lot.

1. DLSS was never benched on Pascal.
2. The entire DLSS green bar represents performance that is non-existent on Pascal. It's like taking the steering wheel of a car and telling you it doesn't drive without an engine, but the complete car (Turing) does. Just like that sad 'Turing = 5X Pascal' statement when it comes to RT performance.
3. Conclusion: take the Shadow of the Tomb Raider bar for a realistic performance scenario of 1080 vs 2080. Give or take 30-35%. In other words, you're better off upgrading to a 1080 Ti.

You can find more hints of a 30-odd-percent jump when you compare clocks and shader counts between the 1080 and 2080 as well.

Thank me later ;)
Exactly this. It looks like they theoretically calculated the performance a 1080 would have with DLSS in those games and then compared it to the 2080, showing 2x only because with DLSS enabled the 1080 drops so much that the 2080 ends up 2x faster. Something like that reminds me of when AMD showed their Ryzen vs. an i7 in Dota 2 while streaming with OBS, showing the i7 stuttering like hell only because they cranked up the settings on OBS; a cherry-picked scenario.
#24
jabbadap
Vya Domus: Yeah, that's baffling. Is DLSS going to be available on Pascal? If so, what the hell is the point of all that die space occupied by Tensor cores? I mean, with all of those 110 TFLOPS, I would expect something like DLSS to simply not be feasible on Pascal, but it turns out it can do it just fine, albeit slower.
No, DLSS needs Tensor cores to work. You might want to read about NVIDIA NGX, to which DLSS belongs:
NVIDIA NGX is a new deep learning powered technology stack bringing AI-based features that accelerate and enhance graphics, photos, imaging and video processing directly into applications. NVIDIA NGX features utilize Tensor Cores to maximize the efficiency of their operation, and require an RTX-capable GPU. The NGX SDK makes it easy for developers to integrate AI features into their application with pre-trained networks.
#25
Vayra86
krykry: Full potential of raytracing unleashed:
Post of the day