Thursday, August 23rd 2018

NVIDIA "TU102" RT Core and Tensor Core Counts Revealed

The GeForce RTX 2080 Ti is indeed based on an ASIC codenamed "TU102." NVIDIA was referring to this 754 mm² chip when talking about the 18.6 billion-transistor count in its keynote. The company also provided a breakdown of its various "cores," and a block diagram. The GPU is still laid out like its predecessors, but each of the 72 streaming multiprocessors (SMs) packs RT cores and Tensor cores in addition to CUDA cores.

The TU102 features six GPCs (graphics processing clusters), each of which packs 12 SMs. Each SM packs 64 CUDA cores, 8 Tensor cores, and 1 RT core. Each GPC also packs six geometry units. The GPU features 288 TMUs and 96 ROPs in total, and supports a 384-bit wide GDDR6 memory bus running at 14 Gbps. There are also two NVLink channels, which NVIDIA plans to launch later as the basis of its next-generation multi-GPU technology.
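As a quick sanity check, here is a minimal Python sketch of the totals implied by the per-SM breakdown above. The full-die figures follow from the article's numbers; the RTX 2080 Ti cut-down (68 of 72 SMs enabled) matches the counts quoted in the comments below, and the per-SM TMU count of four is inferred from 288 TMUs across 72 SMs.

    # Aggregate TU102 core counts from the per-SM breakdown in the article.
    GPCS = 6
    SMS_PER_GPC = 12
    CUDA_PER_SM = 64
    TENSOR_PER_SM = 8
    RT_PER_SM = 1
    TMUS_PER_SM = 4  # inferred: 288 TMUs / 72 SMs

    def totals(sms: int) -> dict:
        """Core counts for a TU102-based part with `sms` enabled SMs."""
        return {
            "CUDA cores": sms * CUDA_PER_SM,
            "Tensor cores": sms * TENSOR_PER_SM,
            "RT cores": sms * RT_PER_SM,
            "TMUs": sms * TMUS_PER_SM,
        }

    # Full die: 4608 CUDA, 576 Tensor, 72 RT, 288 TMUs
    print(totals(GPCS * SMS_PER_GPC))
    # RTX 2080 Ti (68 SMs): 4352 CUDA, 544 Tensor, 68 RT, 272 TMUs
    print(totals(68))
    # ROPs scale with the memory interface instead: 96 on the full die.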
Source: VideoCardz

65 Comments on NVIDIA "TU102" RT Core and Tensor Core Counts Revealed

#1
T4C Fantasy
CPU & GPU DB Maintainer
Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA
544 Tensor
68 RT
272 TMUs
88 ROPs
68 SMs

All this is is the full-chip spec for the die, not the 2080 Ti.
@btarunr
#2
agent_x007
I'm assuming "geometry units" are PolyMorph Engines?
2080 Ti = GTX 570 (in terms of cut-downs).
#3
dj-electric
T4C Fantasy: Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA
544 Tensor
68 RT
272 TMUs
88 ROPs
68 SMs

All this is is the full-chip spec for the die, not the 2080 Ti.
Does the full uarch have 5120 CUDA cores?
#4
T4C Fantasy
CPU & GPU DB Maintainer
dj-electric: Does the full uarch have 5120 CUDA cores?
4608
#5
natr0n
They are going to be milking variants of this chip for a few years, I'm sure.
#6
DeOdView
Any info on async compute on the Turing arch?
#7
mcraygsx
natr0n: They are going to be milking variants of this chip for a few years, I'm sure.
We can still expect a Titan Series for a hefty price of around $3000.
#9
cucker tarlson
DeOdView: Any info on async compute on the Turing arch?
It should do async really well. Steve from Gamers Nexus said it's very asynchronous in its nature, and this graph confirms it. The only game that hits over 1.5x performance is one with async (Wolfenstein; also disregard the last two with HDR, as that is more like regaining performance that Pascal lost in HDR).

#10
dj-electric
cucker tarlson: ...and this graph confirms it
This graph does not confirm anything.
We don't know the exact settings used, nor the in-game locations.
It is extremely vague and should be disregarded.

The "X2 1080" claim in the title is ridicules.
#11
bonehead123
natr0n: They are going to be milking variants of this chip for a few years, I'm sure.
milk it, bilk it, or squeelk it... take your pick and base your ROI on that assumption...

capitalism 101 at its finest :D
#12
cucker tarlson
dj-electric: This graph does not confirm anything.
Bravo, way to take my words out of context :ohwell:

Even if the total numbers don't apply, you can see that Turing does something better in one game than it does in others. That's what I meant, but you took four words and made a fuss.
dj-electric: ridicules.
And who is that Ridicules? Sounds like a mentally challenged brother of Hercules.
#13
Vayra86
If you look at that graph, it tells quite a lot.

1. DLSS was never benched on Pascal.
2. The entire DLSS green bar represents performance that is non-existent on Pascal. It's like taking the steering wheel of a car and telling you it doesn't drive without an engine, but the complete car (Turing) does. Just like that sad 'Turing = 5X Pascal' statement when it comes to RT performance.
3. Conclusion: take the Shadow of the Tomb Raider bar for a realistic performance scenario of 1080 vs 2080. Give or take 30-35%. In other words, you're better off upgrading to a 1080 Ti.

You can find more hints of a 30-odd-percent jump when you compare clocks and shader counts between the 1080 and 2080 as well; see the rough math below.

Thank me later ;)
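A rough sketch of that clock-and-shader comparison. The shader counts are the official specs and the clocks are Founders Edition boost figures; the FMA-based FLOPS formula is a standard assumption, none of it from this article. Raw FP32 throughput alone yields roughly 19%, so the remainder of any in-game gap would have to come from Turing's architectural changes.

    def fp32_tflops(cuda_cores: int, boost_mhz: int) -> float:
        # 2 FLOPs per core per clock (one fused multiply-add)
        return cuda_cores * boost_mhz * 2 / 1e6

    gtx_1080 = fp32_tflops(2560, 1733)  # ~8.9 TFLOPS
    rtx_2080 = fp32_tflops(2944, 1800)  # ~10.6 TFLOPS
    print(f"Raw FP32 uplift: {rtx_2080 / gtx_1080 - 1:.0%}")  # ~19%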
#14
Avlin
T4C Fantasy: Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA
544 Tensor
68 RT
272 TMUs
88 ROPs
68 SMs

All this is is the full-chip spec for the die, not the 2080 Ti.
@btarunr
You can still get a "fully enabled" 2080 Ti by purchasing a Quadro RTX 6000.

You would get:
- no cut-down chip, everything enabled
- GPU RDMA (does it still make sense with NVLink and CPUs like Skylake Xeon/EPYC, which have separated SRIO?)
- more VRAM, and also ECC
- some more OpenGL extensions, with custom high-performance extensions (does it still make sense with Vulkan + RTX?)
- 4 DisplayPorts (the best feature IMHO, vs. 3 DisplayPorts + kinky HDMI)

My bet is there will be a Turing Titan or a Titan Turing, which will be a good compromise between features most people will not need and a decent price.
#15
Vayra86
Nice read on Anandtech

Nv's arrogance is going to cost them a gen.

#16
LemmingOverlord
T4C Fantasy: Correction: nothing on the 2080 Ti is maxed out.
4352 CUDA
544 Tensor
68 RT
272 TMUs
88 ROPs
68 SMs

All this is is the full-chip spec for the die, not the 2080 Ti.
@btarunr
Indeed. These are the TU102 specs, or "Big Turing," from which they cut the Quadro RTX 8000, Quadro RTX 6000, and the RTX 2080 Ti. This also raises the question of whether a couple of $699 RTX 2080 cards with 8 GB and an ($80) NVLink bridge will murder the Quadro RTX 5000, as they match it for memory but deliver a lot more cores (CUDA, RT, or Tensor, your pick).
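A paper comparison behind that question, as a sketch. The spec figures here are the commonly reported ones and are assumptions on my part, not from the article: the Quadro RTX 5000 is a full TU104, the RTX 2080 a cut-down one.

    quadro_rtx_5000 = {"CUDA": 3072, "Tensor": 384, "RT": 48, "VRAM_GB": 16}
    rtx_2080 = {"CUDA": 2944, "Tensor": 368, "RT": 46, "VRAM_GB": 8}

    # Two 2080s over NVLink: the core counts add up, but the memory only
    # "matches" the Quadro if the workload can split across the two 8 GB pools.
    dual_2080 = {k: 2 * v for k, v in rtx_2080.items()}
    for key in quadro_rtx_5000:
        print(f"{key}: 2x RTX 2080 = {dual_2080[key]} vs Quadro RTX 5000 = {quadro_rtx_5000[key]}")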
#17
DeOdView
cucker tarlson: It should do async really well. Steve from Gamers Nexus said it's very asynchronous in its nature, and this graph confirms it. The only game that hits over 1.5x performance is one with async (Wolfenstein; also disregard the last two with HDR, as that is more like regaining performance that Pascal lost in HDR).

The graph means nothing when there is no setup reference on it.
I probably missed it... I don't remember NV mentioning async compute...

Oh well, we just have to wait a little longer, I guess.
#18
Vya Domus
Vayra86: If you look at that graph, it tells quite a lot.

1. DLSS was never benched on Pascal.
2. The entire DLSS green bar represents performance that is non-existent on Pascal.
Yeah, that's baffling. Is DLSS going to be available on Pascal? If so, what the hell is the point of all that die space occupied by Tensor cores? I mean, with all of those 110 TFLOPS, I would expect something like DLSS to simply not be feasible on Pascal, but it turns out it can do it just fine, albeit slower.
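For reference, a sketch of where that ~110 TFLOPS figure plausibly comes from; the 64-FMA-per-clock Tensor core rate and the 2080 Ti boost clock are assumptions on my part, not from the article.

    TENSOR_CORES = 544             # RTX 2080 Ti: 68 SMs x 8
    FLOPS_PER_CORE_CLOCK = 64 * 2  # 64 FP16 FMAs per clock, 2 FLOPs each
    BOOST_HZ = 1545e6              # assumed reference boost clock

    tensor_tflops = TENSOR_CORES * FLOPS_PER_CORE_CLOCK * BOOST_HZ / 1e12
    print(f"{tensor_tflops:.0f} TFLOPS FP16")  # ~108, in line with the ~110 quoted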
#19
krykry
Full potential of raytracing unleashed:
#20
Durvelle27
krykry: Full potential of raytracing unleashed:
:roll:
#21
Fluffmeister
krykry: Full potential of raytracing unleashed:
Impressive! RTX on is photorealistic.

Yeah, we have been here before, big whoop.
#22
R-T-B
dj-electric: This graph does not confirm anything.
We don't know the exact settings used, nor the in-game locations.
It is extremely vague and should be disregarded.

The "X2 1080" claim in the title is ridicules.
Uh... duh. It does sort of show his point though. It does fine at async.
cucker tarlson: And who is that Ridicules? Sounds like a mentally challenged brother of Hercules.
Not going to lie, I lol'd.
Vya Domus: Is DLSS going to be available on Pascal?
I imagine in software it will be, yes. It will perform like shit, but it will make you want a Turing.
#23
oxidized
Vayra86: If you look at that graph, it tells quite a lot.

1. DLSS was never benched on Pascal.
2. The entire DLSS green bar represents performance that is non-existent on Pascal. It's like taking the steering wheel of a car and telling you it doesn't drive without an engine, but the complete car (Turing) does. Just like that sad 'Turing = 5X Pascal' statement when it comes to RT performance.
3. Conclusion: take the Shadow of the Tomb Raider bar for a realistic performance scenario of 1080 vs 2080. Give or take 30-35%. In other words, you're better off upgrading to a 1080 Ti.

You can find more hints of a 30-odd-percent jump when you compare clocks and shader counts between the 1080 and 2080 as well.

Thank me later ;)
Exactly this. It looks like they theoretically calculated the performance a 1080 would have with DLSS in those games and then compared it to the 2080, showing 2x only because with DLSS enabled the 1080 drops so much that the 2080 ends up 2x faster. Something like that reminds me of when AMD showed their Ryzen vs. an i7 in Dota 2 while streaming with OBS, showing the i7 stuttering like hell only because they cranked up the settings on OBS; a cherry-picked scenario.
#24
jabbadap
Vya Domus: Yeah, that's baffling. Is DLSS going to be available on Pascal? If so, what the hell is the point of all that die space occupied by Tensor cores? I mean, with all of those 110 TFLOPS, I would expect something like DLSS to simply not be feasible on Pascal, but it turns out it can do it just fine, albeit slower.
No, DLSS needs Tensor cores to work. You might want to read about NVIDIA NGX, to which DLSS belongs:
NVIDIA NGX is a new deep learning powered technology stack bringing AI-based features that accelerate and enhance graphics, photos, imaging and video processing directly into applications. NVIDIA NGX features utilize Tensor Cores to maximize the efficiency of their operation, and require an RTX-capable GPU. The NGX SDK makes it easy for developers to integrate AI features into their application with pre-trained networks.
#25
Vayra86
krykry: Full potential of raytracing unleashed:
Post of the day