Tuesday, July 7th 2020

NVIDIA GeForce RTX 3070 and RTX 3070 Ti Rumored Specifications Appear

NVIDIA is preparing to launch its next-generation Ampere graphics cards for consumers, following the A100 GPU for data-centric applications. Leaks and speculation about the Ampere lineup are appearing every day, so we can assume that the launch is near. The most recent round of rumors brings new information about the GPU SKU and memory of the upcoming GeForce RTX 3070 and RTX 3070 Ti. According to Twitter user kopite7kimi, several of whose past leaks have since been confirmed, both the GeForce RTX 3070 and RTX 3070 Ti use a GA104 GPU SKU, paired with GDDR6 memory. The catch is that the Ti version will feature the new GDDR6X memory, which is faster and can reportedly go up to 21 Gbps.

The regular RTX 3070 is supposed to have 2944 CUDA cores on the GA104-400 GPU die, while its bigger brother, the RTX 3070 Ti, is said to have 3072 CUDA cores on the GA104-300 die. With the new technologies that the Ampere architecture brings, and the new GDDR6X memory on the Ti, the GPUs are set to be very good performers. Both cards are estimated to reach a memory bandwidth of 512 GB/s. So far, that is all we have. NVIDIA is reportedly in the Design Validation Test (DVT) phase with these cards and is preparing for mass production in August. The official launch should follow before the end of this year, with some speculation pointing to September.
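
For readers who want to sanity-check the bandwidth figure, peak memory bandwidth is simply bus width times per-pin data rate, divided by eight to convert bits to bytes. The short Python sketch below assumes a 256-bit memory bus and a 16 Gbps GDDR6 data rate; neither number appears in the rumor, and both are used here only because they happen to reproduce the quoted 512 GB/s. The function name is ours.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s.

    bus_width_bits: width of the memory bus in bits (assumed, not from the rumor)
    data_rate_gbps: per-pin data rate in gigabits per second
    """
    # Each of the bus_width_bits pins moves data_rate_gbps gigabits per second;
    # dividing by 8 converts gigabits to gigabytes.
    return bus_width_bits * data_rate_gbps / 8


if __name__ == "__main__":
    # Assumption: 256-bit bus with 16 Gbps GDDR6.
    print(memory_bandwidth_gbs(256, 16.0))  # 512.0
    # Same assumed bus at the reported 21 Gbps GDDR6X ceiling.
    print(memory_bandwidth_gbs(256, 21.0))  # 672.0
```

Under that assumed bus width, memory running at the full 21 Gbps would work out to 672 GB/s rather than 512 GB/s.
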
Sources: VideoCardz, TweakTown, kopite7kimi (Twitter)

104 Comments on NVIDIA GeForce RTX 3070 and RTX 3070 Ti Rumored Specifications Appear

#51
BoboOOZ
Well Moore's Law is Dead had talked a few months ago about an engineering sample that was tested with 21GHz memory and clocked close to 2.5 GHz.
We'll just have to see which is the 3070 and which is the 3080, and that's what's all over the place in the latest rumors. But maybe that's because it's not decided yet; as usual, it depends on the competition, and the pricing is left to the last minute.
Posted on Reply
#54
ARF
EarthDog
Why is that good news... if this is true...

I want these in 4Q...
Well, Nvidia's CEO was right :)
The gamers hoping, wishing, and praying for a new generation of GeForce cards to arrive this week got some bad news from the company’s CEO during a Computex press briefing: The hardware won’t show up for a “long time.”
Nvidia CEO: No next-gen GeForce GPUs for a 'long time,' but G-Sync BFGDs are coming soon
www.pcworld.com/article/3278095/no-geforce-gpus-g-sync-bfgds-nvidia.html


At that time I said don't expect RTX 3000 before H2 2020, most likely H1 2021.
Posted on Reply
#55
RandallFlagg
medi01
Say, if stuff is to be released for CP2077, shouldn't RDNA2 cards be in full swing AIB production already? Yet, there have been no leaks so far.

Is it me or the gap between 3070Ti and 3080 is rather large?


This makes no sense. You get faster cycles (higher clocks); you don't miraculously get circuits that are capable of doing something in 1 cycle if it took 2.
Faster transistor switching is not the same as higher clocks. And faster transistor switching does translate into potentially faster completion of instructions.

I'm not going to get into this with you people, just use google, there are dozens if not hundreds of references to this.

You might start here :

www.eeeguide.com/transistor-switching-times/
Posted on Reply
#56
EarthDog
ARF
Well, Nvidia's CEO was right :)



Nvidia CEO: No next-gen GeForce GPUs for a 'long time,' but G-Sync BFGDs are coming soon
www.pcworld.com/article/3278095/no-geforce-gpus-g-sync-bfgds-nvidia.html


At that time I said don't expect RTX 3000 before H2 2020, most likely H1 2021.
Aims at another goal post........

Ok....... that tells us nothing. What does a long time mean? Stop guessing ARFy...lol
Posted on Reply
#57
ARF
EarthDog
What does a long time mean?
Years. The quote is from Summer 2018.
Posted on Reply
#58
Valantar
RandallFlagg
Faster transistor switching is not the same as higher clocks. And faster transistor switching does translate into potentially faster completion of instructions.

I'm not going to get into this with you people, just use google, there are dozens if not hundreds of references to this.

You might start here :

www.eeeguide.com/transistor-switching-times/
But it can still only switch once per clock cycle, no? So faster switching speeds would help drive up clock speeds (as the time needed for a transistor to complete a cycle is shortened), but otherwise not change anything as a shorter time won't help do anything without a signal to make it do something. No?
Posted on Reply
#59
EarthDog
ARF
Years. The quote is from Summer 2018.
lol, I didn't catch that was from 2018, lol. What did you intend to convey with that? Nothing that matters for today?
Posted on Reply
#60
Valantar
EarthDog
lol, I didn't catch that was from 2018, lol. What did you intend to convey with that? Nothing that matters for today?
I believe they are trying to say that the cards will launch at some point far from summer 2018. I.e. a statement that without specific context and explanation could mean tomorrow or in ten years. Call me an optimist, but personally, I'm leaning towards it being a lot closer to tomorrow than ten years from now. More than two years is definitely a long time in the GPU world.
Posted on Reply
#61
EarthDog
Valantar
I believe they are trying to say that the cards will launch at some point far from summer 2018. I.e. a statement that without specific context and explanation could mean tomorrow or in ten years. Call me an optimist, but personally, I'm leaning towards it being a lot closer to tomorrow than ten years from now. More than two years is definitely a long time in the GPU world.
lol, yeah no clue what ARF's point was with that, lol...

We're talking about now and release dates and an article from 2018 gets put up, lol.... I've got to log off forums today, silliness is all around, lol.
Posted on Reply
#62
RandallFlagg
Valantar
But it can still only switch once per clock cycle, no? So faster switching speeds would help drive up clock speeds (as the time needed for a transistor to complete a cycle is shortened), but otherwise not change anything as a shorter time won't help do anything without a signal to make it do something. No?
I don't think you're understanding what is happening. There are some entire microcode instructions that complete in one clock. In some cases, more than 1 instruction completes in a single clock *on average* because multiple instructions are being decoded at once (multiple pipelines).

The speed that happens all comes down to transistor switching.

Look at this picture of a NAND gate. The 2nd transistor needs a result from the first to get an output. Now consider, a single microcode instruction can have thousands of these gates (and other items like registers - storage locations - etc). If you make those gates switch faster for a given power input, you get a result faster. OR you can get the same performance at lower power because the instructions are completing faster.
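
A rough sketch of that idea in Python, with made-up numbers: the depth of the gate chain and the per-gate delays below are assumptions for illustration only, not figures for any real chip or node. It treats the gates as a simple series chain, so the total delay is just the number of gates times the delay of each one.

```python
def critical_path_delay_ps(gates_in_series: int, gate_delay_ps: float) -> float:
    """Time for a signal to ripple through a chain of gates, in picoseconds."""
    # Each gate has to wait for the output of the previous one,
    # so the individual delays add up along the chain.
    return gates_in_series * gate_delay_ps


def max_clock_ghz(path_delay_ps: float) -> float:
    """Highest clock at which the whole chain still settles within one cycle."""
    # A clock period of 1000 ps corresponds to 1 GHz.
    return 1000.0 / path_delay_ps


if __name__ == "__main__":
    depth = 20  # hypothetical number of gates in series

    slow = critical_path_delay_ps(depth, 12.5)  # hypothetical older node
    fast = critical_path_delay_ps(depth, 10.0)  # gates switching 20% faster

    print(slow, max_clock_ghz(slow))  # 250.0 4.0
    print(fast, max_clock_ghz(fast))  # 200.0 5.0
```

With these assumed numbers the same logic settles in 200 ps instead of 250 ps. Whether a design spends that on a higher clock or on fitting more work into each cycle is a separate choice.
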

Now you can make a transistor switch faster by giving it more power to overcome the impedance. This is why, when overclocking, it's common to hit a point where you have to increase voltage. The transistors need the extra power to keep up with the higher clocks.

This type of improvement is why you'll see TSMC stating things like getting a 20% improvement in performance from one node to another. That's an ideal situation for marketing purposes but the performance improvements are there.



Posted on Reply
#63
Valantar
RandallFlagg
I don't think you're understanding what is happening. There are some entire microcode instructions that complete in one clock. In some cases, more than 1 instruction completes in a single clock *on average* because multiple instructions are being decoded at once (multiple pipelines).

The speed that happens all comes down to transistor switching.

Look at this picture of a NAND gate. The 2nd transistor needs a result from the first to get an output. Now consider, a single microcode instruction can have thousands of these gates (and other items like registers - storage locations - etc). If you make those gates switch faster for a given power input, you get a result faster. OR you can get the same performance at lower power because the instructions are completing faster.

Now you can make a transistor switch faster by giving it more power to overcome the impedance. This is why, when overclocking, it's common to hit a point where you have to increase voltage. The transistors need the extra power to keep up with the higher clocks.

This type of improvement is why you'll see TSMC stating things like getting a 20% improvement in performance from one node to another. That's an ideal situation for marketing purposes but the performance improvements are there.




But again, all of that comes down to increased clock speeds, both your description of speeding up instruction decoding and the performance increases cited by foundries. When TSMC is talking about a 20% performance increase for a new node, they are talking about a 20% clock speed increase at the same power draw, as that is the only (somewhat) architecture-independent metric possible.
Posted on Reply
#64
RandallFlagg
Valantar
But again, all of that comes down to increased clock speeds, both your description of speeding up instruction decoding and the performance increases cited by foundries. When TSMC is talking about a 20% performance increase for a new node, they are talking about a 20% clock speed increase at the same power draw, as that is the only (somewhat) architecture-independent metric possible.
No, that is incorrect. You seem to think all instructions complete in one clock so everything is based on clock. They don't. Most instructions take multiple clocks and pass through tens if not hundreds of thousands of gates during that clock cycle, and typically *are not* complete in that clock cycle. If you make your gates switch quicker, you get the result faster, it is as simple as that. You can do your own research, not going to waste more time here.
Posted on Reply
#65
mtcn77
EarthDog
Aims at another goal post........

Ok....... that tells us nothing. What does a long time mean? Stop guessing ARFy...lol
But, Nvidia doesn't ever tell you anything. That is what is missing.
Nvidia launches products.
They don't make industry progress. For instance, the same monitor interface data compression method would improve frame doubling pipelines had it been implemented in displays, however they don't develop for outside markets.
Posted on Reply
#66
ARF
mtcn77
But, Nvidia doesn't ever tell you anything. That is what is missing.
Nvidia launches products.
They don't make industry progress. For instance, the same monitor interface data compression method would improve frame doubling pipelines had it been implemented in displays, however they don't develop for outside markets.
Yup, the details of Nvidia's architectures are kept strictly hidden from the outside world, including developers, and the drivers do the whole job.
Everything around Nvidia is closed and locked.
Posted on Reply
#67
EarthDog
mtcn77
But, Nvidia doesn't ever tell you anything. That is what is missing.
Nvidia launches products.
They don't make industry progress. For instance, the same monitor interface data compression method would improve frame doubling pipelines had it been implemented in displays, however they don't develop for outside markets.
... I think I missed your point? We all know it's more of a closed ecosystem... but that has nothing to do with this discussion (at least what I'm talking about).

I'm simply wondering why the hell a 2 y.o. article was used to..... I don't even know why tf it was posted........ and now here we are discussing whatever point that has nothing to do with what I said... man I love TPU......... :ohwell:
Posted on Reply
#68
ARF
EarthDog
... I think I missed your point? We all know it's more of a closed ecosystem... but that has nothing to do with this discussion (at least what I'm talking about).

I'm simply wondering why the hell a 2 y.o. article was used to..... I don't even know why tf it was posted........ and now here we are discussing whatever point that has nothing to do with what I said... man I love TPU......... :ohwell:
lol You said you want something in Q4, I told you to wait a bit longer. :D
Posted on Reply
#69
EarthDog
ARF
lol You said you want something in Q4, I told you to wait a bit longer. :D
Wow... useless is as useless does.

What a waste of life trying to figure that out...
Posted on Reply
#70
mtcn77
EarthDog
I think I missed your point?
You missed the point on how industry's inflection point is Nvidia. Nvidia competes with monitor scaler manufacturers. There is no cooperation between them.
Think of it this way: LCD beats OLED in every manner apart from pixel transitions. That is what is important about the convention. At the advent of the VVC codec, this could tap into VRR methods. LCDs overdrive better if they get multiple frame signals. This is due to liquid crystal alignment: the crystals get jumbled up if the voltage applied is direct current.
Posted on Reply
#71
EarthDog
mtcn77
You missed the point on how industry's inflection point is Nvidia. Nvidia competes with monitor scaler manufacturers. There is no cooperation between them.
Think of it this way: LCD beats OLED in every manner apart from pixel transitions. That is what is important about the convention. At the advent of the VVC codec, this could tap into VRR methods. LCDs overdrive better if they get multiple frame signals. This is due to liquid crystal alignment: the crystals get jumbled up if the voltage applied is direct current.


I wasn't aiming at that goal post either. And even when I pointed at the right goal post... we still start talking hockey sticks.

Anyway, thanks gentlemen for the information. I apologize if it was just me not getting it... but I've read through this multiple times and can't make the connection. Really... this was about the thread title and then I mentioned I wanted the cards in 4Q and then a post from 2018 like that was going to help...

... then some shiza about NV monitor scaling and other things......?????????!!!!!!!!!??????????
Posted on Reply
#72
Valantar
RandallFlagg
No, that is incorrect. You seem to think all instructions complete in one clock so everything is based on clock. They don't. Most instructions take multiple clocks and pass through tens if not hundreds of thousands of gates during that clock cycle, and typically *are not* complete in that clock cycle. If you make your gates switch quicker, you get the result faster, it is as simple as that. You can do your own research, not going to waste more time here.
I didn't say instructions complete in a single cycle, just that I would assume that any increase in transistor switching speed is typically absorbed into the margins needed for increased clocks, meaning there is little room left for further utilizing this to lower the amount of cycles needed to finish an instruction. Some, sure, but a few percent isn't enough to allow you to finish in one cycle rather than two unless you were already very, very close to that target or you redesign the hardware to reach this goal, in which case one could argue that the increase in switching speed is less important than the redesign (though it obviously lowers the bar).
Posted on Reply
#73
RandallFlagg
Valantar
I didn't say instructions complete in a single cycle, just that I would assume that any increase in transistor switching speed is typically absorbed into the margins needed for increased clocks, meaning there is little room left for further utilizing this to lower the amount of cycles needed to finish an instruction. Some, sure, but a few percent isn't enough to allow you to finish in one cycle rather than two unless you were already very, very close to that target or you redesign the hardware to reach this goal, in which case one could argue that the increase in switching speed is less important than the redesign (though it obviously lowers the bar).
Uh no, that is *NOT* what you said........ You are now backtracking.

What you said was (emphasis added) :
Valantar
But again, all of that comes down to increased clock speeds, both your description of speeding up instruction decoding and the performance increases cited by foundries. When TSMC is talking about a 20% performance increase for a new node, they are talking about a 20% clock speed increase at the same power draw, as that is the only (somewhat) architecture-independent metric possible.
Clock speed has nothing to do with IPC increase from new nodes.

If I have a microcode instruction that completes in 1.1 cycles, it will have to wait for the next (2nd) cycle for anything to be done with the result. This essentially means it takes 2 cycles to complete in a useful way. If I improve the time it takes to complete that instruction by 20% (a common claim by TSMC), it now takes ~0.9 cycles, so it went from being a 2 cycle instruction to a 1 cycle instruction. This *directly* impacts IPC.
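
The rounding in that example can be written out as a small Python sketch. It assumes a result is only usable at the next clock edge, so the effective latency is the raw latency rounded up to a whole number of cycles; the function name and the 1.5-cycle case are illustrative additions, not measurements of any real hardware.

```python
import math


def cycles_needed(raw_latency_cycles: float) -> int:
    """Whole clock cycles before a result can be used.

    raw_latency_cycles is how long the logic takes to settle, expressed in
    clock periods. The result is only picked up on a clock edge, so any
    fraction of a cycle costs a full extra cycle.
    """
    return math.ceil(raw_latency_cycles)


if __name__ == "__main__":
    speedup = 0.8  # logic completes in 20% less time at the same clock

    # The case from the post: 1.1 cycles drops below one cycle.
    print(cycles_needed(1.1), cycles_needed(1.1 * speedup))  # 2 1

    # Illustrative counter-case: 1.5 cycles only drops to about 1.2,
    # so the instruction still occupies two cycles.
    print(cycles_needed(1.5), cycles_needed(1.5 * speedup))  # 2 2
```

So the same 20% only changes the cycle count when the original latency sat close enough to a cycle boundary.
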

You are welcome for the education.
Posted on Reply
#74
John Naylor
Assimilator
The cath?
Lost a c methinks ... catch
Posted on Reply
#75
mtcn77
EarthDog
... then some shiza about NV monitor scaling and other things......?????????!!!!!!!!!??????????
That must have left you a little disheveled.
Posted on Reply