
NVIDIA GM204 and GM206 to Tape-Out in April, Products to Launch in Q4?

btarunr

Editor & Senior Moderator
It looks like things are going horribly wrong with the 20 nm manufacturing process at TSMC, NVIDIA and AMD's principal foundry partner, which is throwing a wrench into the works at NVIDIA, forcing it to re-engineer an entire lineup of "Maxwell" GPUs on the existing 28 nm process. Either that, or NVIDIA is confident of delivering an efficiency leap with Maxwell on the existing, mature 28 nm process while saving costs along the way. NVIDIA is probably drawing comfort from the excellent energy efficiency demonstrated by its Maxwell-based GeForce GTX 750 series. According to a 3DCenter.org report, NVIDIA's next mainline GPUs, the GM204 and GM206, will be built on the 28 nm process and the "Maxwell" architecture, and will tape out later this month. Products based on the two, however, can't be expected before Q4 2014: as late as December, or even January 2015.

GM204 succeeds GK104 as the company's next workhorse performance-segment silicon, which could power graphics card SKUs ranging all the way from US $250 to $500. An older report suggests that it could feature as many as 3,200 CUDA cores. The GM204 could tape out in April 2014, and the first GeForce products based on it could launch no sooner than December 2014. The GM206 is the company's next mid-range silicon, succeeding GK106. It will tape out in April, alongside the GM204, but products based on it will launch only in January 2015. The GM200 is a different beast altogether. There's no mention of which process the chip will be built on, but it will succeed the GK110, and should offer performance increments worthy of a successor. For that, it has to be based on the 20 nm process. It will tape out in June 2014, and products based on it will launch only in or after Q2 2015.

View at TechPowerUp Main Site
 
28nm I am not buying.
 
Would this then explain the specs for the "880" not being as impressive as expected?
 
If this is true then it sucks. Q4 2014/Q1 2015 is way too long. I was expecting Maxwell to be released this summer at the latest.
 
waaaat ??????

What's going to be released this year? This article is not making sense -_-

and TSMC can go *** themselves.
NVIDIA should look for a new partner, because this is ridiculous :D
 
Or it could be a conspiracy to extend the product line on both sides.
The GTX 750 / 750 Ti has been a success, so I'm not sure where you're getting this idea about a conspiracy.
 
My numbers:
  • 28 nm GM104 silicon
  • ~7 billion transistors
  • 3,072 CUDA cores
  • 192 TMUs
  • 48 ROPs
  • 6.1 single-precision TFLOP/s - 2.8 double-precision TFLOP/s
  • 384-bit wide GDDR5 memory interface
  • 6 GB standard memory amount
  • 384 GB/s memory bandwidth
  • Clock speeds of 900 MHz core, 1000 MHz GPU Boost, 8 GHz memory
  • 250W board power

  • 28 nm GM106 silicon
  • ~5 billion transistors
  • 1,792 CUDA cores
  • 112 TMUs
  • 32 ROPs
  • 3.9 single-precision TFLOP/s - 0.9 double-precision TFLOP/s
  • 256-bit wide GDDR5 memory interface
  • 4 GB standard memory amount
  • 224 GB/s memory bandwidth
  • Clock speeds of 1000 MHz core, 1100 MHz GPU Boost, 7 GHz memory
  • 150W board power
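For anyone wanting to sanity-check those figures: single-precision throughput is just shader count × 2 ops per clock (one fused multiply-add) × clock speed, and memory bandwidth is bus width in bytes × effective memory clock. A minimal sketch in Python, using only the speculated numbers from the list above, so treat it as napkin math rather than a claim about real silicon:

```python
# Sanity-check the speculated GM104/GM106 figures from the post above.
# SP throughput: CUDA cores x 2 ops/clock (fused multiply-add) x clock.
# Bandwidth: (bus width in bits / 8) bytes x effective memory clock.

def sp_tflops(cores, boost_ghz):
    return cores * 2 * boost_ghz / 1000.0   # GFLOP/s -> TFLOP/s

def bandwidth_gbs(bus_bits, mem_gbps):
    return bus_bits / 8 * mem_gbps          # bytes per transfer x transfers/s

# GM104: 3,072 cores @ 1.0 GHz boost, 384-bit @ 8 GHz effective
print(f"GM104: {sp_tflops(3072, 1.0):.1f} TFLOP/s SP")   # ~6.1
print(f"GM104: {bandwidth_gbs(384, 8.0):.0f} GB/s")      # 384

# GM106: 1,792 cores @ 1.1 GHz boost, 256-bit @ 7 GHz effective
print(f"GM106: {sp_tflops(1792, 1.1):.1f} TFLOP/s SP")   # ~3.9
print(f"GM106: {bandwidth_gbs(256, 7.0):.0f} GB/s")      # 224
```

The 2.8 TFLOP/s double-precision figure only works out if you assume a 1:2 FP64 rate at the 900 MHz base clock (3,072 × 2 × 0.9 / 2 ≈ 2.76 TFLOP/s), which is where the 1:2 debate below comes from.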
 
150 W and 250 W?
Please go look again at the 750 Ti
and comment back with the correct results :)
 
150 W and 250 W?
Please go look again at the 750 Ti
and comment back with the correct results :)
GK106 -> 140 Watts
GM107 -> 60 Watts

GK104 -> 230 Watts
GM106 -> 150 Watts

GK110 = GM104
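Taking those TDPs at face value (they're speculative, apart from GM107, which actually shipped in the GTX 750 series), the implied Kepler-to-Maxwell power cuts at roughly equal performance tiers work out like this:

```python
# Claimed Kepler -> Maxwell board-power pairs from the post above
# (speculative except for GM107, which shipped in the GTX 750 series).
pairs = {
    "GK106 -> GM107": (140, 60),
    "GK104 -> GM106": (230, 150),
}
for step, (kepler_w, maxwell_w) in pairs.items():
    cut = (1 - maxwell_w / kepler_w) * 100
    print(f"{step}: {kepler_w} W -> {maxwell_w} W ({cut:.0f}% lower)")
# GK106 -> GM107: 140 W -> 60 W (57% lower)
# GK104 -> GM106: 230 W -> 150 W (35% lower)
```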
 
My numbers:
  • 28 nm GM104 silicon
  • 6.1 single-precision TFLOP/s - 2.8 double-precision TFLOP/s
Not sure how you arrived at that calculation. It's highly unlikely that Nvidia would offer a 1:2 FP64 rate on the GM204, any more than it did with GK104 and GF114/104 before it. Double precision 1. is unneeded for the gaming segment, 2. adds to the power budget, and 3. adds die space.
If the GM204 is an analogue of the previous -104 boards, then FP64 will be culled. It was 1:12 in GF114/104, and 1:24 in GK104. Keeping the FP64 ability at a nominal level would also protect Nvidia's margins on the existing Titan/K6000/K20/K40 product lines - and, more to the point, keep them relevant, since there's no way Nvidia makes a GK110 replacement on 28 nm - which means holding out for the 16 nm FinFET node (20 nm BEOL + 16 nm FEOL) for a successor.
 
Not sure how you arrived at that calculation. It's highly unlikely that Nvidia would offer a 1:2 FP64 rate on the GM204, any more than it did with GK104 and GF114/104 before it. Double precision 1. is unneeded for the gaming segment, 2. adds to the power budget, and 3. adds die space.
If the GM204 is an analogue of the previous -104 boards, then FP64 will be culled. It was 1:12 in GF114/104, and 1:24 in GK104.
GM107 => 1/8
GM106 => 1/4
GM104 => 1/2
GM200 => Full DP.

The future is compute shading, which will be reliant on 64-bit maths.
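Whatever the real ratios turn out to be, the practical difference is easy to put in numbers: FP64 throughput is just SP throughput divided by the ratio. A sketch using the speculated 6.1 TFLOP/s GM104 SP figure from earlier in the thread (the ratios below are this thread's guesses plus Fermi/Kepler precedent, not confirmed specs):

```python
# FP64 throughput = SP throughput / FP64 ratio.
# 6.1 TFLOP/s is the speculated GM104 SP figure from earlier in the thread;
# the ratios are guesses/precedent, not confirmed specs.
sp = 6.1
for label, ratio in [("speculated 1:2", 2),
                     ("GF114-style 1:12", 12),
                     ("GK104-style 1:24", 24)]:
    print(f"{label}: {sp / ratio:.2f} TFLOP/s FP64")
# speculated 1:2:   3.05 TFLOP/s FP64
# GF114-style 1:12: 0.51 TFLOP/s FP64
# GK104-style 1:24: 0.25 TFLOP/s FP64
```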
 
28nm I am not buying.

Does the process node matter if the card delivers good performance and power efficiency?
 
GM107 => 1/8
GM106 => 1/4
GM104 => 1/2
GM200 => Full DP.

The future is compute shading, which will be reliant on 64-bit maths.

You are pulling so much of this out of your ass. Unless you have some insider info.
 
Better tape out soon if it's going to be this month.
9 days left lol then we have to wait until next year for mid-range :(
 
GM107 => 1/8
GM106 => 1/4
GM104 => 1/2
GM200 => Full DP.
The future is compute shading, which will be reliant on 64-bit maths.
Really? I always thought that compute shading tended to only use FP64 for professional simulations and the like. Gaming compute - ambient occlusion, global illumination, motion blur, particle/water/smoke/fog effects, depth of field, etc. - is almost entirely single-precision based. If it were double-precision based, then wouldn't it stand to reason (as an example) that an R9 290X's (704 GFLOPS FP64) ability at applying compute-shader image quality options would be markedly inferior to the HD 7970's (1075 GFLOPS)?
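Those two GFLOPS figures follow straight from each card's public shader count, clock, and FP64 ratio, which is the point: Hawaii trades FP64 rate for more shaders. A quick derivation (the 7970 number matches the 1050 MHz GHz Edition clock):

```python
# FP64 GFLOP/s = shaders x 2 (fused multiply-add) x clock (GHz) / FP64 ratio.
def fp64_gflops(shaders, clock_ghz, ratio):
    return shaders * 2 * clock_ghz / ratio

print(fp64_gflops(2816, 1.000, 8))  # R9 290X, 1:8 rate       -> 704.0
print(fp64_gflops(2048, 1.050, 4))  # HD 7970 GHz Ed., 1:4    -> 1075.2
```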
9 days left lol then we have to wait until next year for mid-range :(
FWIW, the original forum post this article is based on is dated 15th April.
 
This happened not long ago, when we were stuck on a node for a while, and it sucked, but the efficiency of Maxwell might make up for it. The thing that really sucks is this Q4 nonsense.
 
FWIW, the original forum post this article is based on is dated 15th April.
April 15th? For what, the tape out or the release? xD
 
April 15th? For what, the tape out or the release? xD
15th April is the date of the original post (16th April local time - my time zone is 10 hours ahead of Germany) stating tape-out this month.
So, if the tape-out hadn't happened by then, that left 15 days in the month for it to happen... and that's assuming it hadn't already occurred - at which point you're in the realms of trying to prove a negative.
 
I know this could probably never happen, but wouldn't it be amazing if either nVidia or AMD (even less likely) signed up to use Intel's foundries, giving us Maxwell or Pirate Islands on a pretty much ready 14 nm process!

It actually makes a lot of sense for all parties: Intel needs more of a reason than its own chips to really push forward with 14 nm, and for nVidia and AMD it's a highly advanced and relatively mature/tested process.

Win, freakin win baby!
 
I know this could probably never happen, but wouldn't it be amazing if either nVidia or AMD (even less likely) signed up to use Intel's foundries, giving us Maxwell or Pirate Islands on a pretty much ready 14 nm process!

That will never happen. Hell, Intel can barely get its own 14 nm process running correctly, and it has the best engineers in the industry. Not to mention Intel has literally nothing to gain from opening its top-of-the-line fabs to competitors. Business aside, you can't just take a microprocessor design and slap it on a process node that's 30% smaller; it doesn't work like that. They would have to spend a few months redesigning and testing it to make sure it functions correctly and is efficient and cost-effective.
 
I know this could probably never happen, but wouldn't it be amazing if either nVidia or AMD (even less likely) signed up to use Intel's foundries, giving us Maxwell or Pirate Islands on a pretty much ready 14 nm process!


>implying CPUs and GPUs use the same kind of process

The transistors/chips for CPUs and for GPUs are made in different ways, to cater to how each of these kinds of ICs works.

As a very good example to illustrate this, if you remember, this was a major hindrance for AMD when they made their latest APUs and tried to keep the GPU part good - using a process tuned for CPUs would have non-trivially harmed the performance of the GPU side, and vice versa. So they had to compromise, which is also the reason why the CPU part of their latest APUs doesn't overclock as well anymore, compared to their previous APUs.
So yeah, using Intel's fabs for those GPUs could actually mean worse performance and power efficiency, despite being 14 nm.
 