Thursday, May 17th 2012
GK110 Packs 2880 CUDA Cores, 384-bit Memory Interface: Die-Shot
With its competition held in check by the strong performance of its GK104 silicon, NVIDIA was bold enough to release die-shots of its GK110 silicon, which made its market entry as the Tesla K20 GPU-compute accelerator. This opened the floodgates to speculation from various sources about the chip's finer details. We found the most plausible of these to be by Beyond3D community member "fellix", who appears to have charted out the chip's component layout through pattern recognition and educated guesswork.
It identifies the 7.1 billion-transistor GK110 silicon as having 15 streaming multiprocessors (SMX units). A little earlier this week, sources close to NVIDIA confirmed the SMX count to TechPowerUp. NVIDIA revealed that the chip will retain the SMX design of GK104, in which each SMX holds 192 CUDA cores. Going by that, GK110 has a total of 2880 cores. Blocks of SMX units surround a centrally-located command processor, along with six setup pipelines and a portion holding the ROPs and memory controllers. There are a total of six GDDR5 PHYs, which could amount to a 384-bit wide memory interface. The chip talks to the rest of the system over PCI-Express 3.0.
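A minimal back-of-the-envelope sketch of those figures (Python; the 64-bit width per GDDR5 PHY is an assumption, chosen so that six PHYs add up to the 384-bit interface described above):

```python
# Sanity-check the reported GK110 figures, assuming the GK104-style SMX
# layout (192 CUDA cores per SMX) and 64-bit-wide GDDR5 PHYs (assumed).
SMX_COUNT = 15
CORES_PER_SMX = 192
GDDR5_PHYS = 6
BITS_PER_PHY = 64  # assumed width per PHY, consistent with 6 PHYs -> 384-bit

cuda_cores = SMX_COUNT * CORES_PER_SMX       # 15 * 192 = 2880
memory_bus_bits = GDDR5_PHYS * BITS_PER_PHY  # 6 * 64 = 384

print(f"CUDA cores: {cuda_cores}")
print(f"Memory interface: {memory_bus_bits}-bit")
```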
Source: Beyond3D Forum
65 Comments on GK110 Packs 2880 CUDA Cores, 384-bit Memory Interface: Die-Shot
I've read almost everything available about them, and GCN and previous AMD/ATI chips. So be clear about what you mean, because you make zero sense.
*Optimized here means a design in which a base-2 (binary) digital circuit fills out its full binary address range, i.e. a power of 2. To do it properly, all the components within the chip should follow powers of 2, of course. For example, that would mean using 16 SMX units with 256 CUDA cores each, and so on.
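A hypothetical illustration of that "power of 2" criterion (Python; the 16 SMX x 256-core layout is the commenter's ideal example, not a real chip):

```python
# Check whether a unit count fills out a full binary range (is a power of 2).
def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0

layouts = {
    "GK110 as reported": (15, 192),   # 15 SMX x 192 cores = 2880
    "Commenter's ideal": (16, 256),   # 16 SMX x 256 cores = 4096
}

for name, (smx, cores) in layouts.items():
    total = smx * cores
    print(f"{name}: {total} cores; "
          f"SMX count power of 2: {is_power_of_two(smx)}, "
          f"total power of 2: {is_power_of_two(total)}")
```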
More expensive
A 512-bit bus really isn't necessary at all, especially with QDR memory.
That being said, I want this card. Hopefully it comes as the 7 series and is priced right.
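A rough calculation behind the "512-bit isn't necessary" point above (Python; the 6 Gbps effective GDDR5 data rate is an assumed, illustrative figure):

```python
# Peak memory bandwidth for a few bus widths at an assumed GDDR5 data rate.
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bytes transferred per cycle times per-pin rate."""
    return bus_width_bits / 8 * data_rate_gbps

for bus in (256, 384, 512):
    print(f"{bus}-bit @ 6 Gbps GDDR5: {peak_bandwidth_gb_s(bus, 6.0):.0f} GB/s")
# 256-bit -> 192 GB/s, 384-bit -> 288 GB/s, 512-bit -> 384 GB/s
```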
I wish I could show you what the difference would be, but I have no practical way of demonstrating it. I guess one way to look at it is to check out the really low-end cards, as they're quite often optimized in this way: they don't need a huge transistor budget or a lot of power and heat, so they can afford to do this in a physical product.
I do AutoCAD, bring on the cores I say...
I definitely desire this card, but will most likely never own it :laugh: GTX570 can keep me going until the next gen consoles raise the bar I think
And I think it's telling that their K10 has two GPUs, a clear cost disadvantage (in manufacturing), when all prior gens started with single-GPU compute cards. This chip is simply worse than the last for this purpose; they threw AMD and Intel a bone in this department, IMHO.
Release the kraken already!! ...err, I mean release the goddamned GTX 660 Ti variant...
$200 and below :D
@Benetanegia: I thought the superb performance per watt of GK104 was because they crippled its compute performance? I would prefer to spend my hard-earned money on a GTX 680 or an 1150 MHz Sapphire 7970. You know, all the prettiness in the world (GTX 690) won't save it from microstutter.
I'm surprised you actually thanked him for that useless post, erocker.
Maybe wait until there is solid evidence of when Nvidia will even bother to turn this thing into a GeForce variant and how it will castrate it.
Obviously it wasn't possible, hence they made GK104. J H-H said if it were feasible to do, he would do it, but as it is now it's not, as he mentioned to investors.
I very much doubt he meant it in a serious way
Thing is, I did actually learn the basics of designing integrated digital circuits at uni many moons ago, and they taught me that building them out to the full power of 2 always maximises the design, and they explained exactly why. This principle remains true regardless of what process technology is used or how fancy and complicated the design is.
Unfortunately, the chip grows exponentially in size as you do this, and semiconductor companies like NVIDIA, AMD and Intel know this all too well. So in a real-world device, one is always limited by things such as transistor budget, physical size, reduced clock speeds (fan-out/fan-in limitations), power and heat, etc. Hence, you get these odd, lopsided designs. The 384-bit bus is just one manifestation of this necessary compromise. It's just a shame to see it, which was my point in my original post on this thread.
It's hard for me to explain in words here the exact reasons why building an IC out to a power of 2 is optimal (perhaps someone else can do it better), which is why I advised erocker to consider the small, low-end graphics cards as an example, because for those, the physical budget is there to build them out to the full power of 2.
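One hypothetical way to sketch what the commenter is getting at (Python standing in for the hardware logic): routing an address to one of 16 units needs only a simple bit mask, while one of 15 needs a modulo, i.e. divider or extra range-check logic in silicon.

```python
# Selecting a unit when the count is a power of two vs. when it is not.
def select_unit_pow2(address: int, units: int = 16) -> int:
    # valid only when 'units' is a power of two
    return address & (units - 1)

def select_unit_generic(address: int, units: int = 15) -> int:
    # non-power-of-2 counts fall back to a full modulo
    return address % units

addr = 0x1234ABCD
print(select_unit_pow2(addr))     # low 4 bits: cheap to decode in hardware
print(select_unit_generic(addr))  # modulo 15: costlier logic to implement
```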
But an overclocked 670, with 192 fewer CUDA cores and a 4% faster base clock, is only 1% faster than a stock 680.
Do the math: frequency counts for more here than cluster count does; they went the way they did to attain such high clocks to meet their performance targets. I see this being 70% of the speed of a 690; it just needs to be priced accordingly.
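A rough cores-times-clock check of that 670-vs-680 comparison (Python; the core counts are the launch specs, and the overclocked 670 clock is an assumed figure matching the "4% above a stock 680's base clock" wording in the post):

```python
# Compare theoretical shader throughput (cores x clock) of the two cards.
GTX680_CORES, GTX680_MHZ = 1536, 1006
GTX670_CORES, GTX670_OC_MHZ = 1344, 1006 * 1.04  # hypothetical overclock

ratio = (GTX670_CORES * GTX670_OC_MHZ) / (GTX680_CORES * GTX680_MHZ)
print(f"OC'd 670 vs stock 680, cores x clock: {ratio:.2f}x")
# ~0.91x on paper, yet the post reports near-parity in practice, which is the
# point that clock speed buys more than the raw core count would suggest.
```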
Could the additional 128 bits of NVIDIA's bus not be for IOMMU too, given they are adhering to the same PCIe 3.0 spec, and AFAIK it calls for virtualized memory support, something they both claim as doable in this gen? And NVIDIA announced VGX, which surely needs IOMMU support?
Will it be faster at compute tasks? Absolutely. Much in the same way ATI used to have multiple shaders, though, it will be harder to schedule for, much like what ATI had to rely on drivers to set up for years. I think it will be interesting to see what performance is like with different CPUs.