Tuesday, February 28th 2017

NVIDIA Announces the GeForce GTX 1080 Ti Graphics Card at $699

NVIDIA today unveiled the GeForce GTX 1080 Ti graphics card, its fastest consumer graphics card based on the "Pascal" GPU architecture, and which is positioned to be more affordable than the flagship TITAN X Pascal, at USD $699, with market availability from the first week of March, 2017. Based on the same "GP102" silicon as the TITAN X Pascal, the GTX 1080 Ti is slightly cut-down. While it features the same 3,584 CUDA cores as the TITAN X Pascal, the memory amount is now lower, at 11 GB, over a slightly narrower 352-bit wide GDDR5X memory interface. This translates to 11 memory chips on the card. On the bright side, NVIDIA is using newer memory chips, running at 11 Gbps (GDDR5X-effective), than the ones it deployed on the TITAN X Pascal, so the memory bandwidth is 484 GB/s.
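
The 484 GB/s figure follows from the bus width and per-pin data rate quoted above. A minimal sketch of that arithmetic, assuming one 32-bit GDDR5X chip per memory channel; the function and constant names are illustrative, not NVIDIA's:

```python
def peak_bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s for a bus of the given width.

    Each pin moves data_rate_gbps gigabits per second; dividing the
    total by 8 converts gigabits to gigabytes.
    """
    return bus_width_bits * data_rate_gbps / 8


BUS_WIDTH_BITS = 352   # from the announcement
DATA_RATE_GBPS = 11    # from the announcement
CHIP_WIDTH_BITS = 32   # assumption: one 32-bit chip per memory channel

bandwidth = peak_bandwidth_gb_s(BUS_WIDTH_BITS, DATA_RATE_GBPS)
chip_count = BUS_WIDTH_BITS // CHIP_WIDTH_BITS

print(f"{bandwidth:.0f} GB/s over {chip_count} chips")
```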

Besides the narrower 352-bit memory bus, the ROP count is lowered to 88 (from 96 on the TITAN X Pascal), while the TMU count is unchanged from 224. The GPU core is clocked at a boost frequency of up to 1.60 GHz, with the ability to overclock beyond the 2.00 GHz mark. It gets better: the GTX 1080 Ti features certain memory advancements not found on other "Pascal" based graphics cards: a newer memory chip and optimized memory interface, that's running at 11 Gbps. NVIDIA's Tiled Rendering Technology has also been finally announced publicly; a feature NVIDIA has been hiding from its consumers since the GeForce "Maxwell" architecture, it is one of the secret sauces that enable NVIDIA's lead.
The Tiled Rendering technology brings about huge improvements in memory bandwidth utilization by optimizing the render process to work in square-sized chunks, instead of drawing the whole polygon. Thus, geometry and textures of a processed object stay on-chip (in the L2 cache), which reduces cache misses and memory bandwidth requirements.
Together with its lossless memory compression tech, NVIDIA expects Tiled Rendering, and its storage tech, Tiled Caching, to more than double, or even close to triple, the effective memory bandwidth of the GTX 1080 Ti, over its physical bandwidth of 484 GB/s.
NVIDIA is making sure it doesn't run into the thermal and electrical issues of previous-generation reference design high-end graphics cards, by deploying a new 7-phase dual-FET VRM that reduces loads (and thereby temperatures) per MOSFET. The underlying cooling solution is also improved, with a new vapor-chamber plate, and a denser aluminium channel matrix.
Watt-to-Watt, the GTX 1080 Ti will hence be up to 2.5 dBA quieter than the GTX 1080, or up to 5°C cooler. The card draws power from a combination of 8-pin and 6-pin PCIe power connectors, with the GPU's TDP rated at 220W. The GeForce GTX 1080 Ti is designed to be anywhere between 20-45% faster than the GTX 1080 (35% on average).
The GeForce GTX 1080 Ti is widely expected to be faster than the TITAN X Pascal out of the box, despite its narrower memory bus and fewer ROPs. The higher boost clocks and 11 Gbps memory make up for the performance deficit. What's more, the GTX 1080 Ti will be available in custom-design boards, and factory-overclocked speeds, so the GTX 1080 Ti will end up being the fastest consumer graphics option until there's competition.

160 Comments on NVIDIA Announces the GeForce GTX 1080 Ti Graphics Card at $699

#126
Ascalaphus
This was the card I was waiting for.

Going to get 2 of these bad boys to replace my SLI 980Tis.
Posted on Reply
#127
W1zzard
NTM2003: any Idea when pre order starts I keep refreshing the amazon page but nothing yet not even price drops
We wanted to let you know that the GTX 1080 Ti will officially be available for pre-orders starting at 8:00 a.m. PST tomorrow morning. For folks that want to get in the action, the pre-order link is: www.nvidia.com/en-us/geforce/products/10series/geforce-gtx-1080-ti/

Got this from NVIDIA earlier today.
Posted on Reply
#128
GhostRyder
efikkan: If 1080 Ti is boring (even though it's the most exciting high end model in recent history), then Vega is going to bore you to death.
How so? While it's definitely a better deal, it's essentially a Titan XP without 1 GB of RAM and with better voltage control/boost clocks. That's not completely interesting even if it's a good price.
Fluffmeister: Indeed, all credit to the AMD fanboys for their patience waiting for that mythical beast.
So you're saying people waiting for a GPU release are dumb fanboys? Oh right, it's just the people who are waiting on an AMD card. Heaven forbid waiting and seeing before making an expensive purchase.
W1zzard: We wanted to let you know that the GTX 1080 Ti will officially be available for pre-orders starting at 8:00 a.m. PST tomorrow morning. For folks that want to get in the action, the pre-order link is: www.nvidia.com/en-us/geforce/products/10series/geforce-gtx-1080-ti/

Got this from NVIDIA earlier today.
Wow that fast, really considering off loading my Titan XP for a pair of these now!!!
Posted on Reply
#129
qubit
Overclocked quantum bit
newtekie1: Just like so many great GPUs before it.
Ya. 11GB RAM just doesn't sit right though lol. Maybe I'll get one just for the perversity of it...
Posted on Reply
#130
medi01
efikkan: This is by far the greatest top consumer model Nvidia has released in ages
Yeah. Ages.
Not counting the card it released in August 2016; then it is a whopping 5 months. Exciting.
efikkan: 35% extra performance
"extra", eh?
efikkan: Many of you dismiss this product, yet you create hype about Vega...
It's not hard to see the difference.
nVidia released a rebranded Titan.
AMD is expected to release a brand new card.
efikkan: ...which will not even compete with this one.
This is simply fanboi-ism, we don't know yet, but the pricing that Huang has opted for hints at something rather close.
Posted on Reply
#131
newtekie1
Semi-Retired Folder
qubit: Ya. 11GB RAM just doesn't sit right though lol. Maybe I'll get one just for the perversity of it...
Honestly, neither does 12GB. I'm still stuck on 2, 4, 8, 16, 32. But even system RAM isn't sticking to that anymore... I'm just too old fashioned.:cry:
Posted on Reply
#132
qubit
Overclocked quantum bit
newtekie1: Honestly, neither does 12GB. I'm still stuck on 2, 4, 8, 16, 32. But even system RAM isn't sticking to that anymore... I'm just too old fashioned.:cry:
I know. When it's not a power of 2 it gets my OCD going like crazy, too. It creates such inefficiencies to make such designs. I know why they do it though, since chip sizes would otherwise grow exponentially which is unsustainable.
Posted on Reply
#133
EarthDog
Do tell why an 11GB setup is 'less efficient' than 12GB...
Posted on Reply
#134
dalekdukesboy
Air: The graph is perfect. RPM value is meaningless, what matters is noise levels and cooling performance, both shown on the graph. But, as I said before, is not an apples to apples comparison because of the difference in die area and outlet design. So I'm not buying the "better cooler" claim.
Later point you make is fine, but I call BS on "graph is perfect"...you can fool some of the techpowerup people all the time, but not most of them most of the time as Honest Abe said; if the graph were so "perfect" no one would question what it said or hardly anyone would and myself included at least questioned the format which frankly, was bad and not complete.
Posted on Reply
#135
dalekdukesboy
efikkan: I'm sorry, but that's incorrect.
Modern GPUs work by having multiple separate 32-bit memory controllers, each complete with their own ROPs. GTX 1080 Ti has one of these disabled, which is why it also has fewer ROPs. This is one of the nice modular features of modern GPUs.

When a cluster of cores wants to access a block of memory it addresses the respective memory controller.
What would you expect from someone who wants to stay in the EU? :)
Posted on Reply
#136
Air
dalekdukesboy: Later point you make is fine, but I call BS on "graph is perfect"...you can fool some of the techpowerup people all the time, but not most of them most of the time as Honest Abe said; if the graph were so "perfect" no one would question what it said or hardly anyone would and myself included at least questioned the format which frankly, was bad and not complete.
How is it incomplete? It has 4 noise and temperature data points, both correctly labeled, with the correct units and correctly plotted. It gets the point across that you can get 5 °C lower or 3,5 dB(A) quieter on the Ti at the same power. You could argue, maybe, that it has too much information for the average audience and should be simpler. Maybe with only a bar graph for a single data point.
Posted on Reply
#137
dalekdukesboy
Call it whatever you want, it showed causality of temps between the two cards fine but it's not the complexity of it, but more the left to right nature of how we read sentences is broken on this graph....it even has arrow pointing right to left. This may work for certain languages or people who go right to left but at least for English, Spanish, etc we are trained to go left to right.
Posted on Reply
#138
Air
dalekdukesboy: Call it whatever you want, it showed causality of temps between the two cards fine but it's not the complexity of it, but more the left to right nature of how we read sentences is broken on this graph....it even has arrow pointing right to left. This may work for certain languages or people who go right to left but at least for English, Spanish, etc we are trained to go left to right.
But the X axis values increase from left to right. Y axis values increase from bottom to top. That's the standard for graphs, independent of language. Pretty intuitive I would say. If they made the values for noise decrease from left to right, now THAT would be confusing. You can't expect all graphs to have ascending lines.
Posted on Reply
#139
dalekdukesboy
Ok, then why would you have a big obvious arrow going from the right to left to throw off the left to right continuity? No, but ascendant lines tend to work best, 2 of 3 graphs here are that way for example. I simply am pointing out graph is far from "perfect" and many people not "getting" it pretty much proves that.
Posted on Reply
#140
efikkan
GhostRyder: How so? While it's definitely a better deal, it's essentially a Titan XP without 1 GB of RAM and with better voltage control/boost clocks. That's not completely interesting even if it's a good price.
Well, for starters it's 35% better than GTX 1080, and secondly it's reducing the price of GTX 1080. Thirdly, it's roughly the same performance per dollar as GTX 1080 after the price adjustment, and it's amazing to see a high-end model retaining this awesome value while delivering the best performance. Fourth, it's the best high-end Ti model ever, much better than 980 Ti and 780 Ti. Fifth, great energy efficiency and hopefully some overclocking headroom.
Posted on Reply
#141
qubit
Overclocked quantum bit
EarthDog: Do tell why an 11GB setup is 'less efficient' than 12GB...
I didn't say that. It's less efficient than a power of 2 design. In this case, you'd need to have 16GB RAM for a "perfect" design. You always need to go to the next power of 2 up.

It'll be interesting to see if that 11GB RAM has a similar issue as the GTX 970 with that slow memory due to the cut down GPU. I suspect it won't though as NVIDIA have learned their lesson from that particular scandal.
Posted on Reply
#142
Air
dalekdukesboy: Ok, then why would you have a big obvious arrow going from the right to left to throw off the left to right continuity? No, but ascendant lines tend to work best, 2 of 3 graphs here are that way for example. I simply am pointing out graph is far from "perfect" and many people not "getting" it pretty much proves that.
The only flaw I could point out is that it does not state the ambient temperature (which I think can safely be assumed to be 25 °C).

Honestly I can't think of any other way to portray the same amount of information in a simpler way. You can't just make a graph have an ascending line, it's not something you can choose. Noise x temperature graphs for coolers will always have a descending line. Change it to fan speed x temperature and it will be similar.

Well I guess you could change it to fan noise x heat dissipation, at a fixed GPU temperature instead of power, which would result in ascending lines. But that's really not the usual information people look for when choosing coolers.

What's causing confusion is not the graph per se, but some lack of experience in interpreting graphs. If you take your time and observe the information it presents, it's pretty clear.
Posted on Reply
#143
efikkan
qubit: I didn't say that. It's less efficient than a power of 2 design. In this case, you'd need to have 16GB RAM for a "perfect" design. You always need to go to the next power of 2 up.
That's completely wrong. Each memory controller is still accessing a power of 2 amount of memory. Memory controllers in GPUs work independently. There is no scientific basis for claiming that the total number of resources has to be a power of 2. Just look at the core count in modern GPUs, almost none of them add up to a power of 2.
qubit: It'll be interesting to see if that 11GB RAM has a similar issue as the GTX 970 with that slow memory due to the cut down GPU. I suspect it won't though as NVIDIA have learned their lesson from that particular scandal.
That will never happen. You are conflating two unrelated design choices.
The "issue" with GTX 970 was that two 32-bit chips shared a single 32-bit bus, but with the first chip having priority resulting in "unreliable" memory performance. This is actually not new, GTX 660/660 Ti did a similar thing, but nobody complained then.
Posted on Reply
#144
GhostRyder
efikkan: Well, for starters it's 35% better than GTX 1080, and secondly it's reducing the price of GTX 1080. Thirdly, it's roughly the same performance per dollar as GTX 1080 after the price adjustment, and it's amazing to see a high-end model retaining this awesome value while delivering the best performance. Fourth, it's the best high-end Ti model ever, much better than 980 Ti and 780 Ti. Fifth, great energy efficiency and hopefully some overclocking headroom.
Yes, but it's still just a Titan XP at the end of the day with better voltage control, better stock clocks, and less RAM. Even though it's cheaper, it's just a more affordable Titan; that is good but not as exciting as a brand new never before seen chip. It would be interesting if say it had more cores unlocked but that is not the case. I am only interested in it for its price and if it can overclock further.
qubit: I didn't say that. It's less efficient than a power of 2 design. In this case, you'd need to have 16GB RAM for a "perfect" design. You always need to go to the next power of 2 up.

It'll be interesting to see if that 11GB RAM has a similar issue as the GTX 970 with that slow memory due to the cut down GPU. I suspect it won't though as NVIDIA have learned their lesson from that particular scandal.
My thoughts as well, I am considering trading my Titan XP in for a pair (Maybe just one) but only after reviews and some time has passed. I want to see if it has those issues as well and what other ones come out (Better VRM versions).
Posted on Reply
#145
EarthDog
qubit: I didn't say that. It's less efficient than a power of 2 design. In this case, you'd need to have 16GB RAM for a "perfect" design. You always need to go to the next power of 2 up.

It'll be interesting to see if that 11GB RAM has a similar issue as the GTX 970 with that slow memory due to the cut down GPU. I suspect it won't though as NVIDIA have learned their lesson from that particular scandal.
Ok... but why is a power of 2 more efficient? My apologies here for being dense...

Again, it shouldn't have that 970 issue. The back end ROPs (read: the math) seem to all jibe to me?
Posted on Reply
#146
efikkan
GhostRyder: While it's definitely a better deal, it's essentially a Titan XP without 1 GB of RAM and with better voltage control/boost clocks. That's not completely interesting even if it's a good price.
GhostRyder: Yes, but it's still just a Titan XP at the end of the day with better voltage control, better stock clocks, and less RAM. Even though it's cheaper, it's just a more affordable Titan; that is good but not as exciting as a brand new never before seen chip. It would be interesting if say it had more cores unlocked but that is not the case.
Why isn't it interesting that you can get the high-end consumer card for 58% of the price of the professional card?
Why isn't GTX 1080 Ti interesting when it reduces the prices of the remaining lineup as well?
Anyone interested in buying a decent card soon should be cheering, it's in fact the biggest news of the year.
EarthDog: Ok... but why is a power of 2 more efficient? My apologies here for being dense...
Power of 2 matters for certain things when it comes to building integrated circuits. Allocations in system memory, allocations in GPU memory, sizes of sectors on SSDs/HDDs, etc. are all powers of 2 because it decreases the complexity of the integrated circuits.

Let me create a small example:

You have 4 memory modules of 16kB (16384 bytes)
Now, let's look at the address space in binary:
Module 0  First address: 0000 0000 0000 0000
          Last address:  0011 1111 1111 1111

Module 1  First address: 0100 0000 0000 0000
          Last address:  0111 1111 1111 1111

Module 2  First address: 1000 0000 0000 0000
          Last address:  1011 1111 1111 1111

Module 3  First address: 1100 0000 0000 0000
          Last address:  1111 1111 1111 1111
Do you see any pattern?

You can use the first two bits to check which memory module the address belongs to, so the memory controller just needs a few transistors to calculate this, instead of some complex transformation of the address. The remaining 14 bits become the internal address space of a module. Then the module does the same thing to find out which chip ("memory bank") the address belongs to.
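
A minimal sketch of that decode for the four 16 kB modules above. The names are made up for illustration; real hardware does this with wiring rather than code, but the shift and mask show how little work it is:

```python
OFFSET_BITS = 14                       # 16 kB per module = 2**14 bytes
OFFSET_MASK = (1 << OFFSET_BITS) - 1   # low 14 bits set


def decode(address):
    """Split a 16-bit address into (module index, offset inside module)."""
    module = address >> OFFSET_BITS    # the first two bits
    offset = address & OFFSET_MASK     # the remaining 14 bits
    return module, offset


for address in (0x0000, 0x3FFF, 0x4000, 0xFFFF):
    module, offset = decode(address)
    print(f"{address:016b} -> module {module}, offset {offset}")
```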

Now let's compare this with an address space that's power of 10 instead, four modules of 10,000 bytes:
Module 0  First address: 0000 0000 0000 0000
          Last address:  0010 0111 0000 1111

Module 1  First address: 0010 0111 0001 0000
          Last address:  0100 1110 0001 1111

Module 2  First address: 0100 1110 0010 0000
          Last address:  0111 0101 0010 1111

Module 3  First address: 0111 0101 0011 0000
          Last address:  1001 1100 0011 1111
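Even though an address space that's a power of 10 is much simpler for us humans, it obviously becomes much harder for computers: the module boundaries no longer line up with bit positions, so the leading bits alone don't identify the module.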
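
To make the contrast concrete, here is a small sketch for the four 10,000-byte modules. The helper names are hypothetical; the point is only that finding the module now takes a real division, and that looking at the top two bits gives the wrong answer:

```python
MODULE_SIZE = 10_000  # bytes per module, as in the listing above


def decode_decimal(address):
    """Return (module index, offset); this needs a division, not a bit slice."""
    return divmod(address, MODULE_SIZE)


def top_two_bits(address):
    """Top two bits of a 16-bit address."""
    return address >> 14


# Both addresses belong to module 1, yet their top two bits differ.
for address in (10_000, 19_999):
    module, offset = decode_decimal(address)
    print(f"{address}: module {module}, offset {offset}, top bits {top_two_bits(address):02b}")
```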

-----

So back to the question at hand, does it matter that GTX 1080 Ti has a total memory bus width of 352 bits? No, as I've said numerous times already, it has 11 separate 32-bit controllers, each accessing a power of 2 address space, adding up to a continuous address space without any kind of performance penalty. So it's not any kind of problem with 352-bits total, and if you know math you'll know that even 384-bit is not a power of 2!
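
A quick way to check that arithmetic, as a sketch. It assumes the 11 GB is spread evenly as 1 GiB behind each controller, which is a guess based on the 11 chips mentioned in the article; the names are illustrative:

```python
def is_power_of_two(n):
    """True if n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


CONTROLLER_WIDTH_BITS = 32
PER_CONTROLLER_BYTES = 2**30  # assumption: 1 GiB behind each controller

print("per controller:", is_power_of_two(CONTROLLER_WIDTH_BITS),
      is_power_of_two(PER_CONTROLLER_BYTES))

for controllers in (11, 12):  # GTX 1080 Ti and TITAN X Pascal
    total_width = controllers * CONTROLLER_WIDTH_BITS
    print(f"{controllers} controllers: {total_width}-bit total,",
          "power of 2:", is_power_of_two(total_width))
```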

An analogy: your hard drive consists of sectors, where every single one is a power of 2 in size, but the total count never is.

Edit:
Memory controllers for GPUs are in fact even simpler than CPU memory controllers. Not only is the size of allocations a power of 2, but when allocating buffers for textures etc. each dimension has to be a power of 2. If you create a texture of 144×129, your API will pad it to 256×256.
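
The rounding itself can be sketched like this. Whether a given API actually pads textures this way is the claim above, not something this snippet verifies; the function name is made up:

```python
def next_power_of_two(n):
    """Smallest power of 2 that is greater than or equal to n (n >= 1)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


width, height = 144, 129
padded = (next_power_of_two(width), next_power_of_two(height))
print(f"{width}x{height} -> {padded[0]}x{padded[1]}")
```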
Posted on Reply
#148
GhostRyder
efikkan: Why isn't it interesting that you can get the high-end consumer card for 58% of the price of the professional card?
Why isn't GTX 1080 Ti interesting when it reduces the prices of the remaining lineup as well?
Anyone interested in buying a decent card soon should be cheering, it's in fact the biggest news of the year.
Because it's something we can already expect and see out in the open. It's good that it reduces prices across the board, however the card itself is not that interesting. That does not mean it won't sell well or that it's not a good card, it means it's boring because we already know its basics and where it's going to sit on the performance chart. The most interesting part is how it's going to react to having higher voltage control because that will result in better clocks over the Titan XP. The price is not interesting, just great news because it's more affordable to the masses. I am still interested in it, but it's not a mystery for the most part.
Posted on Reply
#149
qubit
Overclocked quantum bit
EarthDog: Ok... but why is a power of 2 more efficient? My apologies here for being dense...

Again, it shouldn't have that 970 issue. The back end ROPs (read: the math) seem to all jibe to me?
Basically, it's all to do with addressing and building the infrastructure for it inside the chip. I'm going to assume that you're familiar with base 2 (binary) and number bases in general here.

To make for a really simple example, imagine that you have a memory chip with just 4 locations. These will take 2 bits to address, ie a 2-bit address bus. The value of the bottom (first) address will be zero (00 binary) and the last (top) address 3 (11 binary).

Now imagine a lopsided memory chip with just 3 locations. You will still need to build the infrastructure for 4 addresses into the chip, since the top bit is still being set, ie value 2 (10 binary) with the top address of 3 (11 binary) pointing nowhere and likely having to be masked off to avoid a crash. Hence the chip will still take the same number of transistors as if it had 4 locations, but not actually have that extra location in it and therefore the chip will not be an optimal design. Of course, what you get back is that the extra circuitry for the 4th location is missing, saving space, hence making for a compromise.

You have a similar situation regardless of what you're addressing, whether it's CUDA units and the number of bits they each handle in a GPU, or the number of CUDA units in the GPU, or whatever aspect of a digital circuit.

The problem in the real world of course, is that building a perfect power of 2 chip causes the number of transistors and physical size of that chip to double each time it's expanded, ie to grow exponentially which is unsustainable.

When you get to the large sizes of modern GPUs with their billions of transistors, it would tend to quickly outgrow the manufacturing capabilities of current technology. Or if not for a particular design, it would just be excessively large, such as being, for example, 40 millimeters on a side which is impractical for a commercial product that's supposed to make a profit.

No doubt it would also use a tremendous amount of power and emit a correspondingly tremendous amount of heat, making things difficult. Therefore, we see the lopsided GPUs of today to avoid this fate, or at least reduce its impact. Think of the GTX 480 and the tremendous amount of power and heat it used, despite being such a lopsided design. It's a shame and I really don't like this lopsidedness, but there's no choice for a real world GPU.

If you're curious, check out the designs of older entry level GPUs, where you'll see that quite often everything is a perfect power of 2, eg data bus, CUDA cores etc, since it's practical to do so at the smaller sizes.

The 970 memory issue came about, because NVIDIA nibbled a bit off the GPU, giving rise to a compartmentalized memory addressing design, where they chose to use slow RAM for that last 500MB, but didn't declare it, leading to this scandal.

When I saw the 1080 Ti with its weird 11GB RAM and crippled GPU, it reminded me that NVIDIA could potentially have the same design issue. However, it all really depends on the details of the design whether this happens or not and we'll soon know once the official reviews are out. I doubt they'd repeat the same mistake, especially on their flagship product.

@efikkan back there thinks I'm "completely wrong" about a power of 2 chip being optimal, but I'm not, as I've explained above. He just didn't quite understand what I was saying.

Oh and you asked for it - check my sig! :p
Posted on Reply
#150
efikkan
qubit: Now imagine a lopsided memory chip with just 3 locations. You will still need to build the infrastructure for 4 addresses into the chip, since the top bit is still being set, ie value 2 (10 binary) with the top address of 3 (11 binary) pointing nowhere and likely having to be masked off to avoid a crash. Hence the chip will still take the same number of transistors as if it had 4 locations, but not actually have that extra location in it and therefore the chip will not be an optimal design. Of course, what you get back is that the extra circuitry for the 4th location is missing, saving space, hence making for a compromise.
No memory controller works the way you describe.
For starters, the memory controllers on GPUs all have a power of 2 address space as I've said a number of times already, how hard is this to understand?
But for the hypothetical scenario where 3 out of 4 memory slots are occupied, the memory controller will never check if a memory address is inside the range on read/write, that would be too costly anyway. The whole "problem" is solved on allocation of memory (which is costly anyway, and done very rarely compared to read/write), and the only thing to check then is whether the memory address is above the maximum size, so the problem you describe doesn't exist.

Just to illustrate how wrong you are, I checked two of the machines I'm running here;
i7-3930K, 46-bit controller, 65536 GB (64 TB) theoretical physical address space, but the CPU is "limited" to 64 GB.
i5-4690K, 39-bit controller, 512 GB theoretical physical address space, but the CPU is "limited" to 32 GB.
(This is fetched directly from the CPU's cpuid instruction, so it's what the OS sees and is guaranteed to be correct)
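
Those sizes follow directly from the address widths. A sketch of the conversion only; it does not query cpuid, and the names are illustrative:

```python
GIB = 2**30
TIB = 2**40


def address_space_bytes(address_bits):
    """Number of bytes addressable with the given number of address bits."""
    return 2**address_bits


for bits in (46, 39):  # the two controller widths quoted above
    size = address_space_bytes(bits)
    print(f"{bits}-bit: {size // GIB} GB ({size / TIB:g} TB)")
```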
qubit: You have a similar situation regardless of what you're addressing, whether it's CUDA units and the number of bits they each handle in a GPU, or the number of CUDA units in the GPU, or whatever aspect of a digital circuit.
Number of bits of what? Data bus? Memory bus? Register width?
If you look at GPU architectures you'll see that most of them don't have a core count which adds up to a power of 2, like 256, 512, 1024, 2048, 4096, etc. Just scroll through here and here, that power of 2 is more the exception than the rule.
qubit: The problem in the real world of course, is that building a perfect power of 2 chip causes the number of transistors and physical size of that chip to double each time it's expanded, ie to grow exponentially which is unsustainable.
The term "a perfect power of 2 chip" doesn't make any sense.
qubit: The 970 memory issue came about, because NVIDIA nibbled a bit off the GPU, giving rise to a compartmentalized memory addressing design, where they chose to use slow RAM for that last 500MB, but didn't declare it, leading to this scandal.

When I saw that the 1080 Ti with its weird 11GB RAM and crippled GPU, it brought back to me that NVIDIA could potentially have the same design issue…
The GTX 970 "issue" was that two memory chips shared one 32-bit controller, while the others didn't, creating an address space where some of it was slower without the allocator taking this into account. Power of 2 had absolutely nothing to do with it.

If GTX 1080 Ti were to do the same thing it would have to do 12 memory chips on 11 controllers, which we know it doesn't, so we know it can't happen. If you still think it's a problem, then you're having a problem understanding how processors and memory work.
qubit: Oh and you asked for it - check my sig! :p
Your avatar is cool though.
Posted on Reply