
HD 5870 Discussion thread.

Discussion in 'AMD / ATI' started by a_ump, Oct 25, 2009.

Thread Status:
Not open for further replies.
  1. HalfAHertz

    HalfAHertz

    Joined:
    May 4, 2009
    Messages:
    1,893 (0.95/day)
    Thanks Received:
    380
    Location:
    Singapore
I don't say you're wrong, I just partially disagree with your stance :)

Memory bandwidth is very important, however a smart and efficient design is much more important in my opinion, and this is exactly what ATI was aiming for. They tried to increase computational throughput while staying within a certain size, power consumption and cost envelope.

I do believe all the woes with the 5870 are mainly driver/software related. Why else would a pair of HD 5770s, which theoretically provide the same computational power, constantly outperform the single 5870? They use the same architecture and offer similar bandwidth (2x76 GB/s), don't they?

Edit:

I based my assumptions on the Guru3D tests found at http://www.guru3d.com/article/radeon-hd-5770-review-test/16 . The 5770 pair's advantage varies, sometimes going up to 25%, but on average they are about 7% faster. When you consider that we don't get perfect 100% scaling in CrossFire, that is indeed a noticeable difference, isn't it?
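
To put rough numbers on that comparison (a quick back-of-the-envelope sketch; only the ~7% average comes from the Guru3D page, the ~80% CrossFire scaling figure is just an assumption):

```python
# Back-of-the-envelope check of the "two HD 5770s beat one HD 5870" argument.
# Assumed number: CrossFire scaling of roughly 80% for the second card is a
# guess, not a measured value; the 7% average is from the Guru3D review.

pair_vs_5870 = 1.07          # pair of HD 5770s ~7% faster than one HD 5870
cf_scaling   = 1.80          # assumed: two cards give ~1.8x one card's performance

single_5770 = pair_vs_5870 / cf_scaling     # one HD 5770 relative to the HD 5870
ideal_5870  = 2 * single_5770               # an HD 5870 has exactly 2x the 5770's units

print(f"One HD 5770 ~= {single_5770:.2f}x of an HD 5870")
print(f"A 'perfectly scaled' HD 5870 would be ~{ideal_5870:.2f}x of the real one")
# With these assumptions the real HD 5870 leaves roughly 15-20% on the table,
# which is about the size of the gap being blamed on drivers/software here.
```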
     
    Last edited: Nov 5, 2009
  2. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.44/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
Because of my theory*. SPs, TMUs, ROPs and the MC get all the attention, but a GPU is much more than just that, and the rest is just as important, if not more so.

I'm not saying that I'm right and the rest are wrong, but I think we should take that into account too. IMO it can't be memory bandwidth; we could add a maximum 5% performance increase from memory bandwidth, IMO. It can't be drivers either, not drivers alone: it would be the first time that drivers made such a difference. IMO we can't attribute more than 20% to the software side, based on previous examples. There's another 25% left in order to reach the magical 2x increase over the HD 4890, and IMO it's attributable to inefficiencies in an aging architecture that Ati themselves are abandoning (the next Ati chip will be a complete redesign).

IMO what was designed to work on 320 SPs or 4 clusters can't still be just as efficient at 1600 SPs / 20 clusters. I see some sense in that, because HD 58xx is the only GPU architecture where the number of clusters exceeds the number of shader processors per cluster; the balance is absolutely different from that in RV670, and if one was balanced the other can't be very balanced, IMO. But that is just what I think.

* Two HD 5770s have twice as many schedulers as a single HD 5870 for the same number of SPs. They also probably have the same internal crossbar communication as their bigger brother: the same internal bandwidth, etc. Each of them, I mean.
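
To make the arithmetic behind that 5% / 20% / 25% split explicit (a rough illustrative sketch; the ~1.5x observed gain over the HD 4890 is an assumed ballpark for illustration, not a figure from this thread):

```python
# Rough sketch of the proposed breakdown of the HD 5870's "missing" performance
# versus a theoretical 2x over the HD 4890. The 1.5x observed figure below is
# an assumption for illustration; the 5/20/25 split is the opinion stated above.

theoretical = 2.00   # 2x the HD 4890's execution resources
observed    = 1.50   # assumed typical gen-to-gen gain seen in reviews

attributed = {
    "memory bandwidth":          0.05,
    "drivers/software":          0.20,
    "architecture inefficiency": 0.25,
}

print(f"Gap to the theoretical 2x: {theoretical - observed:.2f}x of the HD 4890 baseline")
for cause, share in attributed.items():
    print(f"  {cause}: ~{share:.0%}")
print(f"Sum of the attributed factors: {sum(attributed.values()):.0%}")
```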
     
    Last edited: Nov 5, 2009
  3. Lionheart

    Lionheart

    Joined:
    Apr 30, 2008
    Messages:
    4,063 (1.72/day)
    Thanks Received:
    811
    Location:
    Milky Way Galaxy
I too think the memory interface should be bumped up a notch. I mean, come on, we're still using a 256-bit memory interface, and yes, I know it's GDDR5, but still, how long have we been using 256-bit for now? Last time I checked it goes back to 2003 or 2004 or something like that!
     
  4. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.44/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
We could see some improvements from a wider memory bus, but IMO only marginal ones. As we move to higher bandwidths we move into diminishing returns. Kind of like this one:

[Chart: Call of Duty 4 performance vs. PCI-Express bandwidth, from the HD 5870 scaling review]

    That's from: http://www.techpowerup.com/reviews/AMD/HD_5870_PCI-Express_Scaling/25.html

Note that they are comparing everything from x1 to x16 and that COD4 is the one that is most affected. The average is this one:

[Chart: average performance across all tested games vs. PCI-Express bandwidth]

1/16 of the bandwidth gives 50-75% of the performance, and going from x4 to x16 only affects 5-10% of the performance. Unless Ati is completely incompetent, which it is not, the memory bandwidth must already be near the sweet spot, maybe like x4 in these charts. If they moved to 512 bits they would gain 5%, and another 5% going to 1024 bits; is it worth the effort? Certainly not.
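
Here's a toy way to see why extra bandwidth hits diminishing returns (purely a sketch; the 10% "memory-bound" share is an assumption picked so the output roughly matches the ~5% gains mentioned above, not a measured figure):

```python
# Crude roofline-style model: frame time = compute-bound part + memory-bound
# part, where only the memory-bound part shrinks when bandwidth goes up.

def relative_perf(bandwidth_ratio, memory_bound_share=0.10):
    """Performance relative to baseline when bandwidth scales by bandwidth_ratio.
    memory_bound_share is the assumed fraction of frame time spent waiting on
    memory at the baseline bandwidth."""
    compute = 1.0 - memory_bound_share
    memory = memory_bound_share / bandwidth_ratio
    return 1.0 / (compute + memory)

for label, ratio in [("256-bit (baseline)", 1.0), ("512-bit", 2.0), ("1024-bit", 4.0)]:
    print(f"{label:18} -> {relative_perf(ratio):.2f}x")
# With a ~10% memory-bound share, doubling the bus gains ~5% and doubling it
# again gains only a few percent more -- the diminishing returns argued above.
```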
     
    wolf says thanks.
  5. Lionheart

    Lionheart

    Joined:
    Apr 30, 2008
    Messages:
    4,063 (1.72/day)
    Thanks Received:
    811
    Location:
    Milky Way Galaxy
I guess you're right; the HD 2900 XTs certainly didn't use all of their memory bandwidth, only in some games.
     
  6. a_ump

    a_ump

    Joined:
    Nov 21, 2007
    Messages:
    3,612 (1.43/day)
    Thanks Received:
    376
    Location:
    Smithfield, WV
Where has it been stated that ATI is doing a completely new chip next gen? I think it's a great idea for ATI to do, but I'm just curious where it was said.
     
  7. wolf

    wolf Performance Enthusiast

    Joined:
    May 7, 2007
    Messages:
    5,543 (2.03/day)
    Thanks Received:
    842
Big +1 on that one. They've done 512-bit before, and they didn't do it this time for a reason; I don't think performance is really that reason. Maybe cost and/or timing; after all, beating NV to the punch by a good few months is going to do them a whole lot of good.
     
  8. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.44/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
Uh, I think that I've seen it in many places. TBH, now that I think about it, I'm not sure; it's the kind of thing you simply believe when you read it more than once, and I never really questioned it. You know it's going to happen sooner or later, and the timing is just perfect for them to do it anyway.

    Apparently all the sites where I saw that info were citing Fudzilla, so here are the articles in question:

    http://www.fudzilla.com/content/view/15891/1/
    http://www.fudzilla.com/content/view/15918/1/

OK, so it's Fudzilla, but they have put some pretty detailed naming info in there for it to be made up, IMO.
     
  9. Binge

    Binge Overclocking Surrealism

    Joined:
    Sep 15, 2008
    Messages:
    6,982 (3.13/day)
    Thanks Received:
    1,752
    Location:
    PA, USA
They also just published today that ATI's next card is 28nm. They're skipping 32nm altogether.

    Source: http://www.fudzilla.com/content/view/16299/34/
     
  10. grimeleven New Member

    Joined:
    Oct 10, 2009
    Messages:
    19 (0.01/day)
    Thanks Received:
    8
  11. Steevo

    Steevo

    Joined:
    Nov 4, 2005
    Messages:
    8,361 (2.55/day)
    Thanks Received:
    1,217
I am starting to agree with you, Bene; it seems the triangle setup engine is the limiting factor. Perhaps ATI thinks that, with tessellation hardware, the number of raw, predefined triangles that need to be drawn/pushed from the game thread is not going to increase in future games, and that the lower framerate will not really be affected by the use of tessellation and the new DX11 features. 90 FPS at a large format with tessellation and other advanced features can deliver a stunning visual experience with very little added load on the new hardware.
     
    10 Million points folded for TPU
  12. HalfAHertz

    HalfAHertz

    Joined:
    May 4, 2009
    Messages:
    1,893 (0.95/day)
    Thanks Received:
    380
    Location:
    Singapore
You do have a point there. The thread dispatcher is a simple ASIC, so it shouldn't be too difficult to beef it up; the question is whether they increased its output accordingly... It is a superscalar architecture after all, and even if it never reaches perfect scaling, you should come pretty close if all the necessary components are increased.
     
  13. kid41212003

    kid41212003

    Joined:
    Jul 2, 2008
    Messages:
    3,584 (1.55/day)
    Thanks Received:
    533
    Location:
    California
I read somewhere (btaur's post?) in this forum that the next node for GPUs is 28nm, not 32nm, because of the different process technology between GPUs and CPUs. I might be remembering it wrong, though...
     
  14. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.44/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
Yeah, that's true. I'm not very sure about the dates, though. IMO either the next Ati chip is not 28nm or it won't be released in 2010.
     
  15. Bo_Fox

    Bo_Fox New Member

    Joined:
    May 29, 2009
    Messages:
    480 (0.24/day)
    Thanks Received:
    57
    Location:
    Barack Hussein Obama-Biden's Nation
Usually, with each generation change (a true generation change, where the GPU core is generally 2x as fast in theory), there is more than a 50% increase in memory bandwidth.

Going from a 7900 GTX to an 8800 GTX, there was nearly a 75% increase in memory bandwidth.

Going from an X1950 XTX to an HD 2900 XT, there was exactly a 100% increase in memory bandwidth, but as a couple of guys here pointed out, a lot also comes down to the efficiency of the driver algorithms--in this case, the 2900 XT sucked when it came to AA because it was executed on the shader units. Both a 2900 XT and a 3870 actually performed worse than an X1900 XTX in some games when FSAA was being used.

Anyway, back to the point: we nearly always benefit from more memory bandwidth.

If a 9800 GTX had had the same 768MB of 384-bit memory as an 8800 Ultra, it would have beaten it badly across all resolutions and modes. Instead, a 9800 GTX with twice the TMUs, higher theoretical fillrate and GFLOPS, more transistors, and higher core and shader clocks still lost to an 8800 Ultra (and I would not blame its 16 ROPs versus 24 as much as the memory bandwidth, since a 4890 still did fine with 16 ROPs).

    There goes...
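
For reference, the bandwidth numbers behind those comparisons follow directly from bus width times effective data rate (the card specs below are quoted from memory purely as an illustration, so double-check them):

```python
# Memory bandwidth = (bus width / 8 bytes) * effective data rate.
# Card specs below are quoted from memory for illustration; verify against reviews.

def bandwidth_gbs(bus_bits, effective_mts):
    """Bus width in bits and effective memory data rate in MT/s -> GB/s."""
    return bus_bits / 8 * effective_mts / 1000

cards = {
    "7900 GTX (256-bit GDDR3)": (256, 1600),
    "8800 GTX (384-bit GDDR3)": (384, 1800),
    "HD 4890  (256-bit GDDR5)": (256, 3900),
    "HD 5870  (256-bit GDDR5)": (256, 4800),
}

for name, (bits, mts) in cards.items():
    print(f"{name}: {bandwidth_gbs(bits, mts):.1f} GB/s")

# 7900 GTX -> 8800 GTX is roughly a 69% jump, while HD 4890 -> HD 5870 is only
# about 23%, which is the kind of gen-to-gen shortfall being argued about here.
```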
     
  16. kid41212003

    kid41212003

    Joined:
    Jul 2, 2008
    Messages:
    3,584 (1.55/day)
    Thanks Received:
    533
    Location:
    California
Summary:
Rocket scientists (more than one) designed the HD 5000 series.
The current performance of the HD 5870 is good enough. (Is there any single-GPU card that's faster than the 5870? No.)
Memory limited or not, it doesn't matter; they made it that way.

Future:
The HD 5890 will have a wider/faster bus as an upgraded HD 5870, followed by a dual-GPU card to compete with NVIDIA's next gen.
     
  17. Binge

    Binge Overclocking Surrealism

    Joined:
    Sep 15, 2008
    Messages:
    6,982 (3.13/day)
    Thanks Received:
    1,752
    Location:
    PA, USA
I must protest: if they are starting right now, then it means they have a shot. This is the kind of foresight and decision making that probably kept NV from moving forward. I don't know that NV was held back by taking their time, but if I had to guess, I'd say they waited too long to get back to designing the next step.
     
  18. Bo_Fox

    Bo_Fox New Member

    Joined:
    May 29, 2009
    Messages:
    480 (0.24/day)
    Thanks Received:
    57
    Location:
    Barack Hussein Obama-Biden's Nation
Yes, a 5870 is currently the fastest GPU, but its full potential is not being unleashed, just like the 5770's.

A 5770 is still a "downgrade" compared to a 4890 despite the additional DX11 features and slight architecture optimizations, simply because of its 2.4Gbps bandwidth instead of a 3.9Gbps one. Just look at TPU's benchmarks that I copy-and-pasted a few posts ago. It does not take a rocket scientist to figure this out.
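
Applying the same bus-width times data-rate arithmetic to the 5770 vs 4890 point (specs quoted from memory, and the per-pin rates differ a bit from the figures quoted above, but the relative gap is the point):

```python
# Memory bandwidth = (bus width / 8) * effective data rate.
# Card specs quoted from memory for illustration; verify against the reviews.

def bandwidth_gbs(bus_bits, effective_mts):
    return bus_bits / 8 * effective_mts / 1000   # GB/s

hd5770 = bandwidth_gbs(128, 4800)   # 128-bit GDDR5
hd4890 = bandwidth_gbs(256, 3900)   # 256-bit GDDR5
print(f"HD 5770: {hd5770:.1f} GB/s, HD 4890: {hd4890:.1f} GB/s "
      f"(about {1 - hd5770 / hd4890:.0%} less for the 5770)")
```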
     
  19. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.44/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
My comment was more about TSMC/GlobalFoundries than AMD. If they say they will have the process ready for Q4 2010, I highly doubt they will be even remotely prepared for launch production. It's been what, 9-12 months since 40nm was "ready for production"? It's not only a problem at TSMC; I think most foundries have been having delays with their processes, including Intel AFAIK. So I don't think it's an isolated issue; IMO it's like CPU/GPU design time, it has simply gone up. We could expect them to do better next time, but IMO it's safer to assume they will not do much better. 1-3 months of delay and you are already in 2011, depending on what Q4 really means. 3-6 months and you are closer to Q3 2011, and if it's just as disastrous as 40nm, you are almost in 2012.
     
    Binge says thanks.
  20. Binge

    Binge Overclocking Surrealism

    Joined:
    Sep 15, 2008
    Messages:
    6,982 (3.13/day)
    Thanks Received:
    1,752
    Location:
    PA, USA
I understand and agree. Some food for thought: it's easier to take a design for a smaller process and just scale it up for a larger process than the reverse. Thanks for what you've contributed to this so far, Benet.
     
  21. bobzilla2009 New Member

    Joined:
    Oct 7, 2009
    Messages:
    455 (0.25/day)
    Thanks Received:
    39
I would imagine 28nm will just be a pain to move to, TBH. They talk about it trivially to the press, like it's a simple step, but the fact is a 28nm half-pitch is only about twice the absolute minimum resolution of current immersion photolithography techniques [maybe even less, although I'm pretty sure immersion lithography is pretty much limited to around 20nm half-pitch at any decent yield rate (the absolute resolution limit is about 16nm using pure water, I believe, but that would be horrible for yield rates); double patterning gets us down to 16nm, and then that's pretty much the end of normal CMOS].

This is where the bigger errors start to come into play: when you get this close to the limits of the technology, the slightest mistakes are disastrous, since reducing the distance between nodes, even only slightly, will drastically increase the probability of current tunnelling through the chip on its own merry way across the circuit. So it won't surprise me in the slightest if 28nm arrives later than expected, or is hugely inefficient with regard to yield when it does. However, the next few years will be a fantastic time for computing in general, since we will be moving from the CMOS setup that has governed how we make computer chips for the last 30 years or so to more exciting possibilities :)
     
    Last edited: Nov 6, 2009
    Benetanegia and Bo_Fox say thanks.
  22. Bo_Fox

    Bo_Fox New Member

    Joined:
    May 29, 2009
    Messages:
    480 (0.24/day)
    Thanks Received:
    57
    Location:
    Barack Hussein Obama-Biden's Nation
    Here's to hoping that the Foundry will start churning out 28nm stuff ASAP.

If I were ATI, I'd go ahead and work on 32nm designs. I do not think it's such a good idea to just skip 32nm and wait for 28nm, which might be plagued with the same delays as 40nm. If Nvidia had skipped 65nm, it would have been several months behind in bringing out the GTX 280.

Well, if ATI does skip 32nm, then there would be mounting pressure to do a more powerful revision of R800 (5870) with a 512-bit bus to try to counter the GT300. I actually think the GT300 will have a hard time beating a 512-bit 5890 (with at least a 100MHz clock increase over a 5870), which would buy ATI some time while it skips 32nm.
     
  23. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.44/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
    Me too.

AFAIK 32nm is SOI, while it's bulk silicon that they use for graphics cards. I don't know why, though; I only know that's how it is.

I don't think so myself. I don't think they will release a 512-bit card; it wouldn't make a big difference, and I don't think they would be faster than the GT300 anyway after seeing how the HD 5870 performs. But that's just me expecting a real gen-to-gen improvement in the GPU arena. I'm expecting the GT300 to be as fast as GTX 280 or GTX 285 SLI +/- 10%, which is much faster than a GTX 295, which is faster than the HD 5870. I don't see why the GT300, having more than twice the raw power, couldn't be as fast as 2x GTX 285. Nvidia cards have scaled nicely in the past. The GTX 280 was as fast as the 9800 GX2 once the first driver problems were solved. It's not as fast as 9800 GTX SLI, true, but the 9800 GTX runs at 738 MHz while both the 9800 GX2 and GTX 280 run at 600 MHz. Unless the GT300 runs at very low frequencies (<500 MHz), it shouldn't have any problem being fast enough that a 100 MHz bump on Ati's side would be a problem for them. But again, that's just me expecting a real gen-to-gen improvement in the GPU arena. If it has to come from just one of the brands, so be it; that's part of being a tech junkie and enthusiast, IMO. If that makes prices too high, I'd not buy it, or I'd buy the cheaper Ati card, but I just want such a device to exist, just like I want the Bugatti Veyron or an F1 car to exist. I think the market would put Nvidia cards at their just price anyway, as it has done until now.
     
  24. wolf

    wolf Performance Enthusiast

    Joined:
    May 7, 2007
    Messages:
    5,543 (2.03/day)
    Thanks Received:
    842
Actually, the 9800 GTX was 675 MHz; the GTX+ and GTS 250 are 738. But your point is well made. Also, per GPU the 9800s pack more texturing ability, so two of them have more than a single GTX 280/285; there are advantages to be had with two 9800 GTXs over a single GTX 280/285, as well as obvious disadvantages.

I also am swaying toward thinking the 5870 wouldn't have much to gain from a 512-bit bus. IMO they will need a dual GPU to beat or compete well with Fermi, as has been the case in the past, but hey, that's my speculation for the time being :rolleyes:
     
  25. Benetanegia

    Benetanegia New Member

    Joined:
    Sep 11, 2009
    Messages:
    2,683 (1.44/day)
    Thanks Received:
    694
    Location:
    Reaching your left retina.
True, I had forgotten that the GTX ever existed. My brother has a 9800 GTX+, but we always just call it the GTX, that's why.
     
    wolf says thanks.
