
Official AMD Radeon 6000 Series Discussion Thread

Joined
Nov 4, 2005
Messages
11,689 (1.73/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs and over 10TB spinning
Display(s) 56" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
You should read better into that article. In Metro 2033 all 6 cores of the test bed are being maxed out, and the CPU is clocked at 4 GHz. It's still far behind the GPU, and Metro does not have a lot of physics going on anyway. Definitely not compared to the level of physics that Cadaveca is talking about, which I guess is the same thing I want. If you added SSE2 you'd get 10-20% more performance, which would change nothing.

Physics is better run on the GPU because that kind of task runs much better on a lot of cores with superfast local memory, and that's what a GPU is. You say you want your 6/12/24 cores being used, but as I see it that's your problem; maybe you should start thinking about spending less on the CPU and more on the GPU, as that is the future. Future battles will occur between integrated GPUs (Fusion and Fusion-like architectures) and discrete GPUs, not GPU vs CPU, because the CPU as we know it will move to a secondary/auxiliary role. There are very few tasks that require big conventional (multi-core) CPUs even today that cannot be done much better on the GPU...

Bottom line is, we need fast CPUs, but current CPUs are not fast per se; the actual cores are only barely faster than 2005 CPUs were, we just have 4/6/12 of them crammed together. That's not fast, that's parallel, and if a task is parallel enough to max out a current 6-core CPU, it's most probably suitable for GPU computing too, and 90% of the time the GPU will be orders of magnitude faster, more power efficient and significantly cheaper to produce.

http://www.realworldtech.com/page.cfm?ArticleID=RWT070510142143&p=2


cough
 
Joined
Jun 13, 2009
Messages
1,539 (0.28/day)
Location
Canada/Québec/Montreal
System Name Main PC
Processor PII 925 x4 @3.724GHz (266x14) 1.525v NB 2660 1.425v
Motherboard Gigabyte AM3 GA-890XA-UD3 (790x+SB850)
Cooling Scythe Mugen 2 rev.B
Memory Hyperx 8GB (2x4) 1600@1418 8-7-7-20-27-1t
Video Card(s) GTX 680
Storage 256GB SSD / 2TB HDD
Display(s) LCD Samsung 24" 16:9
Case Cooler Master HAF 912
Audio Device(s) On-Board HD
Power Supply CM 750w GX |3.3v@25a|5v@25a|12v@60a
Software Kubuntu dual boot /Windows 7 Ultimate 64bit
Benchmark Scores later...
You can't compare a dual-GPU card with a single-GPU one.

I know just saying...

And rumor has it Nvidia is preparing a dual-GPU card based on the GTX 570.

It will be cool to see that against AMD's 6990. I think we'll have a good year (2011) for competition, bring it on!:)
Things are so boring without competition...:mad:

Time for AMD to get a good CPU to compete against Intel; let's hope that Bulldozer will be the one?
 

bear jesus

New Member
Joined
Aug 12, 2010
Messages
1,534 (0.31/day)
Location
Britland
System Name Gaming temp// HTPC
Processor AMD A6 5400k // A4 5300
Motherboard ASRock FM2A75 PRO4// ASRock FM2A55M-DGS
Cooling Xigmatek HDT-D1284 // stock phenom II HSF
Memory 4GB 1600mhz corsair vengeance // 4GB 1600mhz corsair vengeance low profile
Storage 64gb sandisk pulse SSD and 500gb HDD // 500gb HDD
Display(s) acer 22" 1680x1050
Power Supply Seasonic G-450 // Corsair CXM 430W
In performance?
I think not. What I think is that the 6970 will be more or less equal to the 580, and most likely cheaper to buy.

I really hope you are right. I want to go back to a single card but with 3 monitors, so my only real choice is the 6970; I'm just hoping that by selling my pair of 6870's I will have enough cash in my account to order on launch day.
 
Joined
Jun 13, 2009
Messages
1,539 (0.28/day)
Location
Canada/Québec/Montreal
System Name Main PC
Processor PII 925 x4 @3.724GHz (266x14) 1.525v NB 2660 1.425v
Motherboard Gigabyte AM3 GA-890XA-UD3 (790x+SB850)
Cooling Scythe Mugen 2 rev.B
Memory Hyperx 8GB (2x4) 1600@1418 8-7-7-20-27-1t
Video Card(s) GTX 680
Storage 256GB SSD / 2TB HDD
Display(s) LCD Samsung 24" 16:9
Case Cooler Master HAF 912
Audio Device(s) On-Board HD
Power Supply CM 750w GX |3.3v@25a|5v@25a|12v@60a
Software Kubuntu dual boot /Windows 7 Ultimate 64bit
Benchmark Scores later...
I don't get why you want to sell them...?
3 monitors shouldn't be a problem with a single 6870, not to mention 2 in xfire
 
Last edited:

bear jesus

New Member
Joined
Aug 12, 2010
Messages
1,534 (0.31/day)
Location
Britland
System Name Gaming temp// HTPC
Processor AMD A6 5400k // A4 5300
Motherboard ASRock FM2A75 PRO4// ASRock FM2A55M-DGS
Cooling Xigmatek HDT-D1284 // stock phenom II HSF
Memory 4GB 1600mhz corsair vengeance // 4GB 1600mhz corsair vengeance low profile
Storage 64gb sandisk pulse SSD and 500gb HDD // 500gb HDD
Display(s) acer 22" 1680x1050
Power Supply Seasonic G-450 // Corsair CXM 430W
I don't get why you want to sell them...?
3 monitors shouldn't be a problem with a single 6870, not to mention 2 in xfire

With older games, they run great on a single card at 5040x1050, but I start running into problems with newer games.

For one, I would feel more comfortable having a 2GB card for gaming at 5040x1050, as some levels of AA are like a kick in the nuts to my setup, and it's exactly the same when using one or two cards, so I would expect it's the memory usage and not the card's power.

Also, I have never liked multiple card setups, as every time I have had SLI or Crossfire there has always been some kind of problem. I admit, though, that so far I have had no problems with the 6870's in Crossfire, apart from a little trouble when I tried to overvolt them for some benchmarking, which brings me to my next issue.

Even at 4GHz core, 2.25GHz north bridge and HT, my CPU seems to be holding them back, as faster CPUs are showing a much better increase going from one to two cards than the Phenom II 965. So keeping my CPU and upgrading to even a less powerful single GPU would likely still give me an improvement, since my CPU wouldn't have to try to keep up with two cards, but I'm hoping the 6970 will be at least as fast as a pair of 6870's with imperfect scaling.

One more problem multiple cards give me is a limit on what aftermarket cooler I can use. I would love to have a Thermalright Shaman, but it would be very tight with two cards, and if I wanted to go with water cooling at some point then two cards means double the cost, and I'm a cheapskate :laugh:
 
Joined
Jul 19, 2008
Messages
1,180 (0.20/day)
Location
Australia
Processor Intel i7 4790K
Motherboard Asus Z97 Deluxe
Cooling Thermalright Ultra Extreme 120
Memory Corsair Dominator 1866Mhz 4X4GB
Video Card(s) Asus R290X
Storage Samsung 850 Pro SSD 256GB/Samsung 840 Evo SSD 1TB
Display(s) Samsung S23A950D
Case Corsair 850D
Audio Device(s) Onboard Realtek
Power Supply Corsair AX850
Mouse Logitech G502
Keyboard Logitech G710+
Software Windows 10 x64
2 weeks to go YAY!!
 
Joined
Feb 24, 2009
Messages
3,516 (0.63/day)
System Name Money Hole
Processor Core i7 970
Motherboard Asus P6T6 WS Revolution
Cooling Noctua UH-D14
Memory 2133Mhz 12GB (3x4GB) Mushkin 998991
Video Card(s) Sapphire Tri-X OC R9 290X
Storage Samsung 1TB 850 Evo
Display(s) 3x Acer KG240A 144hz
Case CM HAF 932
Audio Device(s) ADI (onboard)
Power Supply Enermax Revolution 85+ 1050w
Mouse Logitech G602
Keyboard Logitech G710+
Software Windows 10 Professional x64
Yep, but not everything is doubled... it seems that Cayman may have the final bits and pieces to make it almost a complete dual-core GPU on a single chip, except for the memory interface.

Yeah, but the memory interface has me a little puzzled. I understand that a 5870 E6 doesn't show improvement over a regular 5870 in Eyefinity like you'd think it would, given the pixel count of those resolutions. After [H]'s testing with "updated" drivers, I'm more inclined to think that the problem is the drivers not being able to utilize CPU cycles very well, not so much (as this memory interface slide seems to suggest) the GPU's ability to quickly fetch from and send to memory.

Unless I'm missing something.

Btw did you see this one? Didn't quite understand it until I saw the Cayman slides.
 
Joined
Mar 26, 2008
Messages
1,877 (0.32/day)
Location
Cobourg,Ontario
System Name RyZen FX
Processor AMD Ryzen 9 5900x
Motherboard Gigabyte B550 Aorus Elite AX V2
Cooling DeepCool AK400 Zero Dark Plus
Memory Corsair CMK32GX4M2E3200C16 X2 32gig dual channel
Video Card(s) ASUS RX 7700XT TUF OC
Storage x2 Lexar SSD NM710 2TB 2XSeagate 1Terrabyte 1x Seagate 2 Terrabyte
Display(s) 40 Inch Samsung HDTV (monitor)
Case HAF-X:)
Audio Device(s) AMD/HDMI to Onkyo HT-R508 Receiver
Power Supply EVGA SuperNOVA 1000 G2 Power Supply
Software Windows 10 Pro X64
Bring on Dec 12, so on Dec 13 I can pick my 6970 up :) Hoping to Crossfire again.
 

Benetanegia

New Member
Joined
Sep 11, 2009
Messages
2,680 (0.50/day)
Location
Reaching your left retina.


Cough what? lol Maybe you could learn to read, at least, and read everything before posting such a stupid and arrogant post. :laugh:

That article by David Kanter is old, and it only demonstrated that PhysX uses x87; he then theorized a 2x-4x increase in performance from using SSE2, and a linear increase as CPU cores were added. The reality is very different, as Tom's Hardware found by testing actual games on actual hardware: the difference between x87 and SSE2 is a pitiful 10-20%. Also, a fully loaded Phenom X6 is far from beating the GPU even if you include the 10-20% difference that x87 would suppose. The CPU has all 6 cores at 90%+ utilization; it would be interesting to see how much of the GPU was actually used. A 20% max, surely, because there's such a small difference between different GPUs (i.e. GTX285 vs 9600GT).

That's without taking into account that GPUs will continue to double or almost double their FP performance every 12-18 months as they've been doing, and CPUs will not even come close, as has happened in the past. AMD knows that, Intel knows that, and they are acting accordingly by creating CPU/GPU hybrids. Why is it that stupid people can link to a paper that they clearly didn't understand and think they know better than AMD/Intel?
 
Last edited:

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
17,232 (2.61/day)
Yeah, but the memory interface has me a little puzzled. I understand that a 5870 E6 doesn't show improvement over a regular 5870 in Eyefinity like you'd think it would, given the pixel count of those resolutions. After [H]'s testing with "updated" drivers, I'm more inclined to think that the problem is the drivers not being able to utilize CPU cycles very well, not so much (as this memory interface slide seems to suggest) the GPU's ability to quickly fetch from and send to memory.

Unless I'm missing something.

Btw did you see this one? Didn't quite understand it until I saw the Cayman slides.

The focus should not be on memory control... more on set-up and execution, and how the load is balanced as it goes through the shaders, IMHO.

To me, it seems that with Cypress's front end, it takes far too much CPU-GPU interaction to fully utilize the shaders. But the real reason why... only AMD knows.

Remember that I had come to this same conclusion many months ago; I even took it so far as to do comparisons with differing CPU cache, trying to pinpoint the exact reason why it requires so much CPU speed to fully load the shaders...

In the end, I blamed the GPU's driver, or set-up engine, and felt that the sheer number of shaders involved was part of it...

When Barts came out, with far fewer shaders, it proved to have much better scaling, so I placed the blame more on the set-up engine, and the info out now on Cayman bolsters that even more. Time will tell.

But man, I'm just some jobless guy sitting at home, with too much time on his hands. I am by no means even close to properly understanding what is exactly going on, so I've just kinda given up on the speculation side of things...

In the end, I just want to play games on my 3 monitors, in Eyefinity, at a minimum of 60FPS or greater. I have no interest in any products that cannot do that... my single 5870 works just fine @ 1920x1080. I've had the monitors for over a year now, and AMD claims to have a working solution... but you know how I feel about THAT.
 
Joined
Nov 4, 2005
Messages
11,689 (1.73/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs and over 10TB spinning
Display(s) 56" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
PhysX is single precision. In that application, comparing x87 code to SSE2 at single precision, you move from one op per cycle to two, twice the performance per cycle, so yeah, a 2X increase in performance is small. Plus the fact that the x87 register stack is much smaller, so the resulting spills and cache misses will cost performance.


Coughing is stupid and arrogant? Funny, that. Or was it a link to something you didn't approve of being posted? So anyone posting anything you don't like is stupid and arrogant. Thanks for the positive, creative, informative discussion. Asshole.
 

Benetanegia

New Member
Joined
Sep 11, 2009
Messages
2,680 (0.50/day)
Location
Reaching your left retina.
PhysX is single precision. In that application, comparing x87 code to SSE2 at single precision, you move from one op per cycle to two, twice the performance per cycle, so yeah, a 2X increase in performance is small. Plus the fact that the x87 register stack is much smaller, so the resulting spills and cache misses will cost performance.


Coughing is stupid and arrogant? Funny, that. Or was it a link to something you didn't approve of being posted? So anyone posting anything you don't like is stupid and arrogant. Thanks for the positive, creative, informative discussion. Asshole.

And again you talk about theoretical numbers and pretend to be smarter than everyone else. :shadedshu

Tom's Hardware already tested x87 vs SSE2 and the resulting improvement is 10-20%. Not 2x, not 4x, but 10-20%. Is that so difficult to understand? I couldn't care less how much you or even the great Kanter* think SSE2 should theoretically improve performance; the actual improvement is 10-20%. (Three times should be enough to get it into your brain, isn't it?)

Posting an old link is not stupid or arrogant per se. Coughing is not arrogant per se. But posting an extremely old article to refute someone's claim, when that person is talking about an article from a site that is proving/refuting Kanter's old article with actual games and hardware, IS stupid. And coughing about it as if that completely refutes the other poster's claim, without even taking the time to read the linked article he is talking about, is arrogant.

* No irony there. I respect him a lot and he is my number one source for computer architectures, but all his work is mostly just theory, and in the mentioned article he only tested how many of the instructions were x87 and how many SSE, not their actual impact on performance. There's much more to actual performance than just "SSE can perform 2 instructions and bla bla bla". For instance, I hear there are many instructions which don't run natively in SSE/SSE2 and thus are faster in x87, like sine, cosine, exponential and so on. I can only guess, but there's probably a lot of that stuff in a physics engine.

BTW, GPUs have specialized hardware to perform those instructions (every single fat ALU in ATI's stream processors, or the SFUs on Nvidia's GPUs), and that is the main reason some people developing CUDA programs are seeing as much as 400x improvement in their applications, despite a GPU not being 400x faster than a CPU overall.
 
Last edited:
W

wahdangun

Guest
You should read better into that article. In Metro 2033 all 6 cores of the test bed are being maxed out, and the CPU is clocked at 4 GHz. It's still far behind the GPU, and Metro does not have a lot of physics going on anyway. Definitely not compared to the level of physics that Cadaveca is talking about, which I guess is the same thing I want. If you added SSE2 you'd get 10-20% more performance, which would change nothing.

Physics is better run on the GPU because that kind of task runs much better on a lot of cores with superfast local memory, and that's what a GPU is. You say you want your 6/12/24 cores being used, but as I see it that's your problem; maybe you should start thinking about spending less on the CPU and more on the GPU, as that is the future. Future battles will occur between integrated GPUs (Fusion and Fusion-like architectures) and discrete GPUs, not GPU vs CPU, because the CPU as we know it will move to a secondary/auxiliary role. There are very few tasks that require big conventional (multi-core) CPUs even today that cannot be done much better on the GPU...

Bottom line is, we need fast CPUs, but current CPUs are not fast per se; the actual cores are only barely faster than 2005 CPUs were, we just have 4/6/12 of them crammed together. That's not fast, that's parallel, and if a task is parallel enough to max out a current 6-core CPU, it's most probably suitable for GPU computing too, and 90% of the time the GPU will be orders of magnitude faster, more power efficient and significantly cheaper to produce.

But the problem is that less than 5% of people use more than one GPU, so even if Bullet physics had become mainstream, developers still wouldn't tie physics into gameplay. But a lot more people use multicore CPUs; over time CPUs will pack more cores and perform much better, and at that point I'm really sure developers won't mind making physics part of the gameplay.

But I predict Fusion CPUs will make physics more mainstream, if the devs can take the extra step to code for them.
 

Benetanegia

New Member
Joined
Sep 11, 2009
Messages
2,680 (0.50/day)
Location
Reaching your left retina.
But the problem is that less than 5% of people use more than one GPU, so even if Bullet physics had become mainstream, developers still wouldn't tie physics into gameplay. But a lot more people use multicore CPUs; over time CPUs will pack more cores and perform much better, and at that point I'm really sure developers won't mind making physics part of the gameplay.

But I predict Fusion CPUs will make physics more mainstream, if the devs can take the extra step to code for them.

You don't need multiple GPUs to begin with, and most gamers with quads or more cores only bought those CPUs because they thought, or were told, they would be needed. It just takes a change of mind. If you are a gamer and a $250 GPU + $75 GPU gives you better results than a $250 GPU + $250 CPU (+ new MB + new RAM), why on earth would you buy such a CPU in the first place, when you could just buy the $75 GPU or keep the old GPU when you upgrade?

I have a Q6600 and I'm not gonna buy a new CPU until I really need it, and that's not now by any means. There's absolutely no reason* to go to i7 or PII X6 (and spend $500++ in the process) when my old 8800GT can do physics better, along with the GTX460 doing graphics. And the Tom's Hardware article demonstrates that a 4 GHz 6-core could match a GPU like the 8800GT on PhysX, sure, but there's no freaking way they can convince me that buying an i7 + mobo + DDR3 is a better option than keeping my old 8800GT as a PhysX card. And if I didn't have it, more of the same: I would rather buy a cheap GPU.

More so, let's say you go the CPU route. What would happen in 2 years when that CPU is not enough again? You'd need just another complete system overhaul, when it's so easy to buy a cheap GPU that is faster anyway (or again keep the old one). The same applies to APUs, tbh. Making strong GPUs and making them yield is hard enough; making strong CPUs and making them yield is hard enough; making a high end CPU+GPU yield is going to be impossible. Not to mention that VRAM is always going to be much faster than system RAM, and that's another advantage that GPUs have for parallel tasks like physics.

* Just look at which games truly need a quad/6-core and why. Among others, SupCom (AI, physics) and BFBC2 (physics); both AI and physics can be better run with GPU acceleration. The CPU is still needed, always, but not 4 GHz 6 cores; 2 or 3 cores at 3 GHz is more than enough, so I want the industry to move to that model. Plain and simple.
 
Last edited:
Joined
Nov 4, 2005
Messages
11,689 (1.73/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs and over 10TB spinning
Display(s) 56" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
And again you talk about theoretical numbers and pretend to be smarter than everyone else. :shadedshu

Tom's Hardware already tested x87 vs SSE2 and the resulting improvement is 10-20%. Not 2x, not 4x, but 10-20%. Is that so difficult to understand? I couldn't care less how much you or even the great Kanter* think SSE2 should theoretically improve performance; the actual improvement is 10-20%. (Three times should be enough to get it into your brain, isn't it?)

Posting an old link is not stupid or arrogant per se. Coughing is not arrogant per se. But posting an extremely old article to refute someone's claim, when that person is talking about an article from a site that is proving/refuting Kanter's old article with actual games and hardware, IS stupid. And coughing about it as if that completely refutes the other poster's claim, without even taking the time to read the linked article he is talking about, is arrogant.

* No irony there. I respect him a lot and he is my number one source for computer architectures, but all his work is mostly just theory, and in the mentioned article he only tested how many of the instructions were x87 and how many SSE, not their actual impact on performance. There's much more to actual performance than just "SSE can perform 2 instructions and bla bla bla". For instance, I hear there are many instructions which don't run natively in SSE/SSE2 and thus are faster in x87, like sine, cosine, exponential and so on. I can only guess, but there's probably a lot of that stuff in a physics engine.

BTW, GPUs have specialized hardware to perform those instructions (every single fat ALU in ATI's stream processors, or the SFUs on Nvidia's GPUs), and that is the main reason some people developing CUDA programs are seeing as much as 400x improvement in their applications, despite a GPU not being 400x faster than a CPU overall.

He did mention how much faster SSE2 executes on a modern architecture than x87, specifically that two SSE2 ops can be performed per clock cycle.

I have never said I am smarter than anyone else; when I am presented with clear logic that I can compare to what I know and can check out myself, I accept it as fact until proven wrong.

x87 is 80 bits wide, not 32, 64 or 128.
It requires special registers.
Due to CPU hardware limitations it can only execute one operation per clock cycle on most processors.
It was born out of the 8087 math coprocessor, before you and I were out of elementary school.
It was originally used for high precision math that x86 couldn't do in hardware.
It became obsolete with SSE2 in the early 2000s.
By its nature it is inefficient on any processor that has SSE2 capability; SSE2 can surpass it in terms of pure math performance, cache behavior, branching, and speed on modern (post-2004) processors.

PhysX was ported over to CUDA and allowed to run on G80 hardware that only had single precision. So 32 bits of math.

Modern SSE2 processors can execute 2 or more operations per cycle (thus twice the performance at the same level of rounding accuracy, minus any branch mispredictions/cache misses).
 
Last edited:

_33

New Member
Joined
Mar 25, 2006
Messages
1,248 (0.19/day)
Location
Quebec
System Name BEAST
Processor Intel Core i7 920 C0 Quad Core Processor LGA1366 2.66GHZ Bloomfield 8MB LGA batch 3844A509
Motherboard ASUS P6T Deluxe V2 X58 ATX LGA1366 DDR3
Cooling CoolIt ECO A.L.C. 120mm radiator
Memory G.SKILL F3-12800CL8TU-6GBPI PC3-12800 6GB 3X2GB DDR3-1600 CL8-8-8-21
Video Card(s) Sapphire Radeon HD 4890 1GB 850MHZ 1GB 3.9GHZ GDDR5 PCI-E 2XDVI HDTV
Storage SAMSUNG SPINPOINT F1 750GB X 2, WD2500KS 250GB, WD15EARS 1.5TB, WD20EARS 2.0TB
Display(s) LG W2361V-PF Monitors 23” Widescreen LCD Monitor (23.0” diagonal)
Case Cooler Master CM 690 II Advanced
Power Supply Thermaltake Toughpower 600W Power Supply ATX V2.2 20/24PIN EPS12V Qfan 140MM Fan Active PFC
Software Windows 7 Pro 64bit
He did mention how much faster SSE2 executes on a modern architecture than x87, specifically that two SSE2 ops can be performed per second.

I have never said I am smarter than anyone else; when I am presented with clear logic that I can compare to what I know and can check out myself, I accept it as fact until proven wrong.

x87 is 80 bits wide, not 32, 64 or 128.
It requires special registers.
Due to CPU hardware limitations it can only execute one operation per clock cycle on most processors.
It was born out of the 8087 math coprocessor, before you and I were out of elementary school.
It was originally used for high precision math that x86 couldn't do in hardware.
It became obsolete with SSE2 in the early 2000s.
By its nature it is inefficient on any processor that has SSE2 capability; SSE2 can surpass it in terms of pure math performance, cache behavior, branching, and speed on modern (post-2004) processors.

PhysX was ported over to CUDA and allowed to run on G80 hardware that only had single precision. So 32 bits of math.

Modern SSE2 processors can execute 2 or more operations per second (thus twice the performance at the same level of rounding accuracy, minus any branch mispredictions/cache misses).

Two operations per second? You must mean 2 operations per cycle?
 
Joined
Nov 4, 2005
Messages
11,689 (1.73/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs and over 10TB spinning
Display(s) 56" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
Yes I did. If you have 2 and I have 1, then you have twice as much, right? Or perhaps Bene is confused by communist government: you have two and give one to make glorious the nation of Nvidia, then you give one more for your comrades.

So a 280 for example 240 shaders, 700 Mhz.

240 x 700 = 16,800,000,000 ops per second of single precision, assuming you could max out the core.
6 cores at two ops per cycle at 3.2Ghz= 38,400,000,000 ops per second.

Twice the theoretical math power. I know on Nvidia the shader clock is not tied to the core clock, but a CPU has larger amounts of cache built on die, and probably much better branch prediction.
 
Last edited:

Benetanegia

New Member
Joined
Sep 11, 2009
Messages
2,680 (0.50/day)
Location
Reaching your left retina.
I have never said I am smarter than anyone else; when I am presented with clear logic that I can compare to what I know and can check out myself, I accept it as fact until proven wrong.

FALSE. When you are presented with proof you just cannot accept you are wrong. Maybe you should also close your eyes and start arguing that a HD5870 has more Gflops than a GTX580 and so it is faster, contrary to every reality; maybe that it has more raw texturing power, and say it's faster, again contrary to reality. But the thing is that one thing is theory and a different thing is ACTUAL performance. And the actual performance increase of x87 vs SSE2 is 10-20%:

Our own measurements fully confirm Kanter's results*. However, the predicted performance increase from merely changing the compiler options is smaller than the headlines from SemiAccurate might indicate. Testing with the Bullet Benchmark only showed a difference of 10% to 20% between the x87- and SSE2-compiled files. This might seem like a big increase on paper, but in practice it’s rather marginal, especially if PhysX only runs on one CPU core. If the game wasn’t playable before, this little performance boost isn’t going to change much.

* In context that means that PhysX only uses x87, which is obvious because they later mention that the performance increase is only up to 20% and not a 2x-4x increase like Kanter theorized.

240 x 700 = 16,800,000,000 ops per second of single precision, assuming you could max out the core.
6 cores at two ops per cycle at 3.2Ghz= 38,400,000,000 ops per second.

Myyyyyy gooooood Lord! Now you don't even know basic math. :laugh:

240 x 700 Mhz = 168,000,000,000 ops/s

6 cores at two ops per cycle at 3.2Ghz= 38,400,000,000 ops per second.

That would be 168 vs 38 Gflops, GENIUS. Only it's not real; let's get deeper into reality:

GPU cores can do MADD or FMA, both of which can do 2 ops per cycle (i.e Cypress and Fermi), but GT200 could do MADD + MUL = 3 ops/cycle so:

240 x 700 Mhz x 2 = 336 Gflops
240 x 700 Mhz x 3 = 504 Gflops

6 x 2 x 3.2 Ghz = 38.4 Gflops except SSE2 can actually do 4 ops/cycle optimally so:

6 x 4 x 3.2 Ghz = 76.8 Gflops

But let's get deeper into reality:

The shaders on Nvidia cards run at 2x the core clock don't they? :rolleyes:

On G92/GT200 shaders actually run at 2.5x or so, but let's only use 2x so you don't feel so embarrassed :laugh:

240 x 1400 x 3 = 1008 Gflops ;)

1008 Gflops vs 76.8 gflops

NOW do you really want to narrow it down to theoretical numbers? I'm 100% sure you don't. Right? Right? :roll:

Conclusion, SSE2 is 10-20% faster than x87 when performing physics. PERIOD.
 
Last edited:

the54thvoid

Intoxicated Moderator
Staff member
Joined
Dec 14, 2009
Messages
12,461 (2.37/day)
Location
Glasgow - home of formal profanity
Processor Ryzen 7800X3D
Motherboard MSI MAG Mortar B650 (wifi)
Cooling be quiet! Dark Rock Pro 4
Memory 32GB Kingston Fury
Video Card(s) Gainward RTX4070ti
Storage Seagate FireCuda 530 M.2 1TB / Samsumg 960 Pro M.2 512Gb
Display(s) LG 32" 165Hz 1440p GSYNC
Case Asus Prime AP201
Audio Device(s) On Board
Power Supply be quiet! Pure POwer M12 850w Gold (ATX3.0)
Software W10
Is there a mod in the house to stop this spat getting even more personal? And can it be sent back on topic?

This is a thread about the 6000 series AMD cards - not for a Physx pissing match.

I've been infracted for far less than the insults you two are throwing at each other.

Let's summarise by saying this: the GPU is very good at running PhysX. A multi-core processor can also do PhysX reasonably. Which is better suited? Ben says GPU, Steevo says CPU.

I say the main problem is that both are stubborn bastards and won't back down, so their fight gets ever more denigrating.

Take your fight off this thread guys please?
 

Benetanegia

New Member
Joined
Sep 11, 2009
Messages
2,680 (0.50/day)
Location
Reaching your left retina.
Is there a mod in the house to stop this spat getting even more personal? And can it be sent back on topic?

This is a thread about the 6000 series AMD cards - not for a Physx pissing match.

I've been infracted for far less than the insults you two are throwing at each other.

Let's summarise by saying this: the GPU is very good at running PhysX. A multi-core processor can also do PhysX reasonably. Which is better suited? Ben says GPU, Steevo says CPU.

I say the main problem is that both are stubborn bastards and won't back down, so their fight gets ever more denigrating.

Take your fight off this thread guys please?

I know it got out of hand, and sorry for that, but it's not off topic. We are not talking about PhysX, at least I'm not. We are talking about physics, which is not (exactly) the same. It's just that PhysX will always come up in this kind of discussion because it's the only GPU-accelerated physics engine to date.

GPU physics (not PhysX) is relevant because some of us think that a GPU should be used to improve the gaming experience beyond just some more polys and pixels and we hope that stronger GPU architectures could enable this.

What people don't really get is that it's not even a contest between CPU and GPU: the reality is CPU vs CPU+GPU, and that's why "GPU physics" is better. The fact that CPUs can do things that GPUs can't is always brought up, but that's why you also have a CPU. The CPU is best left doing those special tasks while the biggest chunk of the computing burden goes to the GPU.

When it comes to features and programmability and how many things you can do in a physics engine, a CPU is superior and most probably always will be. If you want to realistically simulate 10 rocks falling, or 20 water drops, the CPU is for you, but what's the point of that? If you want to realistically simulate 1000 rocks or water drops, there's no other option but to use GPU muscle to help the brains of the CPU. It's that simple.

I hope that with Cayman AMD really and finally starts to push for GPU physics, if only to start building an environment where their APUs have an advantage over Intel CPUs. For AMD to do that, I hope Cayman is better at that kind of GPGPU work, so they are not afraid of promoting it for fear that Nvidia crushes them, as has been the case until now. That's where GPU physics is relevant to the HD 6000, at least to me.
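To make the "1000 rocks" point concrete, here's a minimal sketch (Python, purely illustrative, not from any real engine) of the kind of per-body update that maps well onto a GPU: every body is updated from its own state alone, so the same loop could run as one thread per body.

```python
# Illustrative only: a naive per-body Euler integration step.
# Each body's update depends only on its own state, which is why
# the same loop maps naturally onto thousands of GPU threads.

GRAVITY = -9.81  # m/s^2, assumed constant downward acceleration

def step_bodies(bodies, dt):
    """Advance every body one timestep; bodies is a list of
    (y_position, y_velocity) tuples."""
    out = []
    for y, vy in bodies:          # on a GPU: one thread per body
        vy = vy + GRAVITY * dt
        y = y + vy * dt
        if y < 0.0:               # crude ground collision: bounce
            y, vy = 0.0, -vy * 0.5
        out.append((y, vy))
    return out

# 1000 falling rocks, all starting 10 m up at rest
rocks = [(10.0, 0.0)] * 1000
for _ in range(10):
    rocks = step_bodies(rocks, dt=0.016)
```

Real engines add inter-body collisions, which need more care to parallelize, but the per-body integration above is the embarrassingly parallel bulk of the work.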
 
Last edited:

wahdangun

Guest
Is there a mod in the house to stop this spat getting even more personal? And can it be sent back on topic?

This is a thread about the 6000-series AMD cards, not for a PhysX pissing match.

I've been infracted for far less than the insults you two are throwing at each other.

Let's summarise by saying this: the GPU is very good at running PhysX. A multi-core processor can also do PhysX reasonably. Which is better suited? Ben says GPU, Steevo says CPU.

I say the main problem is that both are stubborn bastards and won't back down, so their fight gets ever more denigrating.

Take your fight off this thread guys please?

Thank you, that's what I really wanted to say. I mean, the CPU is powerful enough, especially for today's physics (if it were coded properly).

Back on topic: where is the new info? It's been a week without a new leak.
 

Joined
Nov 4, 2005
Messages
11,689 (1.73/day)
System Name Compy 386
Processor 7800X3D
Motherboard Asus
Cooling Air for now.....
Memory 64 GB DDR5 6400Mhz
Video Card(s) 7900XTX 310 Merc
Storage Samsung 990 2TB, 2 SP 2TB SSDs and over 10TB spinning
Display(s) 56" Samsung 4K HDR
Audio Device(s) ATI HDMI
Mouse Logitech MX518
Keyboard Razer
Software A lot.
Benchmark Scores Its fast. Enough.
FALSE. When you are presented with proof you just cannot accept that you are wrong. Maybe you should also close your eyes and start arguing that an HD5870 has more Gflops than a GTX580 and so it is faster, contrary to all reality, or that it has more raw texturing power and say it's faster, again contrary to reality. But the thing is, theory is one thing and ACTUAL performance is another. And the actual x87-to-SSE2 difference is a 10-20% increase:



* In context that means that PhysX only uses x87, which is obvious because they later mention that the performance increase is only up to 20% and not a 2x-4x increase, as Kanter theorized.



Myyyyyy gooooood Lord! Now you don't even know basic math. :laugh:

240 x 700 MHz = 168,000,000,000 ops/s

6 cores at two ops per cycle at 3.2 GHz = 38,400,000,000 ops per second.

That would be 168 vs 38.4 Gflops, GENIUS. Only it's not real; let's get deeper into reality:

GPU cores can do MADD or FMA, both of which count as 2 ops per cycle (i.e. Cypress and Fermi), but GT200 could do MADD + MUL = 3 ops/cycle, so:

240 x 700 MHz x 2 = 336 Gflops
240 x 700 MHz x 3 = 504 Gflops

6 x 2 x 3.2 GHz = 38.4 Gflops, except SSE2 can optimally do 4 single-precision ops per cycle, so:

6 x 4 x 3.2 GHz = 76.8 Gflops

But let's get deeper into reality:

The shaders on Nvidia cards run at 2x the core clock, don't they? :rolleyes:

On G92/GT200 the shaders actually run at 2.5x or so, but let's only use 2x so you don't feel so embarrassed :laugh:

240 x 1400 MHz x 3 = 1008 Gflops ;)

1008 Gflops vs 76.8 Gflops

NOW do you really want to narrow it down to theoretical numbers? I'm 100% sure you don't. Right? Right? :roll:

Conclusion: SSE2 is 10-20% faster than x87 when performing physics. PERIOD.
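For the record, the peak-throughput arithmetic being traded above is easy to reproduce in a few lines. The unit counts, clocks and ops/cycle figures are the ones from the quoted post (240 shaders at ~1.4 GHz doing 3 ops/cycle; 6 CPU cores at 3.2 GHz doing 4 SSE2 ops/cycle); these are theoretical peaks, not measured performance.

```python
# Reproducing the peak-throughput arithmetic from the quoted post.
# All figures are theoretical maximums, not benchmarks.

def peak_gflops(units, clock_ghz, ops_per_cycle):
    """Peak throughput = execution units x clock x ops per cycle."""
    return units * clock_ghz * ops_per_cycle

# GT200-class GPU: 240 shaders, ~1.4 GHz shader clock, MADD + MUL = 3 ops/cycle
gt200 = peak_gflops(240, 1.4, 3)   # -> 1008 Gflops

# Six-core CPU at 3.2 GHz, SSE2: 4 packed single-precision ops/cycle
cpu = peak_gflops(6, 3.2, 4)       # -> 76.8 Gflops

print(f"GT200: {gt200:.1f} Gflops, CPU: {cpu:.1f} Gflops, "
      f"ratio ~{gt200 / cpu:.1f}x")
```

The ~13x gap on paper is exactly why neither side's theoretical numbers settle the argument: real physics workloads rarely sustain anywhere near peak on either device.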

So where in there is the game being run? With eye candy?

Anyway, thank you for the clarification of the processing capabilities. I did make a math error last night, and I never claimed to be a genius. As for the underutilized CPU shown while a GPU is processing the PhysX workload, it seems a waste to most. I understand the constraints of programming for either, or both.


Relax man. The new DX11 cards all support physics in one way or another. I just don't know if I will ever buy a game purely for its physics engine. So I guess what I really meant was that we are wasting resources we have paid for by not utilizing the CPU to its fullest while maxing out the GPU.

* The CPU-based PhysX mode mostly uses only the older x87 instruction set instead of SSE2.
* Testing other compilations in the Bullet benchmark shows only a maximum performance increase of 10% to 20% when using SSE2.
* The optimization performance gains would thus only be marginal in a purely single-core application.
* Contrary to many reports, CPU-based PhysX supports multi-threading.
* There are scenarios in which PhysX is better on the CPU than the GPU.
* A game like Metro 2033 shows that CPU-based PhysX could be quite competitive.

They still show that with multithreading and SSE2 it would be just as competitive as a low-end card unless you get into insane amounts of calculations, and even Nvidia recommends precooking a scene's physics to eliminate bottlenecks.
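Some context on why the SSE2 gain is bounded: x87 retires one floating-point value per instruction, while SSE2's packed instructions (e.g. mulps) process four single-precision lanes at once; but a physics engine spends much of its time on memory traffic and branchy collision logic, which is why measured gains land around 10-20% rather than 4x. A toy sketch of the lane idea, with plain Python standing in for the two instruction streams (illustrative only, this shows the data flow, not real speed):

```python
# Illustrative sketch of the x87-vs-SSE2 distinction: x87 handles one
# float per instruction, SSE2 packs four single-precision floats into
# a 128-bit register and operates on all lanes at once.

def scale_scalar(xs, k):
    # x87 style: one multiply per value
    return [x * k for x in xs]

def scale_packed(xs, k):
    # SSE2 style: four lanes per "instruction" (like mulps)
    out = []
    for i in range(0, len(xs), 4):
        lane = xs[i:i + 4]               # one register's worth
        out.extend(v * k for v in lane)  # all lanes in one step
    return out

data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
assert scale_scalar(data, 2.0) == scale_packed(data, 2.0)
```

Only the arithmetic inside the loop vectorizes; everything around it (gather, branches, bookkeeping) runs at the same speed either way, which is what Amdahl's law says caps the overall speedup.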
 
Last edited:

cadaveca

My name is Dave
Joined
Apr 10, 2006
Messages
17,232 (2.61/day)
Thank you, that's what I really wanted to say. I mean, the CPU is powerful enough, especially for today's physics (if it were coded properly).


Then we'd already have "realistic" physics in games, and we do not. NOT ONE TITLE.

And not even Cayman can make this possible.

:shadedshu

And no offense intended here, but you remind me of Bill Gates saying we'd never need more than 640K of RAM. We know how that worked out... his vision of what was needed was not grand enough to encompass the full functionality we have today.
 

bear jesus

New Member
Joined
Aug 12, 2010
Messages
1,534 (0.31/day)
Location
Britland
System Name Gaming temp// HTPC
Processor AMD A6 5400k // A4 5300
Motherboard ASRock FM2A75 PRO4// ASRock FM2A55M-DGS
Cooling Xigmatek HDT-D1284 // stock phenom II HSF
Memory 4GB 1600mhz corsair vengeance // 4GB 1600mhz corsair vengeance low profile
Storage 64gb sandisk pulse SSD and 500gb HDD // 500gb HDD
Display(s) acer 22" 1680x1050
Power Supply Seasonic G-450 // Corsair CXM 430W
Then we'd already have "realistic" physics in games, and we do not. NOT ONE TITLE.

And not even Cayman can make this possible.

:shadedshu

And no offense intended here, but you remind me of Bill Gates saying we'd never need more than 640K of RAM. We know how that worked out... his vision of what was needed was not grand enough to encompass the full functionality we have today.

I think he was talking specifically about NVIDIA PhysX®, not physics... I don't know :laugh: To be honest I kinda stopped paying much attention to the non-6xxx-series-specific posts.

I just want the damn 6970 to be out already so I can order one :D although some more leaks, faked slides, news or anything about it would be nice in the meantime :laugh:
 