LONG READ - I suggest you read thru it & "drink it in & digest it" as to what apps gain, where, and HOW (i.e. what developers must do to leverage L2 cache - and yes, that means multithreaded applications). I cannot make it any shorter, not w/out omitting critical details from the quotes (which come from RELIABLE sources):
"Alec, most people dont build a system to run seti@home"
That's just a single example. Folding@Home's another, & there are more. Especially for multithreaded applications - here is what is required for a gain to occur, from AMD:
Living in a Multi-Core World: Tips for Developers
http://developer.amd.com/articlex.jsp?id=28
"
Benefits of a Separate L2 Cache
One of the potential bottlenecks in a dual-core configuration comes from the L2 cache. In a single-cache configuration, you get a performance hit when multiple threads are competing over the same data cache.
Having a separate L2 cache for each core gives you twice the cache benefit. And of course, if you have four cores, each with its own cache, that's four times the benefit.
Having these dual L2 caches gives AMD's 64-bit architecture, also known as Direct Connect Architecture (DCA), one of its key distinctions over its competitors. But simply having this architecture in place only takes you so far.
To really benefit from L2 cache separation, developers need to implement threading techniques that allow separate cores to process separate data sets, limiting cache contention and coherency problems.
For example, consider "functional threading": the first thread handles one distinct process, then passes the operation to the second thread in the pipeline, which is dependent on that data... then to the third, the fourth, and so on. While these operations can try to run in parallel, ultimately gains are limited because of contention over the same data cache.
But with "data parallel threading," you would create threads that rely on independent data sets, for example dividing a video frame into two halves. This allows concurrent threads to make full use of an individuated cache-core configuration. Also, coding your apps with an emphasis on parallel threading allows you to automatically scale up as processors begin to add even more cores to the die.
What Is NUMA?
Along the lines of an independent L2 cache, the AMD64 architecture also employs NUMA, Non-Uniform Memory Access (or Architecture, depending on who you ask). In this scenario, each processor socket has its own memory controller, shared by all cores on that processor, which is typically populated by the system with actual physical memory.
For AMD, this especially becomes important when a configuration consists of multiple multi-core processors.
For each core, some memory is directly attached, yielding a lower latency, while some is not directly attached and has a resulting higher latency. When a given thread begins processing, the OS looks at which core is running the thread and allocates physical memory to that process. This way, the data stays close to the thread that needs it, a process called "memory affinity."
The OS considers this core to be the "home processor" for the thread and tries to keep the thread running on it. This "thread affinity" or "process affinity" contributes to performance by keeping the thread from unnecessarily getting moved. Each time the thread moves over to another core, performance takes a slight hit.
--------------------------------------------------------------
* APK EDIT - this is where Win32 API function calls like SetProcessAffinityMask (better for single-threaded apps imo) OR SetThreadAffinityMask (better for multithreaded apps on this account) help - see the code sketch just below this note!
Especially when SPECIFICALLY TRYING TO FULLY "HAND-OPTIMIZE" AN APPLICATION FOR MULTIPLE THREAD DESIGN, EXPLICITLY (not just letting the OS handle the multiple threadwork, implicitly).
* More on this, in detail, below, as regards dataset cache blocking...
(It helps to stop a phenomenon known as "cache pollution" - I did a thread on that here before, look it up if necessary)
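Here's a minimal Win32 sketch of that (MY illustration; the Worker() routine & the mask value are assumptions, error checking omitted):

#include <windows.h>

// Hypothetical worker routine - the real per-core work would go here.
DWORD WINAPI Worker(LPVOID)
{
    /* ... crunch this thread's private data set ... */
    return 0;
}

int main()
{
    HANDLE h = CreateThread(NULL, 0, Worker, NULL, 0, NULL);

    // Pin the worker to core #1 (bit 1 of the affinity mask) so the OS
    // won't migrate it - & its L2-resident data - to another core. For a
    // single-threaded app, SetProcessAffinityMask on GetCurrentProcess()
    // does the equivalent at the process level.
    SetThreadAffinityMask(h, 1 << 1);

    WaitForSingleObject(h, INFINITE);
    CloseHandle(h);
    return 0;
}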
--------------------------------------------------------------
Considering that memory is the source of the most data traffic on the computer, even more than IO, this setup increases memory bandwidth at a ratio effectively the same as the number of cores. So a 4-socket server will have 4 times the memory bandwidth.
What This Means for Developers
If writing native code, do a memory allocation request for each thread. The OS will see this and handle the allocation by assigning memory in the physical bank attached to the processor on which that thread is running."
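--------------------------------------------------------------
* APK EDIT - a minimal sketch of that per-thread allocation advice (MY illustration, NOT AMD's code; the buffer size & worker() are assumptions). Allocating & touching the buffer INSIDE each thread lets a NUMA-aware OS place the physical pages in the memory bank local to the core running that thread:

#include <cstring>
#include <thread>
#include <vector>

static void worker()
{
    // Allocate INSIDE the thread: on a NUMA-aware OS the physical pages
    // typically come from the memory bank attached to this thread's
    // processor, keeping the data close to the thread that needs it.
    std::vector<char> local(64 * 1024 * 1024);
    std::memset(local.data(), 1, local.size()); // touch the pages to commit them locally
    /* ... work on 'local' here ... */
}

int main()
{
    std::thread a(worker), b(worker);
    a.join();
    b.join();
    return 0;
}
--------------------------------------------------------------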
CACHE BLOCKING:
ALSO, this can help - a technique known as "Cache Blocking", see here:
http://www3.intel.com/cd/software/products/asmo-na/eng/20461.htm
Cache Blocking Technique
"There are many factors that impact cache performance.
Effective use of data cache locality is one such significant factor. And the well known data cache blocking technique is used to take advantage of data cache locality. The cache blocking technique restructures loops with frequent iterations over large data arrays by sub-dividing the large array into smaller blocks, or tiles. Each data element in the array is reused within the data block, such that the block of data fits within the data cache, before operating on the next block or tile.
Depending on the application, a cache data blocking technique is very effective.
It is widely used in linear algebra and is a common transformation applied by compilers and application programmers. Since the 2nd level unified cache contains instructions as well as data, compilers often try to take advantage of instruction locality by grouping related blocks of instructions close together as well. Typical applications benefiting from cache data blocking are image or video applications where the image can be processed on smaller portions of the total image or video frame. But the effectiveness of the technique is highly dependent on the data block size, the processor's cache size, and the number of times the data is reused.
By way of example, a sample application is provided to demonstrate the performance impact of this technique (see Appendix A). Figure 2 shows the results of cache blocking with varying block sizes on the sample application. At the sweet spot, around 450-460 KB, the tile size matches very closely with the unified L2 cache size, and the application almost doubles in performance. This is only an example, and the block size sweet spot for any given application will vary based on how much of the L2 cache is used by other cached data within the application as well as cached instructions from the application. Typically, an application should target the block size to be approximately one-half to three-quarters of the cache size. In general, it's better to err on the side of having too small a block size than too large.
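--------------------------------------------------------------
* APK EDIT - a minimal sketch of the loop tiling Intel describes (MY illustration, NOT Intel's sample app): run ALL passes over one cache-sized tile before moving to the next, so each element is reused while still resident in L2. The smoothing kernel, the ~384 KB tile size (~3/4 of a 512 KB L2, per Intel's rule of thumb above) & the simplified tile-boundary handling are assumptions:

#include <algorithm>
#include <cstddef>

void smooth_blocked(float* data, std::size_t n, int passes)
{
    const std::size_t BLOCK = 96 * 1024; // 96K floats = 384 KB per tile; tune per CPU
    for (std::size_t start = 0; start < n; start += BLOCK)
    {
        std::size_t end = std::min(start + BLOCK, n);
        for (int p = 0; p < passes; ++p)            // reuse the tile while it is cached
            for (std::size_t i = start + 1; i < end; ++i)
                data[i] = 0.5f * (data[i] + data[i - 1]);
    }
}

A naive version would run each pass over the WHOLE array instead, evicting it from L2 between passes - that's exactly the memory traffic cache blocking eliminates.
--------------------------------------------------------------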
Additionally, the data cache blocking technique performance scales well with multiple processors if the algorithm is threaded for data decomposition. Fortunately, the fact that each block of data can be processed independently with respect to other blocks lends itself to being decomposed into separate blocks which can be processed in separate threads of execution. Figure 2.0 also shows the performance improvement of the cache blocking algorithm for two threads running on a dual processor system with two physical processors. The performance curve for two threads matches very closely the performance curve for a single processor system with the sweet spot for the block size at around 450-460 KB per thread but at approximately twice the performance. Assuming there is very little synchronization necessary between the two threads as in this example, it's reasonable to expect that the block size sweet spot would not vary significantly. Both processors have independent cache of equal size. In this case, both processors have 512KB of L2 cache available.
Since the threaded data cache blocking technique can provide significant performance opportunities on multi-processor systems, applications should detect the number of processors in order to dynamically create a thread pool so that the number of threads available for the cache blocking technique matches the number of processors. On Hyper-Threading Technology-enabled Intel processors, the application should detect the number of logical processors supported across all physical processors in the system. Be aware that a minimum block size should be established such that the overhead of threading and synchronization does not exceed the benefit from threading your algorithm.
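--------------------------------------------------------------
* APK EDIT - & the threaded version of the above (MY illustration): detect the processor count, then hand each thread its own independent slice to tile ("data decomposition"). Builds on smooth_blocked() from my sketch above; the even slicing scheme is an assumption:

#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

void smooth_blocked(float* data, std::size_t n, int passes); // from the sketch above

void smooth_parallel(std::vector<float>& data, int passes)
{
    // One thread per logical processor (this count includes Hyper-Threading siblings).
    unsigned n = std::max(1u, std::thread::hardware_concurrency());
    std::size_t chunk = data.size() / n;

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < n; ++t)
    {
        std::size_t begin = t * chunk;
        std::size_t end = (t + 1 == n) ? data.size() : begin + chunk;
        // Disjoint slices: each core's L2 holds only its own tiles, & the
        // threads need no synchronization until the final join.
        pool.emplace_back(smooth_blocked, data.data() + begin, end - begin, passes);
    }
    for (auto& th : pool)
        th.join();
}
--------------------------------------------------------------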
Actual performance for a given application depends on the relative impact of L2 cache misses and their associated memory latencies induced without cache blocking. For an application that has significant execution time relative to memory latencies, the performance impact will also be reduced."
If you're asking what that is, it is my initials.
APK
P.S.=> Using multithreaded apps w/ a larger L2 cache can show gains, per the AMD notes above, by easing cache coherency & contention between threads (related to, but NOT the same thing as, "race conditions" - which I have noted here before on these forums, in fact). Multithreaded apps are the more prevalent application type out there today, & you can even check this yourself on YOUR system.
(AND, a good 90-100% of what you're running nowadays will show you this - visible via taskmgr.exe & its PROCESSES tab with the THREADS column enabled)...
"games/office/photoshop and the like dont need large cache, infact 256k gives them PLENTY with an a64 chip."
Incorrect on 1 of them (they ALL get gains - gaming shows the least). Photoshop's gains may not be as large, but they can be made larger via EXPLICIT multithreaded design techniques for L2 cache usage, as noted above!
Photoshop, iirc, is already designed with EXPLICIT SMP OPTIMIZATIONS (not just implicit multiple thread use driven via the OS only)...
However, business apps tend to get gains... they are part of an ENTIRE CLASS of apps that gain via larger L2 cache levels: anything that repetitively uses the same instructions over & over on data... and so does what I do: CODING.
During Linux kernel recompiles, for example (I like to put up sources, not just my own words), you can see it here:
http://www.linuxhardware.org/article.pl?sid=01/06/11/1847213&mode=thread
"It would seem that compilation is very L2 heavy. Further proof of this is that the overclocked Duron was unable to beat the 850 Athlon. Still, the difference is not large enough to be excessive."
Apps like photo processing won't gain as much (UNLESS things like splitting frames in 1/2 are used, as noted above from AMD & also from Intel), but they still do gain.
This website gives a good overview of Level 2 cache:
http://www.karbosguide.com/books/pcarchitecture/chapter11.htm
"Level 2 cache is most important for processor intensive applications
such as distributed computing. Video editing, 3d studio max or sound conversions are also very processor intensive applications."
"L1 and L2 cache are important components in modern processor design. The cache is crucial for the utilisation of the high clock frequencies which modern process technology allows. Modern L1 caches are extremely effective: in about 96-98% of cases, the processor can find the data and instructions it needs in the cache. In the future, we can expect to keep seeing CPUs with larger L2 caches and more advanced memory management, as this is the way forward if we want to achieve more effective utilisation of the CPU’s clock ticks. Here is a concrete example:
In January 2002 Intel released a new version of their top processor, the Pentium 4 (with the codename, “Northwood”). The clock frequency had been increased by 10%, so one might expect a 10% improvement in performance.
But because the integrated L2 cache was also doubled from 256 to 512 KB, the gain was found to be all of 30%.
[Fig. 79 - Because of the larger L2 cache, performance increased significantly.]
In 2002 AMD updated the Athlon processor with the new ”Barton” core. Here the L2 cache was also doubled from 256 to 512 KB in some models. In 2004 Intel came with the “Prescott” core with 1024 KB L2 cache, which is the same size as in AMD’s Athlon 64 processors. Some Extreme Editions of Pentium 4 even use 2 MB of L2 cache."
MORE REINFORCING EXAMPLES & CASES OF WHERE L2 CACHE SHOWS GAINS/BENEFITS:
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2795&p=4
A small paragraph from their page:
"The 4MB L2 cache can increase performance by as much as 10% in some situations. Such a performance improvement is definitely tangible, and as applications grow larger in their working data sets then the advantage of a larger cache will only become more visible.
If you're the type to upgrade often, then the extra cache is not worth it as you're not getting enough of a present day increase in performance to justify the added cost. However, if this processor will be the basis for your system for the next several years, we'd strongly recommend picking a 4MB flavor of Core 2."
I believe though that if you follow these rules of thumb & things to think about, you should be ok:
1. Can you afford and justify the extra cost of the CPU?
2. What will you be doing with the PC? Office, solitaire, email, web browsing and other non-CPU-intensive applications - E6400. Gaming, sound editing, picture editing, video encoding and other CPU-intensive applications - E6600 (you have a longer shelf life with this CPU too).
Apps with only TINY gains? Typically games (but gains nonetheless result here too)... so, again: it really ALL boils down to what YOU DO on a PC... as I stated initially.
A lot to read, no doubt, but it all shows when/where/how applications can be coded to leverage added L2 cache... apk