#1
Join Date: May 2006
Posts: 126 (0.05/day)
Thanks: 0
Thanked 0 Times in 0 Posts
|
L2 cache question
All AMD chips seem to have 128KB of L1 cache but different amounts of L2 cache. So what does L2 cache affect? Will there be a large performance gain with a larger L2 cache? And if so, what kind of applications benefit from a larger L2 cache?
#2
To tell you the truth, I don't know, but whenever I hear about processor specs, I always hear about L2 Cache and barely anything about L1. I would like to know myself too.
#3
Eligible for custom title
|
Prefetch Logic and data set size.
Those are the two main things that decide how much a larger L2 cache helps. The prefetch logic works from experience: if a piece of code has been used, it will probably be used again. So by keeping it resident in full-speed on-die (L2) memory you reduce latency, thereby improving performance. It also applies when there is a known sequential pattern occurring: the CPU can preload data so that it is processed at higher rates than if the same data had to be loaded from main memory. A good example of this would be Folding@Home. The branching of the program is limited, and so the most-used code can be kept in cache. On top of this there is another factor that affects the cache's efficiency: the pipeline. Pentium 4 chips, for example, have longer pipelines (more stages), and in certain carefully tuned applications the extra stages are beneficial. However, in gaming and other applications they hinder performance. L1 cache holds the next-in-line code to be executed, so it need not be any larger than the data set plus the instruction set. The benefit of a larger L2 is visible, but there are far greater benefits from more accurate prediction.
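For anyone who wants to see the "data set size" point in numbers, here is a minimal sketch in Python. It assumes a fully associative cache with LRU replacement and a made-up access trace (a loop over 600 cache lines, repeated 20 times); real CPU caches are set-associative and have prefetchers, so treat it as an illustration only. The names `hit_rate`, `working_set` and `trace` are mine, not from any real tool.

```python
from collections import OrderedDict

def hit_rate(cache_lines, accesses):
    """Fraction of accesses served by a fully associative LRU cache."""
    cache = OrderedDict()
    hits = 0
    for addr in accesses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return hits / len(accesses)

# Hypothetical workload: walk the same 600 lines 20 times in a row.
working_set = 600
trace = list(range(working_set)) * 20

for size in (256, 512, 1024):
    print(size, round(hit_rate(size, trace), 3))
```

With this trace the 256-line and 512-line caches get a hit rate of 0.0, because the loop evicts every line just before it is needed again, while the 1024-line cache gets 0.95. The jump comes when the whole working set fits, not gradually with size.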
#4
Eligible for custom title
Join Date: Jan 2005
Location: England
Posts: 5,047 (1.66/day)
Thanks: 134
Thanked 276 Times in 185 Posts
|
From what I've heard, in general 512KB is good enough, 1MB doesn't have that much of a performance benefit in general, and 256KB and below isn't too great (this is generally Celerons and Semprons nowadays).
#5
Banned
Join Date: May 2006
Location: Someone who's going to find NewTekie1 and teach him a lesson
Posts: 3,380 (1.32/day)
Thanks: 0
Thanked 102 Times in 101 Posts
|
Quote:
Keeping in line with what Steevo wrote: the more of the most-used code (on a per-processor basis) that stays in that cache (possibly with its data too), the faster it runs, because the CPU has it "nearer" to it (no going out over the system board bus, which is slower, and on to RAM, which is slower still).

NOTE: Using "Processor Affinity" settings (via taskmgr.exe, or in your code itself) can affect this, adversely OR positively, on SMP/DualCore/HyperThreaded machines:

Negatively - Creating cache pollution! When a processor repeatedly runs a particular process, the CPU's cache RAM stores its most commonly used instructions. IMPORTANT (keeping in line with Steevo's "branch-prediction/speculative execution failure" example above, as well as another possible "pipeline stall/bubble"): if you introduce a NEW process to that particular CPU (from a set of them in SMP rigs), you could potentially overwrite a previous, repeatedly run process' cached instructions.

The process scheduler module of the kernel uses this ruleset to run processes, in this order:

1.) Select the "ideal" processor (the one with the most available cycles open/unused).
2.) If the "ideal" processor is not available, see if the processor the code last ran on has available cycles left to run said cached process code again! NOTE - on a heavily used SMP system running multiple large applications, a particular process' code is rarely run on the same CPU again. Keep this in mind: it opens up cache pollution possibilities too, not just manually playing with affinity. This, imo, is an area that needs improvement! (That last comment is definitely in keeping with Steevo's "speculative-execution/branch-prediction failure" example above, which I agree with. Microsoft IS working on improvements here so the OS uses the L1-L2 caches more efficiently, this much I know of, for VISTA.)
3.) If neither is available, select another "idle" processor if there is one (or a processor running a lower priority thread).

OR

Positively - Tuning certain apps in the OS process scheduler (which controls threads, the smallest atomic unit of execution on most modern OSes) to run ONLY on a certain processor(s) <- plural here, because SMP/DualCore/H-T systems exist out there.

EXAMPLE -> Say you have a 4-8 way SMP rig running SQL Server and/or Exchange Server, and you note that SQL Server uses 3x as many CPU cycles as Exchange Server does. You might wish to set "hard affinity" via taskmgr.exe so that SQL Server uses CPUs 0, 1, 2, 3 & 4 exclusively, and Exchange Server runs on CPUs 5 & 6 exclusively... TO AVOID CACHE POLLUTION!

APPS THAT GAIN BY LARGER L1-L2 CACHE SIZES:

SETI@Home benefits from it, this I am certain of (and it appears that Folding@Home does as well, as Steevo noted above), as an example of an application that gains a lot from larger onboard L1-L2 cache memory. Games are often stated to be another. This was illustrated, in the past that I know of and noted, when Intel's "Extreme Edition" CPUs came out: basically Xeon server CPUs with what was considered, at release time, a HUGE L1-L2 cache setup. They largely dusted their competition in gaming at the time of their release, and were largely created for gaming enthusiasts iirc, demonstrating that larger caches on the CPU help games!

APK

Last edited by Alec§taar; Jun 25, 2006 at 11:48 AM.
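The three-step selection order described in this post can be sketched roughly as follows. This is only a toy model of the ruleset as stated here, not the actual Windows scheduler, which is considerably more involved. The function `pick_cpu`, the `running` dictionary and the priority numbers are all hypothetical, made up for the illustration.

```python
def pick_cpu(running, ideal, last, new_priority):
    """Pick a CPU for a ready thread, following the three steps in the post.

    running maps cpu id -> priority of the thread currently on that CPU,
    or None if the CPU is idle (an assumed representation).
    """
    # 1.) The "ideal" processor, if it is free.
    if running[ideal] is None:
        return ideal
    # 2.) The processor the thread last ran on (its cache may still be warm).
    if running[last] is None:
        return last
    # 3.) Any other idle processor...
    for cpu in sorted(running):
        if running[cpu] is None:
            return cpu
    # ...or a processor running a lower-priority thread.
    for cpu in sorted(running):
        if running[cpu] < new_priority:
            return cpu
    return None  # nothing suitable, so the thread has to wait

# CPU 0 (ideal) is busy, CPU 1 (last used) is idle, so step 2 applies.
print(pick_cpu({0: 8, 1: None, 2: 8, 3: 4}, ideal=0, last=1, new_priority=8))
```

Only step 2 keeps the thread on the CPU whose cache already holds its code. Whenever step 1 or step 3 sends it somewhere else, it starts with a cold cache, which is the cache pollution concern raised above.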
#6
Eligible for custom title
Join Date: Jan 2005
Location: England
Posts: 5,047 (1.66/day)
Thanks: 134
Thanked 276 Times in 185 Posts
|
Where was the
Alec§taar?
#7
Banned
Join Date: May 2006
Location: Someone who's going to find NewTekie1 and teach him a lesson
Posts: 3,380 (1.32/day)
Thanks: 0
Thanked 102 Times in 101 Posts
|
Quote:
(There it is...) APK
#8
Join Date: Jun 2006
Posts: 83 (0.03/day)
Thanks: 0
Thanked 1 Time in 1 Post
|
L2 cache is used primarily to mask RAM latency.
It helps largely in multitasking.
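A rough way to put numbers on "masking RAM latency" is the usual average-access-time arithmetic. The sketch below assumes every access that misses L1 pays the L2 latency, and an L2 miss pays the RAM latency on top of that. The 5 ns and 60 ns figures are placeholders I picked for illustration, not measurements of any real chip, and `average_access_ns` is a made-up name.

```python
def average_access_ns(l2_hit_rate, l2_ns, ram_ns):
    """Average cost of an access that missed L1.

    Every such access pays the L2 latency; the fraction that also
    misses L2 additionally pays the RAM latency.
    """
    return l2_ns + (1.0 - l2_hit_rate) * ram_ns

# Assumed latencies: 5 ns for L2, 60 ns for main memory.
for rate in (0.0, 0.8, 0.95):
    print(rate, average_access_ns(rate, l2_ns=5.0, ram_ns=60.0))
```

The higher the L2 hit rate, the closer the average gets to the L2 latency alone, which is the sense in which the cache hides the RAM behind it.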
__________________
DFI LANPARTY UT NF590 SLI-M2R/G
X2 3800+EE AM2 2x1GB OCZ PC2-6400 Platinum XTC XFX 7900GS Maxtor 6L250S0 Maxtor 6V300F0 Western Digital WD2500JS |
#9
Banned
Join Date: May 2006
Location: Someone who's going to find NewTekie1 and teach him a lesson
Posts: 3,380 (1.32/day)
Thanks: 0
Thanked 102 Times in 101 Posts
|
Quote:
"the most used code (on a per-processor basis) that stays in that cache (with data too possibly), the faster it is run, because the CPU has it "nearer" to it (no going onto system board bus, slower & to RAM there, also slower), & also faster."
* 6 of 1, 1/2 dozen of the other... APK