techPowerUp! Forums

Jun 24, 2006, 04:13 AM   #1
macbeth

L2 cache question

All AMD CPUs seem to have 128 KB of L1 cache but different amounts of L2 cache. So what does L2 cache affect? Will there be a large performance gain with a larger L2 cache? And if so, what kind of applications benefit from a larger L2 cache?

Jun 24, 2006, 04:28 AM   #2
Azn Tr14dZ

To tell you the truth, I don't know, but whenever I hear about processor specs, I always hear about L2 Cache and barely anything about L1. I would like to know myself too.

Jun 24, 2006, 05:35 AM   #3
Steevo

Prefetch Logic and data set size.


Those are the two prime areas where a larger L2 cache pays off.


Prefetch logic, from experience, builds on the idea that if a piece of code is used once, it will be used again (strictly speaking, that reuse part is temporal locality, and prefetching is the preloading part). So by keeping it resident in full-speed on-die (L2) memory you reduce latency, thereby improving performance. It also applies when there is a known sequential pattern occurring: the data can be preloaded and processed at higher rates than if the same data had to be loaded from main memory. A good example of this would be Folding@Home. The branching of the program is limited, so the most-used code can be kept in cache.
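
To put a rough shape on the "data set size" point, here is a minimal sketch, not a model of any real CPU: it assumes a fully associative cache with least-recently-used replacement and a program that loops over the same working set. The function name `hit_rate` and the sizes (counted in cache lines) are made up for illustration.

```python
from collections import OrderedDict


def hit_rate(cache_lines, working_set_lines, passes):
    """Hit rate of a fully associative LRU cache for a loop over a working set."""
    cache = OrderedDict()  # keys are line numbers, ordered oldest -> newest
    hits = 0
    accesses = 0
    for _ in range(passes):
        for line in range(working_set_lines):
            accesses += 1
            if line in cache:
                hits += 1
                cache.move_to_end(line)  # mark as most recently used
            else:
                if len(cache) >= cache_lines:
                    cache.popitem(last=False)  # evict the least recently used line
                cache[line] = True
    return hits / accesses


if __name__ == "__main__":
    # Hypothetical cache sizes against a 600-line working set looped 10 times.
    for size in (256, 512, 1024):
        print(size, hit_rate(size, working_set_lines=600, passes=10))
```

In this toy model the hit rate stays at zero until the whole loop fits in the cache, then jumps: once the working set fits, only the first pass misses. Real caches are set-associative and have prefetchers, so the cliff is softer in practice.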


On this note, there is another factor that affects how efficiently the cache gets used: the pipeline. Pentium 4 chips, for example, have a longer pipeline (more stages), and in certain carefully tuned applications the extra stages are beneficial. However, in gaming and other branch-heavy applications it hinders performance, because a mispredicted branch means flushing and refilling that longer pipeline.


L1 cache holds the code and data that are next in line to be executed, so it need not be any larger than the immediate data set + instruction set.



The benefit of a larger L2 is real, but there are far greater benefits from more accurate prediction.

Jun 24, 2006, 08:59 PM   #4
Jimmy 2004

From what I've heard, 512KB is generally good enough, 1MB doesn't bring that much of a performance benefit on top of it, and 256KB and below isn't too great (nowadays that generally means Celerons and Semprons).

Jun 24, 2006, 09:44 PM   #5
Alec§taar

Quote:
Originally Posted by macbeth
All AMD CPUs seem to have 128 KB of L1 cache but different amounts of L2 cache. So what does L2 cache affect? Will there be a large performance gain with a larger L2 cache? And if so, what kind of applications benefit from a larger L2 cache?
Yes, there is a gain with a larger onboard CPU cache.

Keeping in line w/ what Steevo wrote: the more of the most-used code (on a per-processor basis) that stays in that cache (possibly with its data too), the faster it runs, because the CPU has it "nearer" to it (no going out over the system board bus, which is slower, & on to RAM, which is slower still).

NOTE: Using "Processor Affinity" settings (via taskmgr.exe, or in your code itself) can affect this (adversely OR positively) though on SMP/DualCore/HyperThreaded machines:

Negatively - Creating cache pollution! When a processor repeatedly runs a particular process, the CPU's cache RAM stores the most commonly used instructions.

IMPORTANT (keeping in line with Steevo's "branch-prediction/speculative execution failure" example above as well as another possible "pipeline stall/bubble") - If you introduce a NEW process to that particular CPU (from a set of them in SMP rigs), you could potentially overwrite the cached instructions of a previous, repeatedly run process.

The process scheduler module of the kernel uses this ruleset to run processes, in this order:

1.) Select the "ideal" processor (one with most available cycles open/unused)

2.) If the "ideal processor" is not available, then see if the processor the code last ran on has cycles available to run said cached process code again!

NOTE - on a heavily used SMP system running multiple large application programs, a particular process's code is rarely run on the same CPU again, so keep this in mind. This opens up cache pollution possibilities too, not just manually playing with affinity, & imo it is an area that needs improvement!

(That last comment's definitely in keeping w/ Steevo's "speculative-execution/branch-prediction failure" example above, which I am in agreement with - Microsoft IS working on improvements here so the OS uses the L1-L2 caches more efficiently, this much I know of, for VISTA.)

3.) If neither available? Select other "idle" processor if any (or a processor running a lower priority thread)
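
The selection order in steps 1.) to 3.) can be sketched roughly as below. This is only an illustration of the ruleset as described in this post, not the actual kernel scheduler: `pick_cpu` and its parameters are hypothetical names, and "available" is simplified to "idle".

```python
def pick_cpu(ideal, last_ran, running_priority, thread_priority):
    """Pick a CPU for a thread, following the three-step order described above.

    running_priority maps cpu id -> priority of the thread currently on it,
    or None if that CPU is idle. All names here are hypothetical.
    """
    # 1.) The "ideal" processor, if it is free.
    if running_priority[ideal] is None:
        return ideal
    # 2.) The processor the code ran on last (its cache may still be warm).
    if running_priority[last_ran] is None:
        return last_ran
    # 3.) Any other idle processor...
    for cpu, prio in running_priority.items():
        if prio is None:
            return cpu
    # ...or a processor running a lower-priority thread.
    for cpu, prio in running_priority.items():
        if prio < thread_priority:
            return cpu
    return None  # nothing suitable right now, so the thread has to wait


if __name__ == "__main__":
    cpus = {0: 5, 1: None, 2: 3, 3: None}
    print(pick_cpu(ideal=0, last_ran=3, running_priority=cpus, thread_priority=4))
```

Step 2.) is the cache-friendly one: landing back on the CPU that ran the code last is what gives a thread a chance of finding its instructions still in that CPU's cache.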

OR

Positively - Tuning certain apps via the OS process scheduler (which controls threads, the smallest atomic unit of execution on most modern OS') to run ONLY on certain processor(s) <- Plural here, because SMP/DualCore/H-T systems exist out there.

EXAMPLE -> Say you have an 8-way SMP rig, running SQL Server &/or Exchange Server on it... & you note that SQLServer uses 3x as many CPU cycles as Exchange Server does - You might wish to set "hard affinity" via taskmgr.exe so SQLServer uses CPU 0,1,2,3, & 4 exclusively, and Exchange Server runs on CPU 5 & 6 exclusively... TO AVOID CACHE POLLUTION!
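
For the "in your code itself" route: the Win32 call SetProcessAffinityMask takes a bitmask with one bit per CPU. Below is a small sketch of building such masks for the CPU numbers in the example above. The helper name `affinity_mask` is made up, and the sketch only computes the masks, it does not call into the OS.

```python
def affinity_mask(cpus):
    """Build a bitmask with bit N set for each CPU number N in cpus."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


# CPU numbers taken from the SQL Server / Exchange Server example above.
sql_mask = affinity_mask([0, 1, 2, 3, 4])
exchange_mask = affinity_mask([5, 6])

print(bin(sql_mask), bin(exchange_mask))

# The two sets must not share a CPU, otherwise the apps can still
# evict each other's cached code and data.
assert sql_mask & exchange_mask == 0
```

Keeping the two masks disjoint is the whole point of the exercise: each app then only ever warms, and only ever evicts from, the caches of its own CPUs.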

APPS THAT GAIN BY LARGER L1-L2 CACHE SIZES:

SETI@Home benefits from it, this I am certain of (& it appears that Folding@Home does as well, as Steevo noted above), as an example of an application that gains largely from the onboard L1-L2 cache memory being larger...

Games are often stated to be yet another. This was illustrated, that I know of, when Intel's "Extreme Edition" CPUs came out - basically Xeon server CPUs, with what were considered (@ release time) HUGE on-die cache setups.

They largely dusted their competition in gaming @ the time of their release... & were created largely for gaming enthusiasts iirc as well, demonstrating that larger caches on the CPU help games!

APK

Last edited by Alec§taar; Jun 25, 2006 at 11:48 AM.

Jun 25, 2006, 10:07 AM   #6
Jimmy 2004

Where was the Alec§taar?

Jun 25, 2006, 11:47 AM   #7
Alec§taar

Quote:
Originally Posted by Jimmy 2004
Where was the Alec§taar?


(There it is...)

APK

Jun 26, 2006, 05:49 AM   #8
WeStSiDePLaYa

L2 cache is used primarily to mask RAM latency.

It helps largely in multitasking.
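
"Masking RAM latency" can be put into numbers with the usual average-access-time formula: hit time plus miss rate times miss penalty. The figures below are hypothetical, picked only to show the shape of the effect, and are not measurements of any particular CPU.

```python
def average_access_time_ns(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hit time plus the expected cost of a miss."""
    return hit_time_ns + miss_rate * miss_penalty_ns


# Hypothetical figures: an L2 hit costs 5 ns, going out to RAM costs 100 ns more.
L2_HIT_NS = 5.0
RAM_PENALTY_NS = 100.0

# Assume a larger L2 halves the miss rate (again, an assumption for illustration).
small_l2 = average_access_time_ns(L2_HIT_NS, 0.10, RAM_PENALTY_NS)
large_l2 = average_access_time_ns(L2_HIT_NS, 0.05, RAM_PENALTY_NS)

print(f"10% misses: {small_l2:.1f} ns, 5% misses: {large_l2:.1f} ns")
```

The slower RAM is relative to the cache, the more every percentage point of miss rate costs, which is why cache size matters more as CPUs outpace memory.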

Jun 26, 2006, 04:18 PM   #9
Alec§taar

Quote:
Originally Posted by WeStSiDePLaYa
L2 cache is used primarily to mask RAM latency.

It helps largely in multitasking.
The above quote is as good a way of putting it as any & makes the point well - I phrased it this way myself above, & it pretty much means the same thing:

"the more of the most-used code (on a per-processor basis) that stays in that cache (possibly with its data too), the faster it runs, because the CPU has it "nearer" to it (no going out over the system board bus, which is slower, & on to RAM, which is slower still)."



* 6 of 1, 1/2 dozen of the other...

APK

All times are GMT.