Wednesday, January 14th 2009

AMD Justifies Use of Large L3 Cache on Phenom II, Opteron

AMD's introduction of the Phenom II series served several goals for the company, chief among them porting the processor design to the newer 45nm SOI manufacturing node in an attempt to bring down manufacturing cost. The shrink also let AMD trade some of that cost saving for a higher transistor count, on a die that is nearly the size of the 65nm Barcelona/Agena. The 45nm Shanghai/Deneb has one distinct feature over its predecessor: three times the amount of L3 cache. The larger cache adds significantly to the transistor count of the die: 758 million, as against 468 million on Barcelona/Agena. Replying to an inquiry from Hardware-Infos, AMD attempts to explain its motive for incorporating the large L3 cache at the expense of die-size savings, and despite the latencies the L3 cache is alleged to bring in.
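To put the transistor figures in perspective, here is a rough back-of-the-envelope sketch. It assumes a classic six-transistor (6T) SRAM cell per bit, which the article does not state, and it counts only the data array of the extra 4 MB (6 MB on Deneb against the 2 MB of Agena mentioned in the comments). Tags, ECC and redundancy are ignored, so the result is an estimate, not AMD's own accounting.

```python
# Rough estimate of the transistors needed by the extra L3 data array.
# ASSUMPTION: 6 transistors per SRAM bit (6T cell); not stated in the article.
# Tags, ECC and redundant rows are ignored, so this is only a ballpark figure.

TRANSISTORS_PER_BIT = 6  # assumed cell design


def sram_data_transistors(megabytes: int) -> int:
    """Transistors in the data array of an SRAM of the given size (in MiB)."""
    bits = megabytes * 1024 * 1024 * 8
    return bits * TRANSISTORS_PER_BIT


extra_l3_mb = 6 - 2  # 6 MB on Shanghai/Deneb vs 2 MB on Barcelona/Agena
extra_cache = sram_data_transistors(extra_l3_mb)
reported_growth = 758_000_000 - 468_000_000  # figures quoted in the article

print(f"extra L3 data array: about {extra_cache / 1e6:.0f} million transistors")
print(f"reported growth:     {reported_growth / 1e6:.0f} million transistors")
print(f"share of growth:     {extra_cache / reported_growth:.0%}")
```

Under that assumption, the extra cache data alone would account for most, though not all, of the reported increase.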

AMD points out that expanding the L3 cache was important to the architecture in more ways than one. On the desktop/client PC front, the additional L3 cache was expected to provide a 5% performance increment over its predecessor, an assertion that reviews later backed. Secondly, AMD likes to maintain an essentially common die design for both its client (Phenom II/Deneb) and enterprise or server (Opteron/Shanghai) processors, to make sure manufacturing costs aren't wasted on setting up a separate manufacturing node. With the enterprise-grade Opteron processors, the 6 MB L3 cache has proven to benefit the processor in dealing with large server workloads. Finally, AMD claims that despite the larger cache, the overall die area of the 45nm die remains smaller than that of the 65nm Stars die, so the cost-cutting remains to an extent.

Source: Hardware-Infos

28 Comments on AMD Justifies Use of Large L3 Cache on Phenom II, Opteron

#1
1Kurgan1
The Knife in your Back
No brainer here, was a good move on AMD's part.
Posted on Reply
#2
Kei
I agree, I think they wanted to do it the first time but with the situation just realized it wasn't a good move.

Kei
Posted on Reply
#3
KBD
I wonder when we are going to see some Shanghai Opterons and new enterprise boards to go with them? I think this is one of the weak points of AMD right now. The acquisition of ATI cost them dearly, yet they haven't taken advantage of having a chipset division to release some new server grade chipsets with HT 3.0 under the AMD brand. Nvidia hasn't made any new nforce professional boards either, btw. They really gotta get the ball rolling on this as the enterprise sector is one of AMD's key customers. Otherwise Intel will just push them out of it with their own Xeons based on i7.
Posted on Reply
#4
mdm-adph
I don't think AMD needs to "justify" having a large L3 Cache on PII's any more than Intel needs to "justify" having a large L2 one on the later Core2's. :p
Posted on Reply
#5
btarunr
Editor & Senior Moderator
by: mdm-adph
I don't think AMD needs to "justify" having a large L3 Cache on PII's any more than Intel needs to "justify" having a large L2 one on the later Core2's. :p
There was some drama alleging AMD using a 6MB cache as a marketing gimmick to make PhII's specs look 1337. Also that many believed the L3 was just bringing in a wasteful amount of latency and not genuinely benefiting, also that more cache = more power draw. Looking at these "allegations" coming at it from quite some directions, AMD seems to have taken interest in answering Hardware-Infos.
Posted on Reply
#6
1Kurgan1
The Knife in your Back
I could see people saying that, but looking at reviews it's pretty obvious it was a benefit to the chips, and saying that it draws more power than having no L3 is obvious. But the wattage these processors take is significantly reduced over the last generation. Only real flak I could see them catching is from the Intel side.
Posted on Reply
#7
KBD
by: 1Kurgan1
I could see people saying that, but looking at reviews it's pretty obvious it was a benefit to the chips, and saying that it draws more power than having no L3 is obvious. But the wattage these processors take is significantly reduced over the last generation. Only real flak I could see them catching is from the Intel side.
agreed, performance is most important. As long as the bigger cache (among other things) improves performance AMD shouldn't be concerned with what naysayers have to say. If anything i think the larger cache helped Phenom 2 a great deal, 2MB L3 was too small for the Phenom 1 anyway.
Posted on Reply
#8
Valdez
by: mdm-adph
I don't think AMD needs to "justify" having a large L3 Cache on PII's any more than Intel needs to "justify" having a large L2 one on the later Core2's. :p
Large cache is needed because of the slow fsb.

With HT or QPI there is no need for large caches, however there are always some exceptions.
Posted on Reply
#9
FordGT90Concept
"I go fast!1!11!1!"
L3 or not to L3...

L3 is beneficial only when it has faster access than the system RAM.

The main problem with L3 is that it takes up a lot of real estate and therefore it costs a lot.

AMD and Intel both are moving in the same direction here (to add L3). Why? I think it is because of the direction system memory is moving. Yes, DDR2 has more bandwidth than DDR and DDR3 has more bandwidth than DDR2 but at what cost? Latency. DDR3 is a friggin snail compared to DDR when it comes to latency. The only way to address that is to add a middle man to compensate for the huge gap between L2 and DDR3, that is, to add an L3. If DDR3 was actually fast, L3 would be useless.

So effectively, we pay more (relatively speaking) for a lower latency processor and less for higher latency memory. Is that a good thing? I don't know.
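The trade-off described above can be put in rough numbers with a simple average-access-time model. This is only a sketch: the hit rates and latencies below are made-up illustrative values, not measurements of any Phenom or Core processor, and the model assumes each cache level is checked strictly one after the other.

```python
# Average latency of an access that has already missed L1, in a simple
# serial lookup model. All numbers used below are HYPOTHETICAL examples.

def avg_access_ns(l2_hit, l2_ns, mem_ns, l3_hit=None, l3_ns=None):
    """Return the mean access time in ns, with or without an L3 level."""
    if l3_hit is None:
        # L2 miss goes straight to system RAM.
        return l2_ns + (1 - l2_hit) * mem_ns
    # L2 miss pays the L3 lookup, and an L3 miss additionally pays for RAM.
    return l2_ns + (1 - l2_hit) * (l3_ns + (1 - l3_hit) * mem_ns)


no_l3 = avg_access_ns(l2_hit=0.9, l2_ns=5, mem_ns=60)
fast_l3 = avg_access_ns(l2_hit=0.9, l2_ns=5, mem_ns=60, l3_hit=0.5, l3_ns=20)
slow_l3 = avg_access_ns(l2_hit=0.9, l2_ns=5, mem_ns=60, l3_hit=0.5, l3_ns=60)

print(f"no L3:   {no_l3:.1f} ns")
print(f"fast L3: {fast_l3:.1f} ns")
print(f"slow L3: {slow_l3:.1f} ns")
```

In this model an L3 only pays off when its latency is below its hit rate times the RAM latency, so an L3 as slow as RAM makes things worse, and a faster RAM shrinks the benefit.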
Posted on Reply
#10
KBD
by: FordGT90Concept
L3 or not to L3...

L3 is beneficial only when it has faster access than the system RAM.

The main problem with L3 is that it takes up a lot of real estate and therefore it costs a lot.

AMD and Intel both are moving in the same direction here (to add L3). Why? I think it is because of the direction system memory is moving. Yes, DDR2 has more bandwidth than DDR and DDR3 has more bandwidth than DDR2 but at what cost? Latency. DDR3 is a friggin snail compared to DDR when it comes to latency. The only way to address that is to add a middle man to compensate for the huge gap between L2 and DDR3, that is, to add an L3. If DDR3 was actually fast, L3 would be useless.

So effectively, we pay more (relatively speaking) for a lower latency processor and less for higher latency memory. Is that a good thing? I don't know.
interesting read :toast:

maybe this is why AMD made the L3 bigger since it's moving to DDR3? But then again i was reading somewhere that the smaller 2MB L3 cache was hurting Phenom 1.
Posted on Reply
#11
TheGuruStud
It has 6MB b/c it's a server chip. That's all AMD had to say haha.
Posted on Reply
#13
newtekie1
Semi-Retired Folder
AMD doesn't need to justify it to anyone.

But if they did, "It improves performance" is all they need to say.
Posted on Reply
#14
eidairaman1
with everyone being so harsh on them etc i guess they had to explain to the idiots why they did that.
Posted on Reply
#15
swaaye
It certainly didn't help much though. So maybe it wasn't worth it in the end? Sticking with the 2MB L3 would've made for a very small quad core die with definite cost reduction. All that big 6MB cache did was maybe get them up to about Kentsfield perf per clock.

If only all AMD had to compete against was some ghetto P4-based quad core. We'd all be singing the virtues of AMD's amazingly advanced ultra-fast quad. ;)
Posted on Reply
#16
eidairaman1
by: swaaye
It certainly didn't help much though. So maybe it wasn't worth it in the end? Sticking with the 2MB L3 would've made for a very small quad core die with definite cost reduction. All that big 6MB cache did was maybe get them up to about Kentsfield perf per clock.

If only all AMD had to compete against was some ghetto P4-based quad core. We'd all be singing the virtues of AMD's amazingly advanced ultra-fast quad. ;)
for your sarcasm, the cache is there to compensate for the slow DDR2 and DDR3, because technically if DDR were to get a process Shrink, it would probably maintain faster speeds in clocks and latency thus lay a beatdown on DDR2 and DDR3 tech.
Posted on Reply
#17
TheGuruStud
by: eidairaman1
for your sarcasm, the cache is there to compensate for the slow DDR2 and DDR3, because technically if DDR were to get a process Shrink, it would probably maintain faster speeds in clocks and latency thus lay a beatdown on DDR2 and DDR3 tech.
Thank you, I still love DDR1 and grudgingly had to go to DDR2 (and 3 really pisses me off).

Everyone just wants high MHz for marketing and as everyone with any know-how knows, it doesn't mean shit.
Posted on Reply
#18
KBD
by: TheGuruStud

Everyone just wants high MHz for marketing and as everyone with any know-how knows, it doesn't mean shit.
well, it means a lot for intel 775 and earlier as intel systems benefit a lot more from Mhz than lower timings. On the other hand, AMD ( and i think i7) benefit more from lower timings than higher Mhz. But yea, i still love DDR, still have it running on my socket A rig and its just great.
Posted on Reply
#19
eidairaman1
by: TheGuruStud
Thank you, I still love DDR1 and grudgingly had to go to DDR2 (and 3 really pisses me off).

Everyone just wants high MHz for marketing and as everyone with any know-how knows, it doesn't mean shit.
Intel keeps on pushing it so they can stay technically ahead of AMD, that's why PCI Express came about as well, where it was AMD for a time that was ahead. AMD only released AM2 because they were being forced to, where with a low latency solution clock speed isn't necessarily needed.

BTW i gotta Ask KBD whats your Socket A rig consist of, you see mine in the system specs as that is my main machine.
Posted on Reply
#20
KBD
by: eidairaman1

BTW i gotta Ask KBD whats your Socket A rig consist of, you see mine in the system specs as that is my main machine.
Yea, i noticed, you have a kickass Socket A rig, mine is not as fancy as yours though. Basically its my Win 98 machine for old games as well as a backup system. I got an Athlon XP 2600+ (T-bred), Asus A7N8X w/ TT TR2-M3 CPU cooler, 2GB of Mushkin HP-3200, eVGA 6800 Ultra AGP, 80GB Barracuda 7200.10 IDE, onboard sound, Seasonic M12 500W.

I'm actually looking for a SATA board for that rig, if i ever find like yours i'd be very happy, afterall thats the best Socket A board ever.
Posted on Reply
#21
eidairaman1
the other is the Abit NF7-S 1.1-2.0 / AN7, im unsure of KW7/G because 32bit OS limitation (support 4 Gigs Ram)

But ya, this machine is somewhat limited due to no cooling for Mosfet/VRMs, and the NB and SB cooling is pretty dismal as well (Chipset runs hotter than CPU) so all that stuff needs upgrades. I had a 3200+ in this machine and i crushed it accidentally (never modify the clip on a CPU cooler by bending it to put more tension on the CPU) So i picked up 2 2500+ CPUs (Mobility and Desktop). Both do 3200+ easily so thats where i run the current, just want to get all the cooling upgrades and Possibly overclock this CPU to at least 2.5, as these CPUs like 1.7-2.0 Vcore. Above all, COD4 is a good test of a Machines stability, and 2.2Ghz runs that game excellent with the vidcard i have.
Posted on Reply
#22
TheGuruStud
Hehe, my sister's old comp (now the shop comp) does 2.4 (only slight V bump) on an athlon 2600 with generic ram and a biostar M7ncd mobo :)

That's prime stable and I didn't attempt to push it further.

I love the skt As b/c of the pin mods. You can overclock no matter what the MB is, practically (but I just upped this FSB on this one, ocing the ram also).
Posted on Reply
#24
btarunr
Editor & Senior Moderator
I wish there was a BIOS option to disable just L3, so we could see its impact. I know the "External Cache" option enables/disables the CPU's L2 caches, and on K10/K10.5 machines, both L2 and L3.
Posted on Reply
#25
newtekie1
Semi-Retired Folder
The latency on good DDR2-800 RAM is the same as that on good DDR-400. DDR2-800@4-4-4-12 is the same latency time as DDR-400@2-2-2-6, and those are actually pretty damn good latencies for DDR. Even if you look at some of the best DDR, you can find DDR-500 with latencies of 3-3-2-8, and you can find DDR2-533 with 3-3-3-8, or even DDR2-1066 with 5-5-5-15, which actually has better latencies. People see the higher timings of DDR2 and 3, and don't take into consideration the higher clock speeds.
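The arithmetic behind that comparison can be sketched as follows. It looks only at the first timing (CAS latency) and assumes the I/O clock is half the rated transfer rate, as is the case for DDR-type memory. The function name is just an illustrative choice.

```python
# Convert a CAS latency given in clock cycles into nanoseconds.
# Only the first timing (CL) is considered; the other timings are ignored.

def cas_latency_ns(data_rate_mts: float, cas_cycles: int) -> float:
    """Absolute CAS latency for a DDR-type module rated at data_rate_mts."""
    clock_mhz = data_rate_mts / 2        # two transfers per clock cycle
    return cas_cycles / clock_mhz * 1000  # cycles / MHz gives microseconds


modules = [
    ("DDR-400  CL2", 400, 2),
    ("DDR2-800 CL4", 800, 4),
    ("DDR-500  CL3", 500, 3),
    ("DDR2-533 CL3", 533, 3),
    ("DDR2-1066 CL5", 1066, 5),
]

for name, rate, cl in modules:
    print(f"{name}: {cas_latency_ns(rate, cl):.2f} ns")
```

DDR-400 at CL2 and DDR2-800 at CL4 both work out to 10 ns, while DDR2-1066 at CL5 comes in below DDR-500 at CL3 despite the larger cycle count.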
Posted on Reply