
AMD Orochi "Bulldozer" Die Holds 16 MB Cache

Discussion in 'News' started by btarunr, Sep 24, 2010.

  1. btarunr

    btarunr Editor & Senior Moderator Staff Member

    Joined:
    Oct 9, 2007
    Messages:
    28,959 (11.01/day)
    Thanks Received:
    13,757
    Location:
    Hyderabad, India
    Documents related to AMD's 8-core "Orochi" processor, based on its next-generation Bulldozer architecture, reveal a cache hierarchy that comes as a bit of a surprise. Earlier this month, at a GlobalFoundries-hosted conference, AMD displayed the first die-shot of the Orochi die, which clearly showed key features including the four Bulldozer modules, which hold two cores each, and large L2 caches. On coarse visual inspection, the L2 cache of each module seems to cover 35% of its area. The L3 cache is located along the center of the die. The documents seen by X-bit Labs reveal that each Bulldozer module has its own 2 MB L2 cache shared between its two cores, and that an 8 MB L3 cache is shared between all four modules (8 cores).

    This takes the total cache amount of Orochi all the way up to 16 MB: 8 MB of L2 (4 modules x 2 MB) plus 8 MB of L3. This hierarchy suggests that AMD wants to give individual cores access to a large amount of faster cache (each core can reach a whopping 2048 KB of L2, shared with its sibling core, compared to 512 KB per core on Phenom and 256 KB per core on Core i7), which also facilitates faster inter-core communication within a module. Inter-module communication is enhanced by the 8 MB L3 cache. Compared to the current "Istanbul" six-core K10-based die, that's a 77% increase in cache amount for a 33% core count increase, and a 300% increase in the L2 cache available to each core. Orochi is built on a 32 nm GlobalFoundries process and is sure to have a very high transistor count.
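
    The percentages above can be checked with a few lines of arithmetic. This is only a rough sketch: the Orochi figures are the ones quoted in the article, while the Istanbul figures (6 x 512 KB L2 plus 6 MB L3) are an assumption that the article does not state, and the helper names are made up for illustration.

```python
# Cache totals in KB. Orochi figures come from the article above;
# the Istanbul figures (6 x 512 KB L2, 6 MB L3) are an assumption.
KB_PER_MB = 1024


def total_cache_kb(l2_blocks, l2_kb_per_block, l3_kb):
    """Total L2 + L3 cache on the die, in KB."""
    return l2_blocks * l2_kb_per_block + l3_kb


def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100


# Orochi: 4 modules with 2 MB L2 each, plus 8 MB shared L3.
orochi_total = total_cache_kb(4, 2 * KB_PER_MB, 8 * KB_PER_MB)
# Istanbul (assumed): 6 cores with 512 KB L2 each, plus 6 MB shared L3.
istanbul_total = total_cache_kb(6, 512, 6 * KB_PER_MB)

cache_gain = percent_increase(istanbul_total, orochi_total)
core_gain = percent_increase(6, 8)
# L2 reachable by one core: 512 KB private vs. 2048 KB shared per module.
l2_reach_gain = percent_increase(512, 2 * KB_PER_MB)

print(f"Orochi total:   {orochi_total // KB_PER_MB} MB")
print(f"Istanbul total: {istanbul_total // KB_PER_MB} MB")
print(f"Cache increase: {cache_gain:.1f}%")
print(f"Core increase:  {core_gain:.1f}%")
print(f"L2 reachable per core increase: {l2_reach_gain:.0f}%")
```

    Note that the 300% figure counts the whole 2 MB module L2 as available to one core; split evenly between the two cores of a module it would be 1 MB each.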

    Source: Xbit Labs
     
  2. b82rez New Member

    Joined:
    Apr 17, 2010
    Messages:
    134 (0.08/day)
    Thanks Received:
    18
    Location:
    Hobart, Australia
    BL GG Intel Fanboys. AMD is back! :nutkick:
     
    de.das.dude says thanks.
  3. bpgt64

    bpgt64

    Joined:
    Oct 5, 2008
    Messages:
    1,479 (0.65/day)
    Thanks Received:
    193
    Location:
    ATL, GA
    I'll believe it's a performance gain when I see the benchmarks. Regardless of which side you take, competition is always good for the consumer.
     
    wolf says thanks.
  4. KainXS

    KainXS

    Joined:
    Sep 25, 2007
    Messages:
    5,603 (2.12/day)
    Thanks Received:
    502
    Wait for benchmarks before you start that; we've been through that before with AMD.
     
    wolf says thanks.
  5. wolf

    wolf Performance Enthusiast

    Joined:
    May 7, 2007
    Messages:
    5,546 (1.99/day)
    Thanks Received:
    846
    silly :slap:

    cache isn't everything, reviews pretty much are.
     
  6. ebolamonkey3

    ebolamonkey3 New Member

    Joined:
    Apr 9, 2010
    Messages:
    773 (0.45/day)
    Thanks Received:
    102
    Location:
    Atlanta/Marietta, GA
    2011 is shaping up to be quite an interesting year :)
     
  7. Completely Bonkers New Member

    Joined:
    Feb 6, 2007
    Messages:
    2,580 (0.90/day)
    Thanks Received:
    516
    I remember the "massive cache" Gallatin P4s over Northwood. It didn't make more than a 5% difference clock for clock except in very special circumstances.

    So let's wait for benchmarks.

    I would have thought there would be better gains by rethinking cache and memory entirely, possibly producing a separate socket for L3 cache just like in the old days. It would be so much cheaper to do it that way, you could easily pack 256MB cache. Yes, the latency would be worse than current on-die L3 cache, but with the space, heat and transistors saved, you could bump up L1 and L2 cache and win back any performance losses. Plus you could build your L3 cache to order.
     
    WarEagleAU says thanks.
  8. DaMulta

    DaMulta My stars went supernova

    Joined:
    Aug 3, 2006
    Messages:
    16,132 (5.27/day)
    Thanks Received:
    1,457
    Location:
    Oklahoma T-Town
    That's it????? I'm waiting for the day we get 16 cores with 64 MB of cache.
     
  9. dir_d

    dir_d

    Joined:
    Sep 1, 2009
    Messages:
    848 (0.44/day)
    Thanks Received:
    110
    Location:
    Manteca, Ca
    Well, it seems Bulldozer is going to be faster when communicating with memory and other cores. I think if AMD did just that to a Phenom II chip it would speed it up significantly. I really can't wait to see Bulldozer in action.
     
  10. bear jesus

    bear jesus New Member

    Joined:
    Aug 12, 2010
    Messages:
    1,535 (0.96/day)
    Thanks Received:
    200
    Location:
    Britland
    I would hope more and faster cache could be a good thing, but the main thing I'm interested in is how each module performs. I'm really thinking about getting a high-end Sandy Bridge or Bulldozer to last me a couple of years or so, and that means I want as many cores, and as fast, as possible, as I would hope that over the next few years more software will use more cores.
     
  11. Rebelstar

    Rebelstar New Member

    Joined:
    Sep 3, 2010
    Messages:
    71 (0.05/day)
    Thanks Received:
    15
    Location:
    Minsk, Belarus
    I'm a total noob in CPU technologies, but I think 16 MB of cache is freaking cool, right?
     
    Last edited: Sep 24, 2010
  12. xaira

    Joined:
    Jul 20, 2009
    Messages:
    209 (0.11/day)
    Thanks Received:
    19
    So does Fermi. I hope AMD has the TDP under control, otherwise Sandy will kick butt.
     
  13. bear jesus

    bear jesus New Member

    Joined:
    Aug 12, 2010
    Messages:
    1,535 (0.96/day)
    Thanks Received:
    200
    Location:
    Britland
    It could be if put to good use, but the cores are really important. Either way, we won't really know until the reviews.
     
  14. devguy

    devguy

    Joined:
    Feb 17, 2007
    Messages:
    1,239 (0.43/day)
    Thanks Received:
    171
    Location:
    SoCal
    One design win I really commend AMD for is their use of dynamic cache allocation between the "cores" on a module. While many assume the sharing of cache (and other items like the FPU) will hurt single-threaded performance, that really isn't the case. When only one core is active per module, it has complete control over all the resources; thus a single core will have 2 MB of L2 cache at its disposal! Also, when both cores on a module are active, they can share the resources unequally (i.e. one core with 0.5 MB of L2 and the other with 1.5 MB of L2 is possible). Very cool technology.

    For Bulldozer, there will be the option to have the OS prefer loading one core per module (like cores 1, 3, 5, 7) rather than just filling them up by modules (1, 2, 3, 4). Both have benefits and faults: the first route has higher performance, but also higher power consumption; the second would be the exact opposite.

    As far as the sharing of the FPU goes, in servers it will make hardly any difference. In the desktop segment, AMD argues that if you are doing something that demands enough FPU performance to slow down its modules, then you should be doing it on the GPU instead.
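
    The two core-loading options described above can be sketched as simple orderings. This is purely illustrative: the function names are hypothetical, the core numbering (1 to 8, two cores per module) follows the post, and none of this reflects an actual OS scheduler interface.

```python
CORES_PER_MODULE = 2


def pack_order(num_modules):
    """Fill both cores of a module before moving on: 1, 2, 3, 4, ..."""
    return list(range(1, num_modules * CORES_PER_MODULE + 1))


def spread_order(num_modules):
    """Use one core per module first (1, 3, 5, 7), then the sibling cores."""
    first = [m * CORES_PER_MODULE + 1 for m in range(num_modules)]
    second = [c + 1 for c in first]
    return first + second


def active_modules(cores):
    """Set of 0-based module indices touched by the given 1-based cores."""
    return {(c - 1) // CORES_PER_MODULE for c in cores}


threads = 4
packed = pack_order(4)[:threads]
spread = spread_order(4)[:threads]

# Packed: fewer modules awake (lower power), but cores share L2 and FPU.
print("packed:", packed, "modules active:", len(active_modules(packed)))
# Spread: every thread gets a whole module to itself (higher performance).
print("spread:", spread, "modules active:", len(active_modules(spread)))
```

    With four threads, the packed order keeps two modules busy while the spread order wakes all four, which is the performance versus power trade-off described above.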
     
    bear jesus says thanks.
  15. cadaveca

    cadaveca My name is Dave

    Joined:
    Apr 10, 2006
    Messages:
    14,131 (4.45/day)
    Thanks Received:
    7,330
    Location:
    Edmonton, Alberta
    I like this news. I have been saying for a couple of years now that AMD's cache design needed to change, and here they are doing something about it. That makes me even more interested in Bulldozer tech.
     
  16. bear jesus

    bear jesus New Member

    Joined:
    Aug 12, 2010
    Messages:
    1,535 (0.96/day)
    Thanks Received:
    200
    Location:
    Britland
    I never knew it would be set up like that. It kind of makes me even more sure I want to wait for Bulldozer for my next full upgrade, so that if it is a good CPU at a good price I can go for one, and if not, I can get something from Sandy Bridge a little cheaper (hoping price drops will come over the time waited and, if the consumer is lucky, price drops that come with/after Bulldozer).
     
  17. cheezburger

    cheezburger New Member

    Joined:
    Sep 6, 2010
    Messages:
    265 (0.17/day)
    Thanks Received:
    11
    No surprise. They are trying to fix the single-thread performance hit caused by the smaller L1 data/instruction caches. Each core "only" has 8 KB of L1 data, while the instruction cache is shared per module and is just 64 KB "2-way" (could be less... I think...), which is roughly 40 KB per core compared to Core's 64 KB per core. Big disadvantage. So all they can do is add more L3 cache to increase performance, or hope performance doesn't drop, without tweaking too much on the existing architecture, which has already taped out and is going to be released in 3 months. Intel did the same thing when it realized Northwood's poor L1 cache would drag down performance: it increased the L2 cache from 256 KB to 512 KB. However, Orochi is an 8-module, 16-core processor, so featuring 16 MB of L3 means each core can use up to 1 MB of L3. Still way below Nehalem's 2 MB per core. Also, unlike Intel's architecture, the benefit of AMD's cache is heavily determined by the pipeline depth; a shorter pipeline won't take advantage of a bigger cache. But since Bulldozer will feature 4+ GHz, I'd guess this processor will have at least a 20+ stage pipeline. But despite all these features, if Intel decides to increase Ivy Bridge's L2 cache from 256 KB per core to 512 KB per core, AMD will experience the same horror it faced when Core 2 came out.
     
    Completely Bonkers says thanks.
  18. HTC

    HTC

    Joined:
    Apr 1, 2008
    Messages:
    2,246 (0.91/day)
    Thanks Received:
    303
    I wonder how hot these CPUs will get ...
     
  19. ROad86 New Member

    Joined:
    Sep 24, 2010
    Messages:
    21 (0.01/day)
    Thanks Received:
    5


    First, Orochi is a 4-module, 8-core design. Second, it's not only the size that matters but also how fast the cache is. Third, it is very important how the instruction prediction will work; if the design is good, then you don't need a big L1 cache, which increases cost and die size. And yes, 2 MB per module, or 1 MB per core, is the amount that Bulldozer will have.
     
    WarEagleAU says thanks.
  20. mechtech

    mechtech

    Joined:
    Dec 26, 2006
    Messages:
    251 (0.09/day)
    Thanks Received:
    18
    I want one, a server version with 8 or 16 GB of ECC RAM :D I don't know why though, since I don't even work 1 core on my 955BE.
     
  21. cadaveca

    cadaveca My name is Dave

    Joined:
    Apr 10, 2006
    Messages:
    14,131 (4.45/day)
    Thanks Received:
    7,330
    Location:
    Edmonton, Alberta
    Very hot... apparently we'll see a clock speed decrease (which I assume is due to the high levels of cache), but IPC will increase. I'm kinda expecting 2.4 GHz or so... maybe lower... for launch chips.
     
  22. bear jesus

    bear jesus New Member

    Joined:
    Aug 12, 2010
    Messages:
    1,535 (0.96/day)
    Thanks Received:
    200
    Location:
    Britland
    Just a good reason for me to get my first real water cooling setup :D (assuming i am happy with the reviews of bulldozer)
     
  23. cadaveca

    cadaveca My name is Dave

    Joined:
    Apr 10, 2006
    Messages:
    14,131 (4.45/day)
    Thanks Received:
    7,330
    Location:
    Edmonton, Alberta
    I don't know anything about it, really. However, there is mention of the clock speed decrease on the AMD blog site. Now that we have the info on cache size... 1+1=2. Of course, there's lots of time between now and launch... seems to me they are refining the process, and a few bugs, at this point.
     
  24. ROad86 New Member

    Joined:
    Sep 24, 2010
    Messages:
    21 (0.01/day)
    Thanks Received:
    5

    Haha me too!!! :laugh:
     
  25. bear jesus

    bear jesus New Member

    Joined:
    Aug 12, 2010
    Messages:
    1,535 (0.96/day)
    Thanks Received:
    200
    Location:
    Britland
    Hmm, I wonder if they will follow Intel's lead (referring to the cooler that comes with the top-end i7s) by using a better cooler for the high-end CPUs if they run hot. It would be nice to see a better cooler than the current ones, as I am not really a fan of them.
     
