Friday, January 20th 2012
AMD Vishera Packs Quad-Channel DDR3 IMC, G34 En Route Desktop?
AMD might be a little sore that its "Zambezi" FX processor family based on its much-hyped "Bulldozer" architecture didn't quite meet the performance expectations of a ground-up new CPU architecture, but it doesn't want to take chances and build hype around the architecture that succeeds it. From various sources, some faintly-reliable, we have been hearing that the next-generation of high-performance desktop processors based on "Piledriver" architecture, codenamed "Vishera", will pack five modules or 10 cores, and will be structured essentially like Zambezi, since Piledriver is basically a refinement of Bulldozer architecture. The latest leak comes from the Software Optimization Guide for AMD 15h family (read here), which was picked up by CPU World while most of us were busy with CES.
CPU World compiled most of the features that it suspects refer to AMD's future processors based on the Piledriver architecture: "Vishera" (desktop high-performance), "Terramar" (high-density server), and "Sepang" (small-medium business server). The three are not the first chips to be based on Piledriver; AMD has a new mainstream desktop and notebook APU in the works codenamed "Trinity", which is en route for a little later this year. Trinity reports essentially the same CPUID instruction-set features as Vishera, Terramar, and Sepang, confirming their common lineage compared to today's "Bulldozer" architecture. The most catchy detail is of Vishera featuring 4 DDR3 channels.
The plot thickens where "HyperTransport Assist feature" is listed as being supported on Vishera. HT Assist is a feature found on AMD's enterprise socket G34 processors, which facilitates better inter-die communication between the two dies of a typical socket G34 Opteron processor. The G34 (LGA1974) package is a multi-chip module of two quad-core, six-core, or four-module dies, which combined have four DDR3 memory channels, and a number of HyperTransport links to communicate with neighbouring sockets and the system's chipset. Could this be the first indication that AMD wants to take on Intel LGA2011 HEDT (high-end desktop) using Vishera chips in the G34 package? It will be a while before we find out.
Apart from using common silicon between client and enterprise platforms, AMD does have a history of colliding the two.
Source:
CPU World
229 Comments on AMD Vishera Packs Quad-Channel DDR3 IMC, G34 En Route Desktop?
2) The i7-3820 is socket 2011, not 1155; it is comparable to the i7-2600K/2700K (LGA 1155) in performance. If you wanted to wait for much better performance, in an upgrade, you were better off spending some more money and buying an LGA 2011 board and an i7-3820, then waiting for Ivy Bridge Extreme chips to be released 10-11 months from now.
3)As of now, Piledriver is the best you can hope for on AM3+, and it will not be a massive performance improvement, maybe +25% at equivalent power consumption (at most). There is no guarantee that Steamroller will be released as a discrete non-igpu cpu for AM3+, AMD has not released its roadmap for the future beyond Piledriver on AM3+. Piledriver may be the last discrete consumer cpu manufactured by AMD --- after that it might only be APUs, and I think it'll be a while before the process node is small enough to fit 3 cpu modules onto an APU, so it'll be quad core --- unless the fusion thing gives a huge overall performance boost.
seriously.....you bought what makes you happy....all that matters;)
Also, for converting my 1080 60i video and bluray authoring, my rig is great. I put the source file on an hdd and output file on an ssd and cpu load runs 60-90%. Minute for minute is the slowest conversion i've seen and 10 to 1 for Hidef to Stdef. I just had a baby girl so lots of videos! She wakes up a lot at night so gaming keeps me awake for the 1st shift of feedings.
I am well aware of the intel sockets..... Think outside the Intel box man. 1155 has little chance to double the 2500k performance (for $200ish cpu) but 2011 performance currently does and will hopefully go far beyond that. I want to be able to double my performance with a drop in cpu upgrade someday for $200. And basically 2011 is too expensive and 1155 may be maxed out with Ivybridge. So I didn't go with intel. AMD in 2-3 years, i'm hoping, has a $200ish cpu that will double the fx-6100. RCM, more cores (I still think 10 core am4 is the direction) and yes I surrender, higher IPC sets the stage.
As far as heat goes on my fx-6100, prime 95 for 3 min is the only thing, thus far, that puts it above 60 C at 4.7ghz. It runs warm at high clock speeds on air, but not super hot. I've had it to 74C before crashing with prime 95 @ 5.2. I can get errors above 4.7 on prime 95 but not overheating until 5.2 or heavy voltage.
my head specs are as follows:
4 lobe graymatter (low IPC)
hyperthreading left/right brain
35 solar passes of memory
turbo caffeine 2x (i like espresso)
opsys: UN of WIS
For me price/performance matters, which is why I have an 8-core FX-8120 OC'ed to 4.40 GHz. Looking at the numbers I would call that a FX-8190 or something.
Don't get me wrong, Bulldozer is a complex piece of work, something AMD's past CEO dreamed about one night after having a few beers :D
Good on AMD, because now they've somewhat developed a modular based design that "WILL" only get better and better with time.
www.kitguru.net/components/cpu/zardon/power-consumption-fx-8150-v-i5-2500k-v-i7-2600k/
My settings with C1E and core parking on clock me down from 4.7 to 1.5 and park 5 of 6 cores when they're not being used. Still it's obvious that any cpu usage takes more watts on FX than sandybridge. For me it's not a huge concern, I maybe use like 10-20 cents a week (1-2 kWh).
On a global scale It counts. Well, unless you consider all the wasted Intel Mobo's sitting around cause sockets change so much. They take power and resources to make too. So Intel, I think may save $10-$30 bucks in power depending on how much it's used, But people who have to upgrade their motherboard with CPU, waste as well. Can't really say what's better for mother earth.
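For anyone who wants to plug in their own numbers, here is a rough Python sketch of the cost arithmetic behind the 1-2 kWh a week figure. The 10 cents per kWh rate is only what those numbers imply (10-20 cents for 1-2 kWh), and the function name is made up for illustration; swap in your own rate.

```python
def weekly_cost_cents(kwh_per_week: float, cents_per_kwh: float = 10.0) -> float:
    """Cost of a week's energy use in cents.

    cents_per_kwh defaults to 10, an assumed rate implied by the
    10-20 cents for 1-2 kWh estimate; it is not a quoted tariff.
    """
    return kwh_per_week * cents_per_kwh


# The two ends of the estimate: 1 kWh and 2 kWh per week.
low_cents = weekly_cost_cents(1.0)
high_cents = weekly_cost_cents(2.0)

# Scale the upper end to a full year (52 weeks), converted to dollars.
yearly_high_dollars = high_cents * 52 / 100

print(f"weekly: {low_cents:.0f}-{high_cents:.0f} cents")
print(f"yearly (upper end): ${yearly_high_dollars:.2f}")
```

At that assumed rate the upper end works out to roughly ten dollars a year.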
On Servers this is a big deal. The power it takes to maintain server cpu loading is as big of a concern to the IT industry as gasoline prices are to you and me. I'm not all that up on 32nm server cpu power consumption. Anybody ???
I see the Intel X5690 is rated at 130w while the AMD 6274 is rated at 115w and they have similar passmark scores. A real life test would be best to consider which architecture is more power efficient on servers.
Super XP Read this page on RCM for Piledriver and tell me what you think
www.cyclos-semi.com/technology/
Don't get me wrong, I hope Piledriver is good, it would save me money on future upgrades if all I had to do was plop a PD into my board.
www.techpowerup.com/162843/AMD-A10-5800K-quot-Trinity-quot-APU-Tested.html
If FX does see 30% higher clocks with 15% less power like the A10-5800k, It would be HUGE for AMD. The fx-8350 performance would surpass the i7-3770 by 5% or so on passmark but use 25 more watts. Overclocking would be about 5.3 on air and 5.8 on water.
Apparently the piledriver trinity chip with RCM is a milestone in a "Tock" release WOW. I just wish the OS and memory specs were the same and the testing was more extensive.
as far as i know phenom II has 60-70% the ipc of SB (30-40% slower)
while BD has 90%ipc of phenom II/stars
this is why in some cases SB would perform 160% the performance of bulldozer when running around the same clock speed now if piledriver truly is 20% faster than stars clock-clock then it should sit at around 80% the ipc of SB which would mean SB would perform 15-25% faster
but amd promised 29% better x86 performance than llano in general and not clock-clock
and that is usually a best case scenario if you know amd marketing, so knowing they relied on clock speed its hard to compare different skus and efficiency because clockspeed and efficiency dont scale, meaning PD would be much more efficient in lower tdps than at the higher end
however if amd was comparing the fastest trinity with the fastest llanno then it makes alot of sense and its safe to assume that llano and PD have the same IPC, because an a8-3870k has 3.0ghz clockspeed, and the A10-5800k has a higher clock of 4.2ghz, exactly 29% faster clockspeed ;) instruction per CYCLE is being too generalized in my opinion as it doesnt tell real world performance, like bulldozer for example if looking at its hardware it should do 4 instructions per cycle vs 3 in phenom as each module has more hardware(ex: 4decoders vs 3in stars), however each cycle is longer than that of stars or SB due to its higher latency
and its designed that way so the shared resources can have enough time to feed data for 2 cores, meaning while one core is crunching on data, the other integer core would be getting fed from the shared resources
the latency was higher than expected tho as i believe, amd pretty much worked around that i believe(or thats what seems to be) by either shortening the cycles or allowing more entries(which is increased according to this chart ive seen, L1data became 64 from 32)
This also means just because BD-based CPU's could go from like 3GHz to 5GHz, doesn't mean the PD-based CPU's coming out at 4GHz are going to go to like 6GHz. The fact that the PD-based CPU's are launching at 4GHz is in fact the performance gained by RCM. In all likelihood PD-based CPU's will clock just as high as BD-based CPU's (maybe a tiny bit higher) while using less power. So it's definitely a win-win, but it's not some magical solution that's going to add 5-10-15-20% real world performance, it just allows for higher clock speeds, which generates additional performance.
www.xbitlabs.com/news/cpu/display/20120429101741_Trends_of_12_High_Performance_PC_Intel_Platform_Remains_Unchallenged.html
look at the chart, ivy bridge pretty much keeps beating the 8150 with 105%-180% the performance of an 8150(5%-80% faster)
not to mention many of those benchmarks are also multithreaded and thats where the gap is smaller
but in lightly threaded apps that can only use like 3 cores the bulldozer cant even turbo properly, im assuming thats where ivy bridge sees a good 80% performance over bulldozer
so its safe to say a bulldozer has 60% the performance of a SB/IB core in general (excluding the situations where bulldozer excels in new instruction sets and so on) but since it has more cores it ends up close to it in multithread
now here is where i even confuse myself, bulldozer having 60% the performance of SB does NOT mean sb is 40% faster, its actually more, i got confused when i first looked at the graph but it makes sense now
because if 60%(bd)-->100% then 100%(sb) --> X
if you cross multiply you end up with SB having 166.6% the performance of BD (100%) in single thread
so if piledriver is 30% faster than bulldozer in single thread, it lands at about 78% of SB/IB (0.6 x 1.3), so its still roughly 22% slower than SB/IB
however it will definitely have an edge in multithread against the I7's if thats the case
so its gonna be way more competitive than an 8150 thats for sure
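For anyone who wants to check the cross-multiplication, here is a quick Python sketch. The 60% and 30% inputs are just the rough estimates from this post, not measurements, and the function name is made up for illustration.

```python
def how_much_faster(share: float) -> float:
    """If chip A delivers `share` of chip B's performance (0.6 = 60%),
    return how much faster B is than A, as a fraction (0.6 -> ~0.667)."""
    if share <= 0:
        raise ValueError("share must be positive")
    return 1.0 / share - 1.0


# Estimate from the post: one bulldozer core does about 60% of an SB/IB core.
bd_vs_sb = 0.60
sb_over_bd = how_much_faster(bd_vs_sb)

# Hypothetical piledriver that is 30% faster than bulldozer in single thread.
pd_vs_sb = bd_vs_sb * 1.30
pd_deficit = 1.0 - pd_vs_sb
sb_over_pd = how_much_faster(pd_vs_sb)

print(f"SB is {sb_over_bd:.1%} faster than BD")
print(f"PD reaches {pd_vs_sb:.0%} of SB, i.e. {pd_deficit:.0%} slower")
print(f"SB is {sb_over_pd:.1%} faster than PD")
```

The point is that "X% slower" and "Y% faster" are not the same number: with these inputs a 22% deficit for piledriver means SB/IB is about 28% faster, not 22%.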
In most situations the difference between DDR3-1600 and DDR3-2133 is within the standard deviation. For that matter, you can see that unlike Llano, going from average memory to higher-end memory doesn't yield sizable performance gains. If you check the linked review in the article, it actually shows exactly what speed RAM they used, with timings; www.xbitlabs.com/articles/cpu/display/core-i7-3770k-i5-3570k_4.html#sect0
2. Why on Earth would it matter what GPU they are using if the CPU is what they are testing? If you look at the gaming tests, they did all largely CPU bound games. The best take away is Metro 2033, which is incredibly well threaded, and you can see that the low IPC of BD causes it to drop behind Intel, despite it having access to more threads at a time. I've read a lot of recent trends that indicate unless overclocked, FX Processors are already starting to bottleneck even single GPU solutions...
But like I said, at home, 2600k@4.0ghz till long after 4th Gen Ivy Bridge. No real reason to upgrade the system, GPU maybe, but not the proc.
Even AMD doesn't claim that piledriver will compete with Intel. Look at the advertising they are hyping, it's all about graphics with little to no mention of the CPU, because even if they made pd 40% faster than bulldozer it's still only about on par with or slower than Intel's i7.
Also 2 similar GPU's, one Amd and one Nvidia, (preferably 4) should be cross referenced on each platform for gaming benchmarks to get a decent cpu performance comparison.
Back to the subject of Piledriver. With the new instruction sets, HT assist, 10% FPU queue load increase, lower latency on certain instructions and several other improvements, Piledriver won't just have an increase in clockspeed and lower TDP; several apps will see a much bigger improvement than the clock speed and IPC enhancements alone would give. This document from AMD has all the programming improvements for Piledriver and it's huge for a "tock" release. 361 pages huge.
support.amd.com/us/Processor_TechDocs/47414_15h_sw_opt_guide.pdf
Some more fun reading ;)
support.amd.com/us/Processor_TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf