Thursday, March 22nd 2012

AMD A10-5800K "Trinity" APU Tested

Later this year, AMD will unveil its second-generation accelerated processing units (APUs) in the FM2 package, based on its brand-new "Piledriver" CPU and "Graphics CoreNext" GPU architectures. Among these, the part designed with overclockers in mind is the A10-5800K, which features an unlocked base clock multiplier, four x86-64 cores, a 3.80 GHz nominal clock speed with 4.20 GHz Turbo Core, and AMD Radeon HD 7660D graphics. Find out more about the lineup here.

INPAI got its hands on an A10-5800K APU and a supporting socket FM2 motherboard, and wasted no time in comparing it to the current-generation A8-3850. INPAI put the two chips through SuperPi 1M, to measure single-thread performance, and 3DMark 06, to measure embedded-GPU performance. The A10-5800K crunched SuperPi 1M in 23.775 s, while the A8-3850 did the same in 26.039 s. In 3DMark 06, the A10-5800K scored 9396 points, while the A8-3850 scored 6223. The inference that can be drawn from this little test is that Trinity has significantly faster graphics, not so much CPU (taking into account A10-5800K cores were clocked over 30% higher than those of the A8-3850).
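
As a rough sanity check on that inference, the ratios can be worked out directly from the scores above. This is only a sketch under stated assumptions: the 2.9 GHz figure is the A8-3850's stock clock, which the INPAI results do not state, and the A10-5800K is assumed to have run at its 3.8 GHz nominal clock rather than at Turbo Core speed.

```python
# Scores reported by INPAI, as quoted in the article.
superpi_a10_s = 23.775   # SuperPi 1M time in seconds (lower is better)
superpi_a8_s = 26.039
mark06_a10 = 9396        # 3DMark 06 points (higher is better)
mark06_a8 = 6223

# Clocks in GHz. 3.8 is the A10's nominal clock from the article;
# 2.9 is an ASSUMPTION (the A8-3850's stock clock, not given in the source).
clock_a10_ghz = 3.8
clock_a8_ghz = 2.9

cpu_speedup = superpi_a8_s / superpi_a10_s   # time ratio, so >1 means faster
gpu_speedup = mark06_a10 / mark06_a8
clock_ratio = clock_a10_ghz / clock_a8_ghz
per_clock = cpu_speedup / clock_ratio        # <1 means less work per clock

print(f"SuperPi speedup:        {cpu_speedup:.3f}x")
print(f"3DMark 06 speedup:      {gpu_speedup:.3f}x")
print(f"Clock advantage:        {clock_ratio:.3f}x")
print(f"SuperPi perf per clock: {per_clock:.3f}x")
```

Under those assumptions, the A10-5800K is roughly 9.5% faster in SuperPi and about 51% faster in 3DMark 06, against a clock advantage of about 31%. A single SuperPi run on what may be pre-production silicon is thin evidence, so the per-clock figure should be read as indicative only.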

Source: INPAI

75 Comments on AMD A10-5800K "Trinity" APU Tested

#1
Aquinus
Resident Wat-man
by: Andy77
Where did you learn to count bytes? Those are 8 megabytes of RAM used by SuperPI to make all the calculations possible. It even says "Allocated", which means a chunk out of the total given to the application to use.



It's backwards...
A10, CPU-Z x32, 2,5 GB RAM
A8, CPU-Z x64, 7,6 GB RAM
SuperPi is also a 32-bit application, so it can only see up to 2 GB of memory, which is exactly how much memory SuperPi sees in that case. There is something else going on here, though. If Trinity is running a 32-bit OS, the GPU could be mapping 1.5 GB for video, which would leave only 2.5 GB for everything else. Then SuperPi, once again, only sees 2 GB because it is still 32-bit.
#2
Super XP
by: meirb111
You didn't read; here is the quote: not so much CPU (taking into account A10-5800K cores were clocked over 30% higher than those of the A8-3850).
We will never know until the product is properly reviewed and released. Until then, this is all speculation.
AMD did state a 30% CPU performance improvement with Trinity's Piledriver cores versus current Llano. The GPU should offer about 50% to 60% more, says AMD.
#3
Andy77
by: Aquinus
SuperPi is also a 32-bit application, so it can only see up to 2 GB of memory, which is exactly how much memory SuperPi sees in that case. There is something else going on here, though. If Trinity is running a 32-bit OS, the GPU could be mapping 1.5 GB for video, which would leave only 2.5 GB for everything else. Then SuperPi, once again, only sees 2 GB because it is still 32-bit.
I was hinting that the user saw an 8-ish figure and guessed it was the total amount of system RAM, when those are only 8 MB of memory used by the app. It's easy to mistake billions for millions and end up at GB when thinking about it.

As to your suggestion, here's my competing theory: that 32-bit system actually has only 2 GB of RAM. 3DMark sees 2 GB, SuperPI sees 2 GB... it's not hard to figure Trinity ran on 2 GB of RAM.

Oh, and UMA can't be larger than 512 MB. The BIOS/UEFI doesn't allow it, and by default it takes up 256 MB, or 512 MB if a large amount of RAM is detected. So... no, it can't share 1,5 GB for video.

On the 64-bit system... SuperPI isn't able to detect the amount of RAM, mainly because the variable that stores that value is a 32-bit integer whose maximum unsigned value is 4,29 billion, when it needed a double to store the value of 8 billion bytes, the amount of RAM the Llano system has. It has nothing to do with the actual type of app; even if it's 32-bit, it still relies on the OS core functions to get those values.
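
To illustrate the overflow being described, here is a minimal sketch of what happens when a byte count larger than a 32-bit unsigned integer is truncated to 32 bits. How SuperPI actually stores the value is not known here; `as_uint32` is a hypothetical helper used purely for illustration. Note that a 64-bit integer would hold the value just as well as a double.

```python
UINT32_MAX = 2**32 - 1  # 4,294,967,295: largest unsigned 32-bit value


def as_uint32(n: int) -> int:
    """Return n as it would read back after truncation to 32 unsigned bits."""
    return n & 0xFFFFFFFF


ram_8_billion = 8_000_000_000  # "8 billion bytes", taken literally
ram_8_gib = 8 * 2**30          # 8 GiB, if the OS reports binary gigabytes

for label, value in (("8e9 bytes", ram_8_billion), ("8 GiB", ram_8_gib)):
    overflows = value > UINT32_MAX
    print(f"{label}: overflows={overflows}, stored as {as_uint32(value):,}")
```

Either way the stored figure bears no resemblance to the real amount of RAM, which fits the point that a nonsensical reading on the 8 GB system says nothing about the memory actually installed.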
#4
Aquinus
Resident Wat-man
by: Andy77
As to your suggestion, here's my competing theory: that 32-bit system actually has only 2 GB of RAM. 3DMark sees 2 GB, SuperPI sees 2 GB... it's not hard to figure Trinity ran on 2 GB of RAM.
Then why does 3DMark say the system has 2.5 GB of memory? Maybe 3 GB with 512 MB used for video?

by: Andy77
Oh, and UMA can't be larger than 512 MB. The BIOS/UEFI doesn't allow it, and by default it takes up 256 MB, or 512 MB if a large amount of RAM is detected. So... no, it can't share 1,5 GB for video.
Did not know that; I thought it allowed for more. However, it can share 1.5 GB of memory; it just doesn't use memory-mapped I/O for that if you're running a DX application. It will swap pages in and out of video memory if there isn't enough space. That is why, if you look at DXDiag, you will notice that the "available video memory" exceeds the amount on (or in this case, allocated to) the video card.

Edit: Here, this is what I mean. Keep in mind my 6870s have only 1 GB of on-board video memory, and before someone says that is how much system memory is available, I had 16 GB with something like 12 GB or 13 GB free. The same thing happened when I only had 8 GB on my last build.
#5
Dent1
by: meirb111
You didn't read; here is the quote: not so much CPU (taking into account A10-5800K cores were clocked over 30% higher than those of the A8-3850).
I saw the quotes. But there is no evidence to suggest the A10-5800K wouldn't have performed similarly at a lower clock.
#6
Andy77
by: Dent1
I saw the quotes. But there is no evidence to suggest the A10-5800K wouldn't have performed similarly at a lower clock.
BD/PD need higher clock to achieve normal IPC... but this also makes the clock-for-clock point useless. Because Husky internally has a shorter instruction pipeline than PD, it's normal for the chip to be clocked lower. Try to clock it at PD levels and you'll see instability issues. Not so for PD: because its pipeline is longer, it can handle a higher clock by default to achieve the same performance. The main difference is that internally, the more instructions you give a Husky to do at one time, the more it will choke, while PD, because of its longer pipeline, will handle "crowded" situations better.
#7
Vulpesveritas
by: Edgarstrong
Do you think I can use this new APU in an HTPC that will be used for Blu-ray videos most of the time and skip the graphics card?
It's overkill, but yes you can. You can also probably use a number of emulators without issue given the GPU.
by: meirb111
You didn't read; here is the quote: not so much CPU (taking into account A10-5800K cores were clocked over 30% higher than those of the A8-3850).
There may be diminishing returns on performance / clock with Trinity, we don't know.
Also, if this is real, it's pre-production silicon and therefore is unlikely to perform as well as what we'll see in retail.

by: Sihastru
No one is complaining about the lack of "features". But there are a few problems. It's not really any better than BD; there's a sizeable clock speed difference there that explains the slight boost in performance (and not in a flattering way).

As for GPGPU, I'm still waiting for something useful other than video encoding to leverage it. If you're telling me that it will offload the x86 instruction set to GCN, I'm gonna start laughing. Do you even believe that, or did you just want to make a short list longer?

I will attack even the "unlocked" part: at a turbo speed of 4.2 GHz, overclocking isn't really needed.

We'll talk pricing when I see it on the shelves. Lately AMD doesn't have a good track record when it comes to pricing.

It does however have a really nice GPU in it. It's its only saving grace.
Umm... if it is GCN, then this; http://en.wikipedia.org/wiki/Southern_Islands_(GPU_family)
"Support for x86 addressing with unified address space for CPU and GPU"
http://www.anandtech.com/show/4455/amds-graphics-core-next-preview-amd-architects-for-compute/6
"In terms of base features the biggest change will be that GCN will implement the underlying features necessary to support C++ and other advanced languages. As a result GCN will be adding support for pointers, virtual functions, exception support, and even recursion. These underlying features mean that developers will not need to “step down” from higher languages to C to write code for the GPU, allowing them to more easily program for the GPU and CPU within the same application. For end-users the benefit won’t be immediate, but eventually it will allow for more complex and useful programs to be GPU accelerated."
#8
Andy77
by: Aquinus
Maybe 3 GB with 512 MB used for video?
That might be it.


by: Aquinus
However, it can share 1.5 GB of memory; it just doesn't use memory-mapped I/O for that if you're running a DX application.
It can share, but I believe that is done dynamically. When using standard applications, and 3DMark is partially just that, the shared RAM is included in the main system RAM displayed while the "dedicated" RAM is subtracted.

Anyway, by the looks of it, it's 3GB vs 8GB of TOTAL system RAM. I don't know if 3DMark would be affected by it, it's still a lot of RAM for just one benchmark application and it still has 512 MB of exclusive RAM for video.
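
The competing guesses in this exchange can be checked with simple arithmetic. The sketch below assumes the 2.5 GB that 3DMark reports is just installed RAM minus the UMA carve-out for video; that model is an assumption for illustration, and it ignores other address-space reservations a 32-bit OS makes. The candidate sizes are likewise only the values floated in this thread plus a few common ones.

```python
MB_PER_GB = 1024
reported_mb = int(2.5 * MB_PER_GB)  # the 2.5 GB figure 3DMark reports

installed_gb_options = (2, 3, 4, 8)      # candidate installed RAM sizes
uma_mb_options = (256, 512, 1024, 1536)  # candidate video carve-outs

# Keep every (installed, UMA) pair whose difference matches the report.
matches = [
    (installed_gb, uma_mb)
    for installed_gb in installed_gb_options
    for uma_mb in uma_mb_options
    if installed_gb * MB_PER_GB - uma_mb == reported_mb
]

for installed_gb, uma_mb in matches:
    print(f"{installed_gb} GB installed with {uma_mb} MB for video")
```

Under that simple model, both the "3 GB with 512 MB for video" and the "4 GB with 1.5 GB for video" readings fit the reported figure, while a plain 2 GB system does not.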


by: Vulpesveritas
For end-users the benefit won’t be immediate, but eventually it will allow for more complex and useful programs to be GPU accelerated."
What I was thinking since I first saw the specs pop up in the wild... imagine if consoles use GCN. It will allow devs to code better engines for all platforms and get rid of the "it's a console port" stench.
#9
trickson
OH, I have such a headache
WOW that is SWEET!! Really looks great! Got to say that AMD has one hell of an APU there! WOW fantastic!
#10
faramir
by: Andy77
BD/PD need higher clock to achieve normal IPC...
w00tL0L ??? Spare us the nonsense if you don't know what you're talking about. Thank you.
#11
Dent1
by: Andy77
BD/PD need higher clock to achieve normal IPC.
What is a normal IPC? I googled "Normal IPC" and I couldn't find anything!

Who are you to decide what a "normal IPC" is?
#12
trickson
OH, I have such a headache
by: faramir
w00tL0L ??? Spare us the nonsense if you don't know what you're talking about. Thank you.
by: Dent1
What is a normal IPC? I googled "Normal IPC" and I couldn't find anything!

Who are you to decide what a "normal IPC" is?
Can we just stay on topic for once? Please, let's stop all this stuff.
I think this is really sweet! Really! AMD has one kickass APU! Fantastic job AMD! :rockout:
#13
Vulpesveritas
So I'm wondering how it will fare with 2133 MHz RAM, as its memory controller is supposed to support it, and AMD should be coming out with some low-profile-heatsink 2133 MHz Radeon-branded RAM before Trinity comes out.
Then too... wondering whether it is VLIW4 as originally said, or GCN as it would appear to be now with the "384 Radeon cores" statement, which would suggest GCN, especially given the A8-3850 has 400 VLIW5 shader units.
Also, I wonder how it overclocks, given that it appears to be 15% more energy efficient than Llano.
#14
JMccovery
by: Vulpesveritas
Look at the allocated memory though. Seems odd.
Oh, quick speculation: the latest rumors I've seen say the A10 will have 384 Radeon cores at 800 MHz. I don't see how that could be VLIW4, especially as I don't see how they could run hybrid Xfire with it.

However, if those values are correct, then it could easily be GCN, given it would be the same clock as a 7750, with 75% of the cores, and with 2133 MHz RAM you would have the same memory bandwidth.

Still doesn't explain the L1 cache, and I'm wondering how it does TDP-to-TDP against Llano, with it using ~15% less power. Lower voltage.
Looks like IPC is almost that of Llano, though not quite. But wondering again on that L1...
...
so it may be real. not sure.
I think it is VLIW4. Evergreen, Barts, Turks, Caicos and Llano were VLIW5: a shader block or 'core' contained 16 5-way ALUs (80), whereas with Cayman, a 'core' contains 16 4-way ALUs (64), of which Trinity has 6 (6 × 64 = 384).
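
The shader arithmetic in that argument can be written out as a small sketch. The block counts and widths are the ones given in the comment; the five-block figure for Llano is derived here from the 400 VLIW5 shader units mentioned earlier in the thread, and `shader_count` is a hypothetical helper name.

```python
ALUS_PER_BLOCK = 16  # VLIW ALUs per shader block ("core"), per the comment


def shader_count(blocks: int, vliw_width: int) -> int:
    """Total stream processors for a number of blocks of n-way VLIW ALUs."""
    return blocks * ALUS_PER_BLOCK * vliw_width


vliw5_per_block = shader_count(1, 5)  # Evergreen/Llano-style block
vliw4_per_block = shader_count(1, 4)  # Cayman-style block

trinity_vliw4 = shader_count(6, 4)    # six VLIW4 blocks, as proposed
llano_blocks = 400 // vliw5_per_block # blocks implied by 400 VLIW5 shaders

print(vliw5_per_block, vliw4_per_block, trinity_vliw4, llano_blocks)
```

The count of 384 is consistent with six VLIW4 blocks, but as the replies point out, 384 is also a plausible GCN figure, so the number alone does not settle which architecture is in use.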
#15
Vulpesveritas
by: JMccovery
I think it is VLIW4. Evergreen, Barts, Turks, Caicos and Llano were VLIW5: a shader block or 'core' contained 16 5-way ALUs (80), whereas with Cayman, a 'core' contains 16 4-way ALUs (64), of which Trinity has 6 (6 × 64 = 384).
The thing that I don't understand about that is what the heck could it xfire with? Not to mention I'm uncertain how we would be seeing a near 50% increase in graphics performance using 384 VLIW4 shaders vs 400 VLIW5 shaders.

Also, look at Southern Islands, specifically Cape Verde: the 7750 has 512 cores. 384 is exactly 75% of 512, and they're clocked at 800 MHz, the same as the cores in the 7750. Not to mention the similarity in the memory data rate: using the 2133 MHz memory controller, the memory bandwidth would be the same between the two graphics processors. And given that, it would easily xfire with the 7750.
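
The core ratio and the bandwidth comparison can be checked with a short calculation. This is a sketch resting on assumptions that are not in the thread: dual-channel DDR3-2133 with two 64-bit channels for the APU, and a 128-bit GDDR5 interface at an effective 4500 MT/s for the HD 7750. Peak theoretical bandwidth is taken as transfer rate times bus width in bytes.

```python
def peak_bandwidth_gbs(mega_transfers_per_s: float, bus_width_bits: int) -> float:
    """Peak theoretical bandwidth in GB/s (10^9 bytes per second)."""
    return mega_transfers_per_s * 1e6 * (bus_width_bits / 8) / 1e9


core_ratio = 384 / 512  # rumored A10 shader count vs HD 7750

# ASSUMPTION: dual-channel DDR3-2133, i.e. 2 x 64-bit channels.
apu_bw = peak_bandwidth_gbs(2133, 128)

# ASSUMPTION: HD 7750 with 128-bit GDDR5 at 4500 MT/s effective.
hd7750_bw = peak_bandwidth_gbs(4500, 128)

print(f"Core ratio:        {core_ratio:.0%}")
print(f"APU, DDR3-2133:    {apu_bw:.1f} GB/s (shared with the CPU)")
print(f"HD 7750, GDDR5:    {hd7750_bw:.1f} GB/s")
```

The 75% core ratio holds. If the assumed memory configurations are right, though, the APU would have a bit under half the peak bandwidth of the discrete card rather than the same amount, so the bandwidth part of the comparison is worth treating with caution.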

Also remember Trinity was originally slated for a Q1 2012 release. It's been pushed back, and not for manufacturing issues. If I were to guess, AMD would push GCN for improved performance overall and push 'fusion' faster.
#16
Dent1
by: trickson
Can we just stay on topic for once? Please, let's stop all this stuff.
I think this is really sweet! Really! AMD has one kickass APU! Fantastic job AMD! :rockout:
Well, I don't see you quoting Andy77, who steered the topic away with his IPC talk.

And yes fantastic APUs indeed.
#17
mastrdrver
The GPU is VLIW4. If we assume that even the Trinity CPU part is slower than the Llano CPU part, it is encouraging that the GPU score increased by so much. This overall look, though, does bode well for the rumor that 17 W Trinity equals 25/35 W Llano laptop parts.

I wish the IB 3DM 06 scores had been leaked in the other thread.

by: Andy77
BD/PD need higher clock to achieve normal IPC...
That is not a fact but an assumption. Since they did not clock Llano and Trinity the same, it's impossible to tell if the IPC went up, down, or stayed the same... or, as you put it, achieved "normal IPC". Whatever that means.
#18
Drac
How much faster than an HD 5770 would it be? I just want to replace my entire system, and it would be good to not have to buy a new GPU these days.
#19
Vulpesveritas
by: Drac
How much faster than an HD 5770 would it be? I just want to replace my entire system, and it would be good to not have to buy a new GPU these days.
"Faster than an HD 5770?" Given that the Llano GPU is slightly slower than a 6570... and that this is considered a 7660, it will be a slightly slower GPU than the 5770. And if it's VLIW4, it probably won't be able to do any xfire either.
If it's GCN though you get to xfire it with a 7750. lol
#20
JMccovery
by: Vulpesveritas
The thing that I don't understand about that is what the heck could it xfire with? Not to mention I'm uncertain how we would be seeing a near 50% increase in graphics performance using 384 VLIW4 shaders vs 400 VLIW5 shaders.

Also, look at Southern Islands, specifically Cape Verde: the 7750 has 512 cores. 384 is exactly 75% of 512, and they're clocked at 800 MHz, the same as the cores in the 7750. Not to mention the similarity in the memory data rate: using the 2133 MHz memory controller, the memory bandwidth would be the same between the two graphics processors. And given that, it would easily xfire with the 7750.

Also remember Trinity was originally slated for a Q1 2012 release. It's been pushed back, and not for manufacturing issues. If I were to guess, AMD would push GCN for improved performance overall and push 'fusion' faster.
I don't think backporting the GCN design to the 32 nm process would be feasible; it would be 'easier' (for AMD, not GF) to shrink a Northern Islands derivative to 32 nm. GCN will make its APU debut in Kaveri, Kabini and Temash later next year.
#21
Vulpesveritas
by: JMccovery
I don't think backporting the GCN design to the 32 nm process would be feasible; it would be 'easier' (for AMD, not GF) to shrink a Northern Islands derivative to 32 nm. GCN will make its APU debut in Kaveri, Kabini and Temash later next year.
That was the original plan. But the original plan also had Piledriver in a 10-core processor, aka Komodo, and AMD decided to change that. The simple thing is, we won't know 100% for sure until AMD releases the processor. I hope that it is GCN so we can xfire it, because otherwise I don't see quite as much value compared to the last generation, especially when the desktop prices drop, when we can get a used 6570 and xfire it with a 3870K and pretty much get the same performance for less. Sure, Trinity is more power efficient, but eh.
#22
mastrdrver
Changing from 10 cores to 8 is orders of magnitude easier than porting an architecture designed for bulk over to SOI.

If you want to use a car analogy, it would be like adding another engine option to a car and saying that is just as easy as moving the car to an entirely different assembly facility, one it was never designed to go down. It's possible, but it would take a lot of engineering resources to accomplish.
#23
Drac
by: Vulpesveritas
"Faster than an HD 5770?" Given that the Llano GPU is slightly slower than a 6570... and that this is considered a 7660, it will be a slightly slower GPU than the 5770. And if it's VLIW4, it probably won't be able to do any xfire either.
If it's GCN though you get to xfire it with a 7750. lol
I got a new HD 5770 for 140 euros almost 2 years ago. I want to spend more or less the same money on a new card that is more efficient and faster than the 5770, but it's impossible these days, and I won't spend 150 euros again after 2 years for the same performance. Taking this CPU+GPU was my hope to do the trick :(
#24
MikeMurphy
by: Drac
How much faster than an HD 5770 would it be? I just want to replace my entire system, and it would be good to not have to buy a new GPU these days.
MUCH MUCH SLOWER than a 5770. Your 5770 is still a really good card. Keep it.

I recently added a 5750 to my A8-3850. The difference is substantial.
#25
MikeMurphy
by: Andy77
Oh, and UMA can't be larger than 512 MB. The BIOS/UEFI doesn't allow it, and by default it takes up 256 MB, or 512 MB if a large amount of RAM is detected. So... no, it can't share 1,5 GB for video.
This is wrong. The video memory is typically user-selectable. My A75M-UD2H was set to 1 GB.