
Benchmarks Surface for AMD Ryzen 4700G, 4400G and 4200G Renoir APUs

Joined
Feb 20, 2019
Messages
7,309 (3.86/day)
System Name Bragging Rights
Processor Atom Z3735F 1.33GHz
Motherboard It has no markings but it's green
Cooling No, it's a 2.2W processor
Memory 2GB DDR3L-1333
Video Card(s) Gen7 Intel HD (4EU @ 311MHz)
Storage 32GB eMMC and 128GB Sandisk Extreme U3
Display(s) 10" IPS 1280x800 60Hz
Case Veddha T2
Audio Device(s) Apparently, yes
Power Supply Samsung 18W 5V fast-charger
Mouse MX Anywhere 2
Keyboard Logitech MX Keys (not Cherry MX at all)
VR HMD Samsung Oddyssey, not that I'd plug it into this though....
Software W10 21H1, barely
Benchmark Scores I once clocked a Celeron-300A to 564MHz on an Abit BE6 and it scored over 9000.
The main limiting factor for APU performance is memory bandwidth, not the CU count. DDR5 might help there, but it will still offer far less bandwidth than a current mid-range or high-end graphics card.

PS4/5 and Xbox one * have way more memory bandwidth than any pc APU.
For 2000 and 3000 series APUs, the CU count was bottlenecked by most vendors pairing it with DDR4-2400. Two channels of DDR4-2400 were enough of a limit that moving from 8CU to 10CU or even 11CU didn't scale well.

The added CUs were faster, just not fast enough to matter.

DDR4-3200 seems to be the baseline for Renoir designs, so there's already about 33% more bandwidth than with DDR4-2400. And if you look at the Athlon 3000G with only 3CUs, it was very definitely short on CUs. Even in single-channel mode, that thing couldn't use the pathetic bandwidth it was given because 3CUs isn't enough, and adding more RAM or faster RAM did very little for the 3000G.

I'm guessing that 6CU configurations of Renoir will not be bandwidth starved like Vega10 and Vega11 were, and that Renoir's Vega8 with LPDDR4X will be entirely limited by the small number of CUs.
Budget gamers and laptop users with decent cooling will wish there was a Vega12 or Vega15 offering, as was originally rumoured. Compared to the rest of the APU, the actual CUs are tiny, so downgrading from 11 in the Raven Ridge and Picasso APUs to 8 in Renoir is a deep cut that hasn't really reduced die area by all that much. False economy IMO, but only time will tell if AMD correct that mistake in the 5000-series or abandon higher GPU capabilities in the APUs going forward.
 
Joined
Dec 18, 2015
Messages
142 (0.05/day)
System Name Avell old monster - Workstation T1 - HTPC
Processor i7-3630QM\i7-5960x\Ryzen 3 2200G
Cooling Stock.
Memory 2x4Gb @ 1600Mhz
Video Card(s) HD 7970M \ EVGA GTX 980\ Vega 8
Storage SSD Sandisk Ultra li - 480 GB + 1 TB 5400 RPM WD - 960gb SDD + 2TB HDD
People buy AMD APUs just so they can play games casually without buying a discrete GPU. AMD can get away with not including an iGPU at all in their processors, because those who buy them get a good graphics card anyway. But APU users expect decent gaming performance out of the box.

Yeah, I also think Renoir is only an upgrade for laptops, due to space limitations and TDP, and maybe for ultra-compact office PCs. For a regular desktop, AMD would impress with more graphics power... I hope the extra clock and improved memory controller (4266MHz+) will counterbalance the reduction in shaders.
 
Joined
Oct 12, 2005
Messages
682 (0.10/day)
I don't think you are correct, and what you said is a bit of a stretch. CUs are just as important as memory bandwidth. You can't tell me that making the memory twice as fast would double the APU's graphics performance? You would still need more CUs to achieve that. It is more of a balance thing than just memory bandwidth. Besides, I'm sure there is no way the APU could pull off high-end graphics with 10 CUs just by increasing memory bandwidth to ridiculous speeds. You still need processing power. And an APU matching discrete graphics (even mid-range) is not happening anytime soon, or maybe ever.
Looking at the console side of things: you should not compare any console to a PC.

You miss the point. Right now they have no problem putting more CUs in a chip. The problem is getting more bandwidth. If they had the bandwidth, it would make sense to add more CUs. There is no point adding more CUs if they end up bandwidth starved.

It's true that bandwidth increases with DDR4-3200, but the new SKUs also have double the cores. And dual-channel DDR4-3200 is just 50 GB/s, where a Radeon RX 560 or GeForce GTX 1050 has about 112 GB/s of dedicated bandwidth. So not only does an APU struggle to get half the bandwidth, it has to share it with a 6- or 8-core CPU.

They reduced the CU count because they knew they could increase the frequency and deliver approximately the same performance, which is limited by the available bandwidth anyway.

We will see larger APUs when memory bandwidth increases. But unless they use on-chip memory, I doubt we will see a very large chip. Also, the market for large APUs outside consoles is very niche and probably not worth the investment.
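
For reference, the peak figures quoted in this post can be reproduced with a quick back-of-the-envelope calculation. The sketch below assumes a 128-bit bus for dual-channel DDR4 and a 128-bit, 7000 MT/s GDDR5 configuration for the RX 560. The function name is made up for illustration, and sustained real-world bandwidth will be lower than these theoretical peaks.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


# Dual-channel DDR4 is assumed to be 2 x 64-bit = 128 bits wide.
ddr4_2400 = peak_bandwidth_gbs(2400, 128)
ddr4_3200 = peak_bandwidth_gbs(3200, 128)

# Assumed RX 560 configuration: 7000 MT/s effective GDDR5 on a 128-bit bus.
rx_560 = peak_bandwidth_gbs(7000, 128)

print(f"DDR4-2400 dual channel: {ddr4_2400:.1f} GB/s")
print(f"DDR4-3200 dual channel: {ddr4_3200:.1f} GB/s")
print(f"RX 560 (GDDR5):         {rx_560:.1f} GB/s")
```

Under these assumptions, dual-channel DDR4-3200 comes out at a little under half of the discrete card's peak, before any sharing with the CPU.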
 
Joined
Feb 20, 2019
Messages
7,309 (3.86/day)
They reduced the CU count because they knew they could increase the frequency and deliver approximately the same performance, which is limited by the available bandwidth anyway.
Reviews of current Renoir laptops show that the higher clocks wreck the power budget. All three reviews I've read show throttling under GPU load, and the boost clocks are too short-lived to be useful, thanks to the ridiculous frequencies blowing through the STAPM budget in under five minutes. Honestly, GPU performance after 10 minutes is worse than the 3700U and 3500U that Renoir is supposed to be replacing, because those clocks are only of any use for short benchmark runs.
We will see larger APUs when memory bandwidth increases.
Bandwidth has increased 78% - that's 2400MT/s to 4266MT/s with new LPDDR4X. What more do you want?!
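
A quick check of that percentage, as a rough sketch: it assumes the memory bus is the same width in both cases, so the bandwidth gain is simply the ratio of the transfer rates. The helper name is hypothetical, and the DDR4-3200 case is included only for comparison.

```python
def percent_increase(old_mts: float, new_mts: float) -> float:
    """Percentage gain in transfer rate, assuming an unchanged bus width."""
    return (new_mts - old_mts) / old_mts * 100


print(f"DDR4-2400 -> LPDDR4X-4266: {percent_increase(2400, 4266):.0f}%")
print(f"DDR4-2400 -> DDR4-3200:    {percent_increase(2400, 3200):.0f}%")
```

The 78% figure only applies to designs that actually ship with LPDDR4X-4266. A DDR4-3200 design gains considerably less over DDR4-2400.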
 
Joined
Oct 12, 2005
Messages
682 (0.10/day)
AMD chose to maximise CPU performance at the expense of the integrated GPU because they think people who need high GPU performance will use a dedicated GPU anyway.

LPDDR4X at 4266MT/s costs a fortune. Laptops are shipping with DDR4-3200 instead, and that is way less than 78%.

On desktop you could buy a high-speed kit, but you are better off spending the extra cost on a dedicated GPU, which will always give you more performance.

There are rare scenarios where you want a very small form factor and a dedicated GPU might not be possible, but these are rare and you don't build a lineup for special cases.

At least not for now.
 
Joined
Feb 20, 2019
Messages
7,309 (3.86/day)
But the 4800U is a premium variant, and that DOES come with LPDDR4X.

If you're talking about the 4700U and below that are more commonly seen with DDR4-3200, then those are only 7CU variants. They've lost 30% of their CUs and gained about 33% bandwidth by going from DDR4-2400 to DDR4-3200.

I expect the 4700U to be the common choice for midrange laptops at the same price as the previous 3700U. A 30% lower CU count and 33% more bandwidth means that memory bottlenecks aren't the issue they were with Raven Ridge and Picasso.
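
To put a rough number on that trade-off, the sketch below combines the two changes into a single "bandwidth per CU" ratio. It assumes a 10CU part on DDR4-2400 and a 7CU part on DDR4-3200 with the same bus width, and it ignores GPU clock changes and CPU memory traffic. The function name is invented for this example.

```python
def per_cu_bandwidth_ratio(old_cus: int, new_cus: int,
                           old_mts: float, new_mts: float) -> float:
    """Relative change in memory bandwidth available per CU."""
    bandwidth_ratio = new_mts / old_mts  # same bus width assumed
    cu_ratio = new_cus / old_cus
    return bandwidth_ratio / cu_ratio


ratio = per_cu_bandwidth_ratio(old_cus=10, new_cus=7, old_mts=2400, new_mts=3200)
print(f"Bandwidth per CU: {ratio:.2f}x the previous generation")
```

Under these assumptions each CU has roughly 1.9 times the bandwidth it had before, which is the sense in which the memory bottleneck has eased.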
 
Joined
Oct 12, 2005
Messages
682 (0.10/day)
I still have to see a laptop with LPDDR4X. It seems expensive enough for now that even with the 4900, OEMs prefer DDR4-3200.

Also, bandwidth is shared with the CPU, and the 3700U is a 4-core/8-thread CPU where the 4700U is an 8-core/8-thread CPU. Each CPU core issues its own memory requests, putting more pressure on the memory.

Also, dual-channel LPDDR4X-4266 would give you about 70 GB/s. It's getting there, but it's still far from what an RX 560 or a 1050 has (around 110 GB/s, as I said). And since we are still sharing it with the CPU, we are still far from the bandwidth of a dedicated GPU.
 
Joined
Feb 20, 2019
Messages
7,309 (3.86/day)
we are still far from the bandwidth of dedicated GPU.
The RX560 has 16CU for its 112GB/s, so 8CU with 70GB/s isn't unreasonable, even shared with the CPU.
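
A small sketch of that per-CU comparison, using the 112 GB/s and 70 GB/s figures from this thread. The `cpu_share` parameter is a hypothetical simplification that treats CPU traffic as a fixed fraction of total bandwidth. Real contention is not that tidy, so treat the result as a rough guide only.

```python
def gbs_per_cu(total_gbs: float, cus: int, cpu_share: float = 0.0) -> float:
    """Bandwidth per CU after the CPU takes a fixed (assumed) share."""
    return total_gbs * (1.0 - cpu_share) / cus


def break_even_cpu_share(apu_gbs: float, apu_cus: int, target_gbs_per_cu: float) -> float:
    """CPU share at which the APU's per-CU bandwidth drops to the target."""
    return 1.0 - target_gbs_per_cu * apu_cus / apu_gbs


rx_560 = gbs_per_cu(112, 16)   # dedicated card, no CPU sharing
renoir = gbs_per_cu(70, 8)     # 8CU APU with LPDDR4X, CPU idle
share = break_even_cpu_share(70, 8, rx_560)

print(f"RX 560:       {rx_560:.2f} GB/s per CU")
print(f"Renoir 8CU:   {renoir:.2f} GB/s per CU")
print(f"Break-even CPU share: {share:.0%}")
```

With these inputs, the APU keeps more bandwidth per CU than the RX 560 as long as the CPU takes less than about a fifth of the total.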

The extra cores on Renoir won't make a difference to the CPU's utilisation of the memory bandwidth either; an application will use the bandwidth it needs no matter how many cores it's running on.

Renoir has 33-78% more bandwidth than previous APUs, depending on which RAM is used.
The CPU will use less memory bandwidth than Zen and Zen+ APUs because Zen2 has better prediction and twice the L3 cache, requiring fewer calls to main memory.
Put those two things together and Renoir's effective bandwidth gains are anywhere from 55% to 100% better.

You can't seem to admit that memory bandwidth has improved significantly enough to make a difference no matter how much I spell it out for you. Why is that?
 
Joined
Oct 12, 2005
Messages
682 (0.10/day)
Why would AMD spend money on a platform with 4 memory channels (Threadripper) if more cores didn't mean more memory bandwidth? Also, a program running faster on a single core can increase the bandwidth requirement (or more memory bandwidth can increase single-threaded performance). In a game, each thread has its own memory accesses. If you want to run these threads faster, you have to move the data faster, which can mean using more memory bandwidth. I think you are confusing memory bandwidth with memory usage.

The L3 cache on Ryzen is indeed larger than on previous CPUs, but it is still fairly small relative to many applications' data sets. Also, it's a victim cache (it contains data evicted from the L2 cache).

The prefetchers aren't there to save bandwidth. They are there to lower latency. The data still needs to be moved around. They don't make the data magically appear in the cache. The main purpose of a cache is not to save bandwidth. It's to save latency. This is where Zen 2 gets the most benefit from its cache vs Zen 1. But Zen 2 still benefits greatly from more memory bandwidth.

It's true that the global bandwidth of the platform has increased (it could be better if LPDDR4X wasn't so expensive and was more available), but what counts is the actual usable bandwidth available to the APU.

Something to think about: let's say what you say is true and CPUs do not really need that much bandwidth. Then why do CPU manufacturers even bother with 2 DDR4 channels? The cost of adding another channel to a motherboard and a socket is significant. If they could get away without it, they would.

But I agree that if an APU had 4 memory channels and still the same number of cores as a desktop mid-range part (8 cores / 16 threads), they could definitely use a bit more silicon real estate. I could even say that if it were cost effective on PC, they could add a GPU chiplet to the package.

The key there is cost. APUs aren't designed to cost a fortune.
 
Joined
May 31, 2016
Messages
4,325 (1.50/day)
Location
Currently Norway
System Name Bro2
Processor Ryzen 5800X
Motherboard Gigabyte X570 Aorus Elite
Cooling Corsair h115i pro rgb
Memory 16GB G.Skill Flare X 3200 CL14 @3800Mhz CL16
Video Card(s) Powercolor 6900 XT Red Devil 1.1v@2400Mhz
Storage M.2 Samsung 970 Evo Plus 500MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD / LG 27GN950
Case Fractal Design G
Audio Device(s) Realtec 5.1
Power Supply Seasonic 750W GOLD
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
You miss the point. Right now they have no problem putting more CUs in a chip. The problem is getting more bandwidth. If they had the bandwidth, it would make sense to add more CUs. There is no point adding more CUs if they end up bandwidth starved.

It's true that bandwidth increases with DDR4-3200, but the new SKUs also have double the cores. And dual-channel DDR4-3200 is just 50 GB/s, where a Radeon RX 560 or GeForce GTX 1050 has about 112 GB/s of dedicated bandwidth. So not only does an APU struggle to get half the bandwidth, it has to share it with a 6- or 8-core CPU.

They reduced the CU count because they knew they could increase the frequency and deliver approximately the same performance, which is limited by the available bandwidth anyway.

We will see larger APUs when memory bandwidth increases. But unless they use on-chip memory, I doubt we will see a very large chip. Also, the market for large APUs outside consoles is very niche and probably not worth the investment.
I know what you are saying; I just disagree with your premise. It is not all about memory. I bet that if AMD had decided to put more CUs in Renoir, it would have been faster even if the memory stayed as it is. Sure, the memory plays a role here, but it is not just the memory like you've said. I didn't miss the point; I'm just sharing what I think. Still, it is more of a balance than one element in the equation playing the major role.
Why would AMD spend money on a platform with 4 memory channels (Threadripper) if more cores didn't mean more memory bandwidth? Also, a program running faster on a single core can increase the bandwidth requirement (or more memory bandwidth can increase single-threaded performance). In a game, each thread has its own memory accesses. If you want to run these threads faster, you have to move the data faster, which can mean using more memory bandwidth. I think you are confusing memory bandwidth with memory usage.
I don't think the memory and extra channels are for speeding things up (well, in some ways yes). It seems like you are hung up on graphics and memory combinations. TR's memory channels are not there to utilize the cores. The TRs are not for gaming; with all those cores and that performance, they can run virtual machines like nothing else. This is where the memory channels come into play, increasing the bandwidth and preventing stalls. Graphics or games have nothing to do with that.
I've got 2 TR 3970Xs. Bought them not long ago, and that's how I find them useful.
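
The "balance" argument in this exchange can be illustrated with a simplified roofline model, where attainable throughput is the lower of the compute limit and the bandwidth limit. This is only a sketch under stated assumptions: 64 shaders per CU and 2 FLOPs per shader per clock, a 1.75 GHz GPU clock, 51.2 GB/s of bandwidth given entirely to the GPU, and two made-up arithmetic intensities (FLOPs per byte of memory traffic). None of these workload numbers come from a real game.

```python
def peak_gflops(cus: int, clock_ghz: float) -> float:
    """Theoretical FP32 peak: CUs x 64 shaders x 2 FLOPs per clock."""
    return cus * 64 * 2 * clock_ghz


def attainable_gflops(cus: int, clock_ghz: float,
                      bandwidth_gbs: float, flops_per_byte: float) -> float:
    """Simplified roofline: limited by compute or by memory, whichever is lower."""
    compute_limit = peak_gflops(cus, clock_ghz)
    memory_limit = bandwidth_gbs * flops_per_byte
    return min(compute_limit, memory_limit)


# Hypothetical workloads: one memory-heavy, one compute-heavy.
for intensity in (20, 40):
    for cus in (8, 11):
        result = attainable_gflops(cus, 1.75, 51.2, intensity)
        print(f"{intensity} FLOPs/byte, {cus} CUs: {result:.0f} GFLOPS")
```

In this toy model the memory-heavy case gains nothing from the extra CUs, while the compute-heavy case does gain, although less than the CU count alone would suggest. Real games mix both kinds of work, which is consistent with both sides of the argument having a point.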
 
Joined
Feb 20, 2019
Messages
7,309 (3.86/day)
Indeed. The Vega8 and Vega10 in the 2500U and 2700U perform differently. For that generation the DDR4-2400 and the 15W TDP were both bottlenecks, but even at 15W the 2700U is about 10% faster and that grows as you relax the TDP.

The 2000 U-series need around 22-23W to avoid throttling the Vega CUs in typical gaming/benchmarks. Thankfully both will boost at 25W for short periods, and during boost the 2700U is almost 15% quicker than the 2500U on the first run of benchmarks. That's still not the gain that 25% extra CUs should give, but we're still limited by DDR4-2400, so you have to expect imperfect scaling.
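
One way to express that imperfect scaling is as a ratio of performance gained to resources added. The sketch below uses the approximate 10% and 15% gains and the 25% CU increase mentioned above. The function name is hypothetical, and the inputs are rough forum figures rather than measured data.

```python
def scaling_efficiency(performance_gain: float, resource_gain: float) -> float:
    """Fraction of the added resources that shows up as extra performance."""
    return performance_gain / resource_gain


# Vega 10 vs Vega 8: 25% more CUs.
at_15w = scaling_efficiency(0.10, 0.25)
at_boost = scaling_efficiency(0.15, 0.25)

print(f"Sustained 15W: {at_15w:.0%} of ideal scaling")
print(f"25W boost:     {at_boost:.0%} of ideal scaling")
```

A value of 100% would mean the extra CUs translate fully into performance. Anything lower points to another limit, such as power or memory bandwidth.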

Edit:
I just found this video -
1 additional Vega CU adds around 5-15% depending on when you pause, but it's definitely around 7% faster on average with 2-3 more fps at most pauses.
I can only imagine how great a Renoir Vega10 would have been for ultrabook gaming....
 
Joined
Oct 12, 2005
Messages
682 (0.10/day)
16.7% increase in CU count (6 on the Ryzen 5 4500U and 7 on the Ryzen 7 4700U), 6.7% faster GPU frequency (1500MHz on the 4500U vs 1600MHz), but only 7% more performance (not even counting the faster CPU there).

Looks to me like there is a bottleneck somewhere :confused:

I wonder where it is
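
To make the gap explicit, here is a rough sketch that multiplies the two increases together. It assumes the usual Vega layout of 64 shaders per CU and 2 FLOPs per shader per clock, and takes the 7% observed gain from the post above. The function name is illustrative only.

```python
def fp32_tflops(cus: int, clock_mhz: float,
                shaders_per_cu: int = 64, flops_per_clock: int = 2) -> float:
    """Theoretical FP32 throughput in TFLOPS."""
    return cus * shaders_per_cu * flops_per_clock * clock_mhz * 1e6 / 1e12


r5_4500u = fp32_tflops(6, 1500)
r7_4700u = fp32_tflops(7, 1600)

theoretical_gain = r7_4700u / r5_4500u - 1
observed_gain = 0.07  # figure quoted in the post above

print(f"4500U: {r5_4500u:.3f} TFLOPS, 4700U: {r7_4700u:.3f} TFLOPS")
print(f"Theoretical gain: {theoretical_gain:.1%}, observed: {observed_gain:.0%}")
```

On paper the 4700U has about 24% more shader throughput, so a 7% real-world gain means less than a third of the extra compute is showing up in frame rates.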
 
Joined
May 15, 2020
Messages
697 (0.48/day)
Location
France
System Name Home
Processor Ryzen 3600X
Motherboard MSI Tomahawk 450 MAX
Cooling Noctua NH-U14S
Memory 16GB Crucial Ballistix 3600 MHz DDR4 CAS 16
Video Card(s) MSI RX 5700XT EVOKE OC
Storage Samsung 970 PRO 512 GB
Display(s) ASUS VA326HR + MSI Optix G24C4
Case MSI - MAG Forge 100M
Power Supply Aerocool Lux RGB M 650W
TFLOPS and performance do not scale linearly; never have, never will. That goes for any graphics architecture, even when memory bandwidth goes up.
 