
AMD's Ryzen Cache Analyzed - Improvements; Improveable; CCX Compromises

Joined
Sep 15, 2007
Messages
3,857 (0.78/day)
Location
Police/Nanny State of America
System Name More hardware than I use :|
Processor 4.7 8350 - 4.2 4560K - 4.4 4690K
Motherboard Sabertooth R2.0 - Gigabyte Z87X-UD4H-CF - AsRock Z97M KIller
Cooling Mugen 2 rev B push/pull - Hyper 212+ push/pull - Hyper 212+
Memory 16GB Gskill - 8GB Gskill - 16GB Ballistix 1.35v
Video Card(s) Xfire OCed 7950s - Powercolor 290x - Oced Zotac 980Ti AMP! (also have two 7870s)
Storage Crucial 250GB SSD, Kingston 3K 120GB, Sammy 1TB, various WDs, 13TB (actual capacity) NAS with WDs
Display(s) X-star 27" 1440 - Auria 27" 1440 - BenQ 24" 1080 - Acer 23" 1080
Case Lian Li open bench - Fractal Design ARC - Thermaltake Cube (still have HAF 932 and more ARCs)
Audio Device(s) Titanium HD - Onkyo HT-RC360 Receiver - BIC America custom 5.1 set up (and extra Klipsch sub)
Power Supply Corsair 850W V2 - EVGA 1000 G2 - Seasonic 500 and 600W units (dead 750W needs RMA lol)
Mouse Logitech G5 - Sentey Revolution Pro - Sentey Lumenata Pro - multiple wireless logitechs
Keyboard Logitech G11s - Thermaltake Challenger
Software I wish I could kill myself instead of using windows (OSX can suck it too).
I am going to wait to see Ryzen 5 in action. We have no concrete information about those chips or how they OC. Overclocking 8 cores is a different animal than overclocking 4 cores... historically, at least. And I don't think the OC limit is entirely down to the architecture, but we will find out.

I assume the arch is fine. It's the LPP process that wasn't intended for such clocks.
 
Joined
Dec 21, 2015
Messages
41 (0.02/day)
AIDA64 Build 5.80.4089 shows much better results for Ryzen now.
https://forums.aida64.com/topic/3768-aida64-compatibility-with-amd-ryzen-processors/

 
Joined
Jan 17, 2006
Messages
916 (0.16/day)
Location
Ireland
System Name Cubed
Processor R7 1800X
Motherboard ASRock X370 Gaming Pro
Cooling Custom 3 radiator loop for CPU and GPUs, D5 pump
Memory 32GB (4 x 8GB) Team 4000MHz @ 3200MHz
Video Card(s) 2 x Vega 64 with single slot full cover blocks
Storage Samsung 970 Evo 512GB NVMe
Display(s) BenQ BL3200
Case Corsair Carbide 540
Audio Device(s) On board
Power Supply Corsair HX1200i
Mouse Roccat Leadr
Keyboard K95 RGB
Software Windows 10 Pro x64
Benchmark Scores #1 worldwide on 3D Mark 99 or 2000 or one of those, at one point back in the day. :)
Is this a beta download? My AIDA is not offering an update from 5.08.40 at the moment. Thanks!

And "2 x Octal core" doesn't seem right when a 4790K says "quad-core", so maybe it needs some more fixing. :)
 
Last edited:
Joined
Jan 17, 2006
Messages
916 (0.16/day)
Location
Ireland
That's a bit cheeky to not put it out for Extreme!

Thanks Dave.
 
Joined
Jan 17, 2006
Messages
916 (0.16/day)
Location
Ireland
The issue is that I'm perfectly up for beta testing (and do some in other areas too); maybe they should add an opt-in for that.

In other news, my Asrock X370 (pro gaming) ships today.
 

Enlightnd

New Member
Joined
Mar 17, 2017
Messages
2 (0.00/day)
Question: I've read here and in other places that part of the CCX bus congestion issue for games is that PCIe data is also shoved over the CCX bus.

Has anyone done any tests to see if the issue is greater for GPUs on the chipset PCIe lanes vs. GPUs on the CPU's embedded PCIe lanes?

(EDIT: Replaced "CPU lanes" with "PCIe lanes")
 
Last edited:
Joined
Sep 2, 2011
Messages
1,019 (0.29/day)
Location
Porto
System Name No name / Purple Haze
Processor Phenom II 1100T @ 3.8Ghz / Pentium 4 3.4 EE Gallatin @ 3.825Ghz
Motherboard MSI 970 Gaming/ Abit IC7-MAX3
Cooling CM Hyper 212X / Scythe Andy Samurai Master (CPU) - Modded Ati Silencer 5 rev. 2 (GPU)
Memory 8GB GEIL GB38GB2133C10ADC + 8GB G.Skill F3-14900CL9-4GBXL / 2x1GB Crucial Ballistix Tracer PC4000
Video Card(s) Asus R9 Fury X Strix (4096 SP's/1050 Mhz)/ PowerColor X850XT PE @ (600/1230) AGP + (HD3850 AGP)
Storage Samsung 250 GB / WD Caviar 160GB
Display(s) Benq XL2411T
Audio Device(s) motherboard / Creative Sound Blaster X-Fi XtremeGamer Fatal1ty Pro + Front panel
Power Supply Tagan BZ 900W / Corsair HX620w
Mouse Zowie AM
Keyboard Qpad MK-50
Software Windows 7 Pro 64Bit / Windows XP
Benchmark Scores 64CU Fury: http://www.3dmark.com/fs/11269229 / X850XT PE http://www.3dmark.com/3dm05/5532432
Question: I've read here and in other places that part of the CCX bus congestion issue for games is that PCIe data is also shoved over the CCX bus.

Has anyone done any tests to see if the issue is greater for GPUs on the chipset PCIe lanes vs. GPUs on the CPU's embedded PCIe lanes?

That would be a cool thing to test!
 
Joined
Mar 17, 2017
Messages
4 (0.00/day)
This is Windows load balancing working like it did on Nehalems and first-gen Skylakes.

Basically, Windows treats Ryzen as a massive 16-core CPU instead of 8c/16t.

The cache design of Ryzen 7 suggests that an even better way to handle it would be to schedule it as a two socket system, each of which is a 4c 8t CPU. The L3 cache is divided into two parts, and performance is much worse if a core on side A needs data from side B or vice versa.
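
A minimal sketch of the "treat each CCX as its own 4c/8t CPU" idea, done by hand rather than by the OS scheduler: work out which logical CPUs belong to each CCX and pin a process to one of them. The function names and the logical CPU numbering are assumptions for illustration (numbering differs between operating systems, so check your own topology first), and `os.sched_setaffinity` is only available on Linux.

```python
import os


def ccx_cpu_sets(ccx_count=2, cores_per_ccx=4, threads_per_core=2,
                 siblings_adjacent=True):
    """Return one set of logical CPU ids per CCX.

    ASSUMPTION: with siblings_adjacent=True, SMT siblings are numbered
    next to each other (0,1 = core 0; 2,3 = core 1; ...). With False,
    all first threads come first and core c gets c, c + total_cores, ...
    """
    total_cores = ccx_count * cores_per_ccx
    sets = []
    for ccx in range(ccx_count):
        cpus = set()
        first_core = ccx * cores_per_ccx
        for core in range(first_core, first_core + cores_per_ccx):
            for thread in range(threads_per_core):
                if siblings_adjacent:
                    cpus.add(core * threads_per_core + thread)
                else:
                    cpus.add(core + thread * total_cores)
        sets.append(cpus)
    return sets


def pin_to_ccx(ccx_index, **layout):
    """Pin the current process to the logical CPUs of one CCX (Linux only)."""
    cpus = ccx_cpu_sets(**layout)[ccx_index]
    os.sched_setaffinity(0, cpus)  # pid 0 means "the calling process"
    return cpus
```

Calling `pin_to_ccx(0)` at the start of a latency-sensitive program would keep its threads on the first CCX, so they share one 8MB L3 slice; whether that actually helps depends on the workload.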
 
Joined
Sep 2, 2011
Messages
1,019 (0.29/day)
Location
Porto
The cache design of Ryzen 7 suggests that an even better way to handle it would be to schedule it as a two socket system, each of which is a 4c 8t CPU. The L3 cache is divided into two parts, and performance is much worse if a core on side A needs data from side B or vice versa.

I think NUMA would require a separate memory controller for each CCX, whereas on Ryzen the memory controller is shared between the CCXs. But yeah, some kind of hybrid approach would be the real deal. For now, let's hope that 4000MHz memory support gets there...
 
Joined
Mar 20, 2017
Messages
13 (0.01/day)
Question: I've read here and in other places that part of the CCX bus congestion issue for games is that PCIe data is also shoved over the CCX bus.

Has anyone done any tests to see if the issue is greater for GPUs on the chipset PCIe lanes vs. GPUs on the CPU's embedded PCIe lanes?

(EDIT: Replaced "CPU lanes" with "PCIe lanes")

GPUs can only use the lanes on the Ryzen CPUs, they don't connect to the Southbridge. So 16x or 8x/8x, off the CPU.

The cache design of Ryzen 7 suggests that an even better way to handle it would be to schedule it as a two socket system, each of which is a 4c 8t CPU. The L3 cache is divided into two parts, and performance is much worse if a core on side A needs data from side B or vice versa.

I'm wondering if the higher speed of copy operations on the L3 was specifically tweaked to speed up copies between the two L3s, allowing both CCXs to work from the same data after copying things over, if that would even help... but looks like the new version of AIDA makes this whole CCX intercommunication "bug" a non-issue.

Naples has a ton of PCIe lanes connecting two sockets together on dual socket configs. Somewhere at AMD there must have been people who worked on intercommunication between the 2 CCXs. I don't buy the theory that AMD simply dropped the ball and put out a chip with a glaring architectural flaw. If there are limitations of Ryzen I expect to find compromises that were made after intense discussion. Although they don't have a foundry, they do have the ability to do limited production in house for testing and research purposes. It really feels like people are way underestimating AMD and the quality of their product.
 
Last edited by a moderator:

Enlightnd

New Member
Joined
Mar 17, 2017
Messages
2 (0.00/day)
I wonder if that is accurate (about the PCIe lanes). I'm in conversation on IRC with several people using passthrough (for virtualization), and they are explicitly speaking about the issues they have between GPUs on the CPU-based bus and ones in a chipset-hosted PCIe slot. It seems some boards have crappy IOMMU groupings, causing weirdness with GPUs.
 
Joined
Mar 20, 2017
Messages
13 (0.01/day)
I wonder if that is accurate (about the PCIe lanes). I'm in conversation on IRC with several people using passthrough (for virtualization), and they are explicitly speaking about the issues they have between GPUs on the CPU-based bus and ones in a chipset-hosted PCIe slot. It seems some boards have crappy IOMMU groupings, causing weirdness with GPUs.

edit: Sorry, I didn't read your post carefully enough. I'll leave the pic up though, maybe someone will find it useful. But, yea, I have no idea what those guys on IRC are talking about. Aren't they mistaken in thinking that one of their GPUs is running off the chipset?



Taken from
https://rog.asus.com/articles/techn...platform-and-its-x370-b350-and-a320-chipsets/
 
Joined
Mar 23, 2005
Messages
3,937 (0.67/day)
Location
Ancient Greece, Acropolis (Time Lord)
System Name RiseZEN Gaming PC
Processor AMD Ryzen 7 1700X @ stock - (Ryzen 7 5700X - When AMD)
Motherboard ASRock Fatal1ty X370 GAMING X AM4 (ROG Crosshair VIII? Can I Dream - LOL)
Cooling Corsair H115i PRO RGB, 280mm Radiator, Dual 140mm ML Series PWM Fans
Memory G.Skill TridentZ 32GB (2 x 16GB) DDR4 3200 (Maybe get another 2 for 64GB Total)
Video Card(s) Sapphire Radeon RX 580 8GB Nitro+ SE + (Radeon 6700XT | When the $ is Right)
Storage Corsair Force MP500 480GB M.2 (OS) + Force MP510 480GB M.2 (Steam/Games)
Display(s) Asus 27" (MG278Q) 144Hz WQHD 1440p + 1 x Asus 24" (VG245H) FHD 75Hz 1080p
Case Corsair Obsidian Series 450D Gaming Case
Audio Device(s) SteelSeries 5Hv2 w/ ASUS Xonar DGX PCI-E GX2.5 Audio Engine Sound Card
Power Supply Corsair TX750W Power Supply
Mouse Razer DeathAdder PC Gaming Mouse - Ergonomic Left Hand Edition
Keyboard Logitech G15 Classic Gaming Keyboard
Software Windows 10 Pro - 64-Bit Edition
Benchmark Scores I'm the Doctor, Doctor Who. The Definition of Gaming is PC Gaming...
AMD will tighten up this L3 latency. It will get better and better.
 
Joined
Jan 17, 2006
Messages
916 (0.16/day)
Location
Ireland
edit: Sorry, I didn't read your post carefully enough. I'll leave the pic up though, maybe someone will find it useful. But, yea, I have no idea what those guys on IRC are talking about. Aren't they mistaken in thinking that one of their GPUs is running off the chipset?

No, they are not mistaken, you could for example have 3 GPUs in there.

2 from the CPU and one from the chipset (with the associated latency).

In fact, what a lot of the folks using VMs want to do is have all 3 cards in separate IOMMU groups, so you can e.g. have one card for your host OS and the others each dedicated to a VM.
If the groups/UEFI are right, you could have a slower card off the chipset as the host OS's card (boot graphics) and then two powerful cards connected to the VMs or whatever.
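
For anyone checking their own board, here is a rough sketch that lists which PCI devices share an IOMMU group on Linux by reading `/sys/kernel/iommu_groups`. The helper names are made up for this example, and it assumes the IOMMU is enabled in UEFI and in the kernel; otherwise the directory is empty or missing.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map IOMMU group number -> sorted list of PCI addresses in that group."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU disabled, or not a Linux system
    for group_dir in base.iterdir():
        devices = group_dir / "devices"
        if group_dir.name.isdigit() and devices.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices.iterdir())
    return groups


def shares_group(groups, pci_address):
    """Return the other devices in the same group, or None if not found."""
    for members in groups.values():
        if pci_address in members:
            return [m for m in members if m != pci_address]
    return None
```

A GPU that shares its group with other devices usually has to be passed through together with them, which is why the grouping matters for the one-card-per-VM setup described above.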
 
Joined
Sep 26, 2006
Messages
240 (0.05/day)
The day there is 4GHz ram, 4GHz chip and a nice high capacity (64GB sounds nice) I will be throwing cash at AMD.
 
Joined
Mar 23, 2005
Messages
3,937 (0.67/day)
Location
Ancient Greece, Acropolis (Time Lord)
The day there is 4GHz ram, 4GHz chip and a nice high capacity (64GB sounds nice) I will be throwing cash at AMD.
Seeing how RAM speed makes a huge performance difference on Ryzen, yes, agreed.
 
Joined
Sep 26, 2014
Messages
58 (0.02/day)
Location
sydney australia
The cache design of Ryzen 7 suggests that an even better way to handle it would be to schedule it as a two socket system, each of which is a 4c 8t CPU. The L3 cache is divided into two parts, and performance is much worse if a core on side A needs data from side B or vice versa.

What an interesting suggestion.

Your paradigm of splitting, for coding purposes, the 8 cores into discrete 4-core CCXs with 8MB L3 cache blocks, and then minimising interaction between them, could speed up some apps considerably.

I am a newb, but I mused similarly in the context of a poor man's Vega Pro SSG (a 16GB, $5000+ Vega with an onboard 4x 960 Pro RAID array).

If you install an affordable 8-lane Vega and an 8-lane 2x NVMe adapter, so both link to the same 16-lane CCX (as a 16-lane card does, e.g.), then the GPU and the 2x NVMe RAID array may be able to talk very directly and ~share the same 8MB CPU L3 cache. It doesn't bypass the shared PCIe bus like Vega SSG, but it could have minimal latency, enhanced by specialised large-block-size formatting for swapping, workspace, temp files and graphics.

Vega 56/64, of course, have a dedicated HBCC subsystem for such GPU cache extension using NVMe arrays. Done right, it promises a pretty good illusion of ~unlimited GPU memory/address space. Cool indeed.

As you can see, a belated post from me. We now have evidence in the perf figures of single-CCX Zen/Vega APUs. Yes, inter-CCX interconnects have dragged Ryzen's ~IPC down.
 
Joined
Sep 15, 2007
Messages
3,857 (0.78/day)
Location
Police/Nanny State of America
The cache design of Ryzen 7 suggests that an even better way to handle it would be to schedule it as a two socket system, each of which is a 4c 8t CPU. The L3 cache is divided into two parts, and performance is much worse if a core on side A needs data from side B or vice versa.

Devs probably won't have a choice. It's only a matter of time before Intel announces their copy of Ryzen.
 