PCI-e 3.0 x8 may not have enough bandwidth for RX 5500 XT

FordGT90Concept · Dec 24, 2019

Yes, AMD has a lot of them that ship with PCIe x8:

TechPowerUp

Graphics card and GPU database with specifications for products launched in recent years. Includes clocks, photos, and technical details.

www.techpowerup.com

HTC · Dec 24, 2019

FordGT90Concept said:
Yes, AMD has a lot of them that ship with PCIe x8:

TechPowerUp

Graphics card and GPU database with specifications for products launched in recent years. Includes clocks, photos, and technical details.

www.techpowerup.com

A couple of those should be tested in "an X570 environment" to see if they too behave the same way as 5500XT.

Depending on which slot in the board it's connected to, doesn't the board turn x16 into x8 when certain amount of devices are connected? IIRC, there are even instances where the boards turn x16 into x4.

There are scenarios where x16 capable GPUs are running @ x8 not because of a BIOS selection but because of the amount of devices hooked up to the system. With these in mind, such PCIe scaling tests ought to be performed, no? Isn't that the whole point of PCIe scaling tests to begin with?

R0H1T · Dec 24, 2019

Coming back to this, is there no way to change say the PCIe 4.0 x1 signal into PCIe 3.0 x2 on the fly? Would solve most if not all of these PCIe lane issues on high end boards & cards.

EarthDog · Dec 24, 2019

HTC said:
A couple of those should be tested in "an X570 environment" to see if they too behave the same way as 5500XT.

Depending on which slot in the board it's connected to, doesn't the board turn x16 into x8 when certain amount of devices are connected? IIRC, there are even instances where the boards turn x16 into x4.

There are scenarios where x16 capable GPUs are running @ x8 not because of a BIOS selection but because of the amount of devices hooked up to the system. With these in mind, such PCIe scaling tests ought to be performed, no? Isn't that the whole point of PCIe scaling tests to begin with?

Some boards drop to x8 in the GPU slot, yes. But why the reindeer games when simply changing it in the bios for testing is possible.....just like the testing did.

I dont think saturating the pcie bus by playing a game and say transferring a massive file between m.2 nvme drives is really common or worth it? Did I misunderstand?

R0H1T said:
Coming back to this, is there no way to change say the PCIe 4.0 x1 signal into PCIe 3.0 x2 on the fly? Would solve most if not all of these PCIe lane issues on high end boards & cards.

To what end? Are you talking on a 4.0 x1 slot??? What does that have to do with a gpu?

HTC · Dec 24, 2019

EarthDog said:
Some boards drop to x8 in the GPU slot, yes. But why the reindeer games when simply changing it in the bios for testing is possible.....just like the testing did.

To what end? Are you talking on a 4.0 x1 slot??? What does that have to do with a gpu?

Except it didn't: check the methodology in the 1st and 3rd links in post #95.

It was done with the 2080Ti but not the same way with the 5700XT.

Question: would a regular x16 capable card work @ x8 (in bandwidth) in a PCIe 4.0 x4 slot (due to amount of connected devices) or @ x4?

EarthDog · Dec 24, 2019

HTC said:
Except it didn't: check the methodology in the 1st and 3rd links in post #95.

It was done with the 2080Ti but not the same way with the 5700XT.

Question: would a regular x16 capable card work @ x8 (in bandwidth) in a PCIe 4.0 x4 slot (due to amount of connected devices) or @ x4?

I was talking about the original testing in the thread. My fault...

As far as your question, if the x16 physical is at x4 electrical, it should work if it is an amd card. Iirc, nvidia cards dont work under x8.... at least with sli.

R0H1T · Dec 24, 2019

EarthDog said:
To what end? Are you talking on a 4.0 x1 slot??? What does that have to do with a gpu?

No I'm talking about (electrically) converting PCIe 4.0 signal into 2x their equivalent PCIe 3.0 lanes on the fly, for instance on motherboards where the second (GPU) slot is limited to x8 or x4 as the first slot is occupied & this could be useful there. It would also be mightily handy especially for M.2 slots & other (expansion) cards, assuming it can be done in the first place.

Waldorf · Dec 24, 2019

there is no difference.
doing it with the board reducing the speed because you have another 2 devices, will be no different then switching bios to lower speed and/or gen.
but even if you switch on purpose to x4 (or because of multiple devices) wont matter much except for ppl owning a 2080ti or faster chip (quadro etc).
those are the only times were i can saturate 3.0 @x8/4.0 @x4.

slower cards will most of the time perform better (pcie 3.0 x8 vs x16) as you have less overhead, should be same with 4.0 @x4.

After using 3DMark11 (X setting) as bench to get some numbers, seems that with pcie 4.0 it won't matter much anymore compared to 3.0 (switching from x16 to x8),
as i get virtually identical numbers no matter what i used (3.0 @8/16 or 4.0 @x8/16), e.g below 0.1% variance (14171 vs 14183).

IceShroom · Dec 24, 2019

eidairaman1 said:
Ok so its a 560/560D

Baffin/Polaris11/Polaris21 has 8 PCI-e lane just like the Navi14.

HTC said:
A couple of those should be tested in "an X570 environment" to see if they too behave the same way as 5500XT.

Depending on which slot in the board it's connected to, doesn't the board turn x16 into x8 when certain amount of devices are connected? IIRC, there are even instances where the boards turn x16 into x4.

There are scenarios where x16 capable GPUs are running @ x8 not because of a BIOS selection but because of the amount of devices hooked up to the system. With these in mind, such PCIe scaling tests ought to be performed, no? Isn't that the whole point of PCIe scaling tests to begin with?

Well those card will be Shader limit before memory limit.

EarthDog · Dec 24, 2019

IceShroom said:
Well those card will be Shader limit before memory limit.

What does this mean? If you run out of memory bad things will happen period. Makes low fps lower.

IceShroom · Dec 24, 2019

EarthDog said:
What does this mean? If you run out of memory bad things will happen period. Makes low fps lower.

I mean those(except 5500XT) card will not have the power to process the texture that requers more than 4GB memory.

Waldorf · Dec 24, 2019

That would also depend on the game.
I can play something from 10y ago (wont cost much fps wise), but still use higher res like 1080p/1440p or even 4K, which requires more vram.

EarthDog · Dec 24, 2019

IceShroom said:
I mean those(except 5500XT) card will not have the power to process the texture that requers more than 4GB memory.

It would just be slower, as I said.

Flaky · Dec 25, 2019

HTC said:
Question: would a regular x16 capable card work @ x8 (in bandwidth) in a PCIe 4.0 x4 slot (due to amount of connected devices) or @ x4?

I totally don't get the "(in bandwidth)" part in question.

Slot has 4 lanes -> the resulting connection cannot have more than 4 lanes.

R0H1T said:
No I'm talking about (electrically) converting PCIe 4.0 signal into 2x their equivalent PCIe 3.0 lanes on the fly (...)

I don't recall any hardware (at least in consumer segment) capable of changing lane widths after the POST is done.
Doing such thing may require the PCIe controller to reinitialize - that would mean that all the devices connected to it would have to be disabled and temporairly soft-disconnected.
Let's not forget that reallocationg lanes somewhere else would require physically re-routing the traces - and that is a hardware cost.

Vya Domus · Dec 25, 2019

GoldenX said:
PCIe bandwidth IS important on VRAM limited scenarios.

Funny how literally no one had a clue about any of this up until a week ago but now it has become very important ? No, it hasn't really, it's a monumental waste of time to worry about PCIe bandwidth, the VRAM is what matters.

GoldenX said:
What do you use after VRAM is full? RAM.

This is improper, GPUs never use system RAM, it's simply too slow, we aren't talking a few FPS lost here and there, no, frames would take seconds to be processed.

The games simply swap memory in and out of the card's memory, it never gets directly accessed from across the PCIe lanes.

EarthDog · Dec 25, 2019

Vya Domus said:
Funny how literally no one had a clue about any of this up until a week ago but now it has become very important ? No, it hasn't really, it's a monumental waste of time to worry about PCIe bandwidth, the VRAM is what matters.

This is improper, GPUs never use system RAM, it's simply too slow, we aren't talking a few FPS lost here and there, no, frames would take seconds to be processed.

The games simply swap memory in and out of the card's memory, it never gets directly accessed from across the PCIe lanes.

of course vram is the primary issue. But we've seen testing that shows in vram limited situations the bus helps. If the data was transferring internally and not using pcie and ram, we wouldnt see these performance increases in vram limited situations.

If it isnt using system ram and the pcie bus in some capacity, why does the testing show notable increases when using the bus with more bandwidth (where it didnt before even with a 5700 xt)?

It's also worth noting that integrated gpus use system ram and AMD's does a decent job at 1080p gaming with it.

cucker tarlson · Dec 25, 2019

GoldenX said:
Inb4 people forgets the GT1030 suffers the same problem.

1030 is a turd for gaming.
it's a good multi monitor card though,and for that the pci-lanes don't matter,

GoldenX · Dec 25, 2019

Vya Domus said:
Funny how literally no one had a clue about any of this up until a week ago but now it has become very important ? No, it hasn't really, it's a monumental waste of time to worry about PCIe bandwidth, the VRAM is what matters.

This is improper, GPUs never use system RAM, it's simply too slow, we aren't talking a few FPS lost here and there, no, frames would take seconds to be processed.

The games simply swap memory in and out of the card's memory, it never gets directly accessed from across the PCIe lanes.

Chech what the shared GPU RAM is, you can look at it from the task manager or GPU-Z. Any GPU uses system RAM as if it were an IGP, as a swap. Always.
Users didn't know it, doesn't make it a lie, specially with provided evidence.

cucker tarlson · Dec 25, 2019

I would not like to get a $170 card and find out it's massively limited at 1080p while a $160 1650 super does just fine.
Not only is this card super late and badly priced,it's gonna give you unplayable experience at 1080p now,let alone in the future.
Like I was saying,there should be one 5500xt with 192-bit 6gb memory at $180 and it'd be a very strong contender to the lower Super lineup,but AMD proved themselves incompetent again.And that's not even touching their driver issues. :fear:

Vya Domus · Dec 25, 2019

GoldenX said:
Any GPU uses system RAM as if it were an IGP, as a swap. Always.

That memory is for swapping contents of the GPU. There is no lie, you cannot access memory contents of GPUs without transferring the data first, it's simply not possible. No API allows for this, you can't get something like a pointer to VRAM memory from the host side of things (CPU and RAM). Be it a DirectX shader or a CUDA compute kernel they all operate on buffers that have been transferred first into memory and never accessed across the system bus. That would be unimaginably slow.

Say you need to go through 1 GB of data 1000 times, you have two options : get that memory first into VRAM and access it 1000 times at VRAM latency or access it 1000 times at several times the magnitude of VRAM latency across the PCIe connection from system memory. Which option do you think it's faster ?

GoldenX said:
Any GPU uses system RAM as if it were an IGP, as a swap. Always.

Definitely not always, only when it's needed.

EarthDog said:
If it isnt using system ram and the pcie bus in some capacity

It's using RAM in the sense that that's where all the buffers are kept, the buffers however are never accessed directly by the GPU, they have to be in it's VRAM before that's possible.

EarthDog said:
(where it didnt before even with a 5700 xt)?

The 5700XT has more memory and it also has more PCIe lanes. Everything else works in the same way.

HTC · Dec 25, 2019

Flaky said:
I totally don't get the "(in bandwidth)" part in question.

Slot has 4 lanes -> the resulting connection cannot have more than 4 lanes.

I don't recall any hardware (at least in consumer segment) capable of changing lane widths after the POST is done.
Doing such thing may require the PCIe controller to reinitialize - that would mean that all the devices connected to it would have to be disabled and temporairly soft-disconnected.
Let's not forget that reallocationg lanes somewhere else would require physically re-routing the traces - and that is a hardware cost.

This was true with up to PCIe 3.0, but what about with PCIe 4.0 slots? Does a x4 card (electrically and or / for whatever other reason functioning @ x4) work @ x4 or @ x8 in a PCIe 4.0 slot?

This whole issue stems from the fact that x8 is working as x16 in a PCIe 4.0 slot: does the same apply to x4?

FordGT90Concept · Dec 25, 2019

cucker tarlson said:
I would not like to get a $170 card and find out it's massively limited at 1080p while a $160 1650 super does just fine.
Not only is this card super late and badly priced,it's gonna give you unplayable experience at 1080p now,let alone in the future.
Like I was saying,there should be one 5500xt with 192-bit 6gb memory at $180 and it'd be a very strong contender to the lower Super lineup,but AMD proved themselves incompetent again.And that's not even touching their driver issues.

*cough* 8 GiB

The question is thus: Why do GeForce cards perform so much better with 4 GiB compared to AMD cards? Maybe it is because of x16 versus x8 (faster RAM access).

cucker tarlson · Dec 25, 2019

FordGT90Concept said:
*cough* 8 GiB

5500xt 8gb sells at 1660 super price

FordGT90Concept · Dec 25, 2019

Because AMD is letting AIBs clear inventory of older stock (especially RX 590, RX 580, and RX 570). The 5500 cards (all of them) will come down in price after they're done.

GoldenX · Dec 25, 2019

Vya Domus said:
That memory is for swapping contents of the GPU. There is no lie, you cannot access memory contents of GPUs without transferring the data first, it's simply not possible. No API allows for this, you can't get something like a pointer to VRAM memory from the host side of things (CPU and RAM). Be it a DirectX shader or a CUDA compute kernel they all operate on buffers that have been transferred first into memory and never accessed across the system bus. That would be unimaginably slow.

Say you need to go through 1 GB of data 1000 times, you have two options : get that memory first into VRAM and access it 1000 times at VRAM latency or access it 1000 times at several times the magnitude of VRAM latency across the PCIe connection from system memory. Which option do you think it's faster ?

Definitely not always, only when it's needed.

It's using RAM in the sense that that's where all the buffers are kept, the buffers however are never accessed directly by the GPU, they have to be in it's VRAM before that's possible.

The 5700XT has more memory and it also has more PCIe lanes. Everything else works in the same way.

You just confirmed it yourself, after VRAM is full, PCIe is the next bottleneck.

System Name	BY-2021
Processor	AMD Ryzen 7 5800X (65w eco profile)
Motherboard	MSI B550 Gaming Plus
Cooling	Scythe Mugen (rev 5)
Memory	2 x Kingston HyperX DDR4-3200 32 GiB
Video Card(s)	AMD Radeon RX 7900 XT
Storage	Samsung 980 Pro, Seagate Exos X20 TB 7200 RPM
Display(s)	Nixeus NX-EDG274K (3840x2160@144 DP) + Samsung SyncMaster 906BW (1440x900@60 HDMI-DVI)
Case	Coolermaster HAF 932 w/ USB 3.0 5.25" bay + USB 3.2 (A+C) 3.5" bay
Audio Device(s)	Realtek ALC1150, Micca OriGen+
Power Supply	Enermax Platimax 850w
Mouse	Nixeus REVEL-X
Keyboard	Tesoro Excalibur
Software	Windows 10 Home 64-bit
Benchmark Scores	Faster than the tortoise; slower than the hare.

System Name	HTC's System
Processor	Ryzen 5 5800X3D
Motherboard	Asrock Taichi X370
Cooling	NH-C14, with the AM4 mounting kit
Memory	G.Skill Kit 16GB DDR4 F4 - 3200 C16D - 16 GTZB
Video Card(s)	Sapphire Pulse 6600 8 GB
Storage	1 Samsung NVMe 960 EVO 250 GB + 1 3.5" Seagate IronWolf Pro 6TB 7200RPM 256MB SATA III
Display(s)	LG 27UD58
Case	Fractal Design Define R6 USB-C
Audio Device(s)	Onboard
Power Supply	Corsair TX 850M 80+ Gold
Mouse	Razer Deathadder Elite
Software	Ubuntu 20.04.6 LTS

System Name	HTC's System
Processor	Ryzen 5 5800X3D
Motherboard	Asrock Taichi X370
Cooling	NH-C14, with the AM4 mounting kit
Memory	G.Skill Kit 16GB DDR4 F4 - 3200 C16D - 16 GTZB
Video Card(s)	Sapphire Pulse 6600 8 GB
Storage	1 Samsung NVMe 960 EVO 250 GB + 1 3.5" Seagate IronWolf Pro 6TB 7200RPM 256MB SATA III
Display(s)	LG 27UD58
Case	Fractal Design Define R6 USB-C
Audio Device(s)	Onboard
Power Supply	Corsair TX 850M 80+ Gold
Mouse	Razer Deathadder Elite
Software	Ubuntu 20.04.6 LTS

System Name	Main
Processor	R7 5950x
Motherboard	MSI x570S Unify-X Max
Cooling	converted Eisbär 280, two F14 + three F12S intake, two P14S + two P14 + two F14 as exhaust
Memory	16 GB Corsair LPX bdie @3600/16 1.35v
Video Card(s)	GB 2080S WaterForce WB
Storage	six M.2 pcie gen 4
Display(s)	Sony 50X90J
Case	Tt Level 20 HT
Audio Device(s)	Asus Xonar AE, modded Sennheiser HD 558, Klipsch 2.1 THX
Power Supply	Corsair RMx 750w
Mouse	Logitech G903
Keyboard	GSKILL Ripjaws
VR HMD	NA
Software	win 10 pro x64
Benchmark Scores	TimeSpy score Fire Strike Ultra SuperPosition CB20

Processor	Intel Core i5 4590
Motherboard	Gigabyte Z97x Gaming 3
Cooling	Intel Stock Cooler
Memory	8GiB(2x4GiB) DDR3-1600 [800MHz]
Video Card(s)	XFX RX 560D 4GiB
Storage	Transcend SSD370S 128GB; Toshiba DT01ACA100 1TB HDD
Display(s)	Samsung S20D300 20" 768p TN
Case	Cooler Master MasterBox E501L
Audio Device(s)	Realtek ALC1150
Power Supply	Corsair VS450
Mouse	A4Tech N-70FX
Software	Windows 10 Pro
Benchmark Scores	BaseMark GPU : 250 Point in HD 4600

PCI-e 3.0 x8 may not have enough bandwidth for RX 5500 XT

FordGT90Concept

"I go fast!1!11!1!"

TechPowerUp

HTC

TechPowerUp

R0H1T

EarthDog

HTC

EarthDog

R0H1T

Waldorf

IceShroom

EarthDog

IceShroom

Waldorf

EarthDog

Flaky

Vya Domus

EarthDog

cucker tarlson

GoldenX

cucker tarlson

Vya Domus

HTC

FordGT90Concept

"I go fast!1!11!1!"

cucker tarlson

FordGT90Concept

"I go fast!1!11!1!"

GoldenX

System Name	Good enough
Processor	AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard	ASRock B650 Pro RS
Cooling	2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory	32GB - FURY Beast RGB 5600 Mhz
Video Card(s)	Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage	1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s)	LG UltraGear 32GN650-B + 4K Samsung TV
Case	Phanteks NV7
Power Supply	GPS-750C

System Name	Purple rain
Processor	10.5 thousand 4.2G 1.1v
Motherboard	Zee 490 Aorus Elite
Cooling	Noctua D15S
Memory	16GB 4133 CL16-16-16-31 Viper Steel
Video Card(s)	RTX 2070 Super Gaming X Trio
Storage	SU900 128,8200Pro 1TB,850 Pro 512+256+256,860 Evo 500,XPG950 480, Skyhawk 2TB
Display(s)	Acer XB241YU+Dell S2716DG
Case	P600S Silent w. Alpenfohn wing boost 3 ARGBT+ fans
Audio Device(s)	K612 Pro w. FiiO E10k DAC,W830BT wireless
Power Supply	Superflower Leadex Gold 850W
Mouse	G903 lightspeed+powerplay,G403 wireless + Steelseries DeX + Roccat rest
Keyboard	HyperX Alloy SilverSpeed (w.HyperX wrist rest),Razer Deathstalker
Software	Windows 10
Benchmark Scores	A LOT

System Name	Ciel / Akane
Processor	AMD Ryzen R5 5600X / Intel Core i3 12100F
Motherboard	Asus Tuf Gaming B550 Plus / Biostar H610MHP
Cooling	ID-Cooling 224-XT Basic / Stock
Memory	2x 16GB Kingston Fury 3600MHz / 2x 8GB Patriot 3200MHz
Video Card(s)	Gainward Ghost RTX 3060 Ti / Dell GTX 1660 SUPER
Storage	NVMe Kingston KC3000 2TB + NVMe Toshiba KBG40ZNT256G + HDD WD 4TB / NVMe WD Blue SN550 512GB
Display(s)	AOC Q27G3XMN / Samsung S22F350
Case	Cougar MX410 Mesh-G / Generic
Audio Device(s)	Kingston HyperX Cloud Stinger Core 7.1 Wireless PC
Power Supply	Aerocool KCAS-500W / Gigabyte P450B
Mouse	EVGA X15 / Logitech G203
Keyboard	VSG Alnilam / Dell
Software	Windows 11