
NVIDIA GeForce RTX 4090 PCI-Express Scaling with Core i9-13900K

W1zzard

We take a second look at PCI-Express performance scaling of NVIDIA's GeForce RTX 4090 flagship card. This time we're using a Core i9-13900K, which hopefully helps us highlight more differences than we saw with the Ryzen 5800X last year. We've also added minimum FPS and ray tracing testing.

 
Thanks for this, always useful.
Amusing to see that PCIe 2.0 x16 is still just about fine. You probably cannot pair a 4090 with anything that old - would the board even recognise the card?
 
While I know it probably doesn't matter and there shouldn't be a difference between a simulated setup like yours (limiting the lanes via BIOS) and having a real physical NVMe SSD, I'd like an additional "for science" test: get a board with a Gen 5 M.2 slot, put a Gen 5 NVMe drive in it, and leave everything PCIe-related in the BIOS at its defaults. In the default auto mode, which should split up the lanes, does GPU performance take a hit when lanes that are supposed to be "dedicated" to the GPU get split up?
 
Great read.

@W1zzard, a suggestion: include your emulator test suite in reviews like this. Emulating UMA consoles like the Switch or Wii U is very PCIe and memory bandwidth intensive, and having a 384-bit GDDR6X bus is irrelevant when the bottleneck is in the motherboard.
 
and there shouldn't be a difference between a simulated setup like yours (limiting the lanes via BIOS) and having a real physical NVMe SSD, I'd like an additional "for science" test: get a board with a Gen 5 M.2 slot, put a Gen 5 NVMe drive in it, and leave everything PCIe-related in the BIOS at its defaults
There isn't. I've tested it, of course, and also verified it by taping off half the lanes so they are electrically not connected.
 
The push for faster PCI-Express was never made for graphics cards, but for the ever-growing bandwidth requirements of enterprise platforms: NICs, storage and such. By making the interface universal, unlike AGP, they don't have to develop a separate interconnect just for the graphics card but can simply unify the whole thing. I think by now such tests can be buried, as there's hardly any difference even on such a high-end card in the games that are tested.
 
Why not a Ryzen 7000-series CPU?
 
So there's a difference. Some titles not so much, others definitely. RT adds to the difference even more.

The minimums page is wild. Seeing the 7900XTX soaring in some charts and just being pounded in others. Also the 4080 having a better minimums chart than the 4090 in Control at 1080p. Any theories there?
 
Is it really accurate that a GPU will perform similarly on PCIe 4.0 x8 and PCIe 3.0 x16?
Yes.
Tested and proven in multiple scenarios by multiple different review sites over many generations of PCIe.
It is always identical, or within 0.5%, which is often just run-to-run variance between tests anyway.
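For reference, here's the back-of-the-envelope math behind that equivalence, as a quick sketch using the published PCIe per-lane transfer rates and line encodings (spec values, not numbers from this review):

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency per PCIe generation
GENS = {
    2: (5.0, 8 / 10),     # PCIe 2.0: 5 GT/s, 8b/10b encoding
    3: (8.0, 128 / 130),  # PCIe 3.0: 8 GT/s, 128b/130b encoding
    4: (16.0, 128 / 130), # PCIe 4.0: 16 GT/s, 128b/130b encoding
}

def usable_gb_per_s(gen: int, lanes: int) -> float:
    """Usable bandwidth per direction in GB/s, ignoring protocol overhead."""
    rate, encoding = GENS[gen]
    return rate * encoding * lanes / 8  # gigabits -> gigabytes

print(f"PCIe 3.0 x16: {usable_gb_per_s(3, 16):.2f} GB/s")  # ~15.75 GB/s
print(f"PCIe 4.0 x8:  {usable_gb_per_s(4, 8):.2f} GB/s")   # ~15.75 GB/s
```

Each generation doubles the per-lane rate, so halving the lane count on the next gen lands on exactly the same raw throughput, which is why the benchmark results match.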
 
The push for faster PCI-Express was never made for graphics cards, but for the ever-growing bandwidth requirements of enterprise platforms: NICs, storage and such. By making the interface universal, unlike AGP, they don't have to develop a separate interconnect just for the graphics card but can simply unify the whole thing. I think by now such tests can be buried, as there's hardly any difference even on such a high-end card in the games that are tested.
There are use cases (and they may become more common in the future if CPUs and chipsets allow more flexible lane splitting) where you can't have all 16 lanes for a GPU, or all 4 for an M.2 slot. Fast PCIe will come in handy in such situations.
 
Is it really accurate that a GPU will perform similarly on PCIe 4.0 x8 and PCIe 3.0 x16?
This chart shows that, in terms of throughput, it is the same.
[attached chart: pcie-3-vs-4-4.png]
 
This chart shows that, in terms of throughput, it is the same.
[attached chart: pcie-3-vs-4-4.png]
Yes, I know they have the same throughput, but I didn't know whether communicating through more "wires" has some benefit or not (or vice versa). They are still just sending 1s and 0s, but yeah.
 
Thanks for all the benchmarking work. Looks like the short version is that PCIe 4.0 x16 isn't needed over PCIe 3.0 x16 yet, but running PCIe 3.0 x8 (or PCIe 2.0 x16) isn't a place you want to be for much longer. Not that anyone should mate a PCIe 2.0 x16 CPU with this card in the first place.

The only way I can see running PCIe 3.0 x8 is on the select few Z97 boards that provided an NVMe PCIe 3.0 x4 slot by splitting the GPU lanes.
 
Yes.
Tested and proven in multiple scenarios by multiple different review sites over many generations of PCIe.
It is always identical, or within 0.5%, which is often just run-to-run variance between tests anyway.
Still worth testing, sometimes at least (and I'm not saying it's W1z who should test every possible permutation of all parameters).
For example, latencies may be different, and it may sometimes matter. A consistent 0.5% difference would point to that.
 
Still worth testing, sometimes at least (and I'm not saying it's W1z who should test every possible permutation of all parameters).
For example, latencies may be different, and it may sometimes matter. A consistent 0.5% difference would point to that.
Latency has improved going from gens 1>2, 3>4, and 4>5. Latency increased by a factor of more than 8 going from gen 2>3, so if it mattered even slightly, we'd have seen a difference then.

We didn't. Same results within the tiny margin of error that can be accounted for by run-to-run variance.
 
So basically we can have x8 Gen 5 / Gen 6 GPUs in the future, as they are cheaper to make? :)
 
This time we're using a Core i9-13900K, which hopefully helps us highlight more differences than we saw with the Ryzen 5800X last year.
Awesome content! Thanks for spending time on this. It essentially shows that the current point of PCIe saturation for GPUs sits somewhere between 8 and 16 lanes of Gen4, closer to x8, which also means that Gen3 x16 is now finally saturated, if only just.

Minor correction in the text is needed:
"This is also the kind of bandwidth comparable to using eGPU boxes (which convert an 80 Gbps Thunderbolt or USB4 interface to a PCI-Express 4.0 x4 slot"
A Thunderbolt 4 port uses a PCIe 3.0 x4 link, so the maximum is 32 Gbps each way for PCIe data.
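For what it's worth, here's the arithmetic behind that 32 Gbps figure, as a quick sketch using the published PCIe 3.0 per-lane rate and encoding (spec values, not measured):

```python
# PCIe 3.0: 8 GT/s per lane with 128b/130b line encoding
lanes = 4
usable_gbps = 8.0 * (128 / 130) * lanes  # usable Gbps per direction
print(f"PCIe 3.0 x{lanes}: {usable_gbps:.1f} Gbps")  # ~31.5 Gbps, i.e. the ~32 Gbps ceiling
```

So even an "80 Gbps" Thunderbolt/USB4 port can only tunnel roughly 32 Gbps of PCIe data to an eGPU box.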
 
would the board even recognise the card?
On Sandy Bridge, maybe. It needs UEFI, so it'd depend on the board adopting the UEFI standard (which wasn't common back then on consumer boards).

It would also depend on the UEFI implementation being up to standard, and when talking about early consumer UEFI implementations... yeah, luck of the draw. It might be possible, though.
 
4K does seem tougher to run in these games; I guess just use DLSS if you want max frames.
 
Also the 4080 having a better minimums chart than the 4090 in Control at 1080p. Any theories there?
It is consistent; seems like some overhead or frame pacing issue. I posted a frametime chart for this recently, check my post history.
 
So basically we can have x8 Gen 5 / Gen 6 GPUs in the future, as they are cheaper to make? :)
Hah, no.
Those cost savings for the GPU manufacturer typically don't get passed on to us, but the price hikes of Gen5 and Gen6 motherboards absolutely do.
 