
Raster/ray traced performance

Joined
Jun 12, 2022
Messages
64 (0.09/day)
I don't understand why GPU vendors don't treat raster and ray tracing performance the same way. For example, if a GPU is bound to a certain frame rate at 1080p in pure raster, it should reach the same frame rate with ray tracing turned on.
To do that, it would have the same number of ray tracing cores (or whatever kind of cores are dedicated to ray tracing) as main shader cores.

My point is that on the lower-end side of GPUs, you would then get ray tracing performance at the same scale as raster. A back-of-the-envelope sketch of the idea is below (all numbers made up purely to illustrate).
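
```python
# Toy frame-time model for the idea above. All numbers are hypothetical,
# chosen only to illustrate the "scale RT throughput 1:1 with raster" argument.

raster_work = 10.0      # ms of raster work per frame at 1080p (made up)
rt_work = 10.0          # ms of ray tracing work per frame at raster-equal speed (made up)

# Today RT units are a small slice of the die, so the RT portion runs slower.
rt_speed_factor = 0.4   # hypothetical: RT throughput at 40% of raster throughput

frame_raster_only = raster_work
frame_hybrid = raster_work + rt_work / rt_speed_factor

print(f"raster only: {1000 / frame_raster_only:.0f} fps")   # 100 fps
print(f"hybrid RT:   {1000 / frame_hybrid:.0f} fps")        # ~29 fps

# With the 1:1 proposal (rt_speed_factor = 1.0) the hybrid frame costs
# raster_work + rt_work = 20 ms, i.e. 50 fps. The frame rate still drops,
# because RT work is added on top of the raster work rather than replacing it.
```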
 
Joined
Jun 21, 2021
Messages
2,640 (2.56/day)
System Name daily driver Mac mini M2 Pro
Processor Apple Silicon M2 Pro (6 p-cores, 4 e-cores)
Motherboard Apple proprietary
Cooling Apple proprietary
Memory Apple proprietary 16GB LPDDR5 unified memory
Video Card(s) Apple Silicon M2 Pro (16-core GPU)
Storage Apple proprietary 512GB SSD + various external HDDs
Display(s) LG 27UL850W (4K@60Hz IPS)
Case Apple proprietary
Audio Device(s) Apple proprietary
Power Supply Apple proprietary
Mouse Apple Magic Trackpad 2
Keyboard Keychron K1 tenkeyless (Gateron Reds)
Software macOS Ventura 13 (including latest patches)
Benchmark Scores (My Windows daily driver is a Beelink Mini S12. I'm not interested in benchmarking.)
GPU manufacturers design chips to function in a variety of use cases. They don't design them for one person's use case. It's not your mother in the kitchen cooking you your favorite breakfast.

All of these things are a balance of features, compromises, etc. Remember that it's a finite amount of silicon wafer space and they need to consider how frequently any given type of transistor is going to be used -- raster cores, ray tracing cores, machine learning cores, etc. -- in a wide variety of real world situations, not just one random guy living in his mother's basement running benchmarks.

If applications that benefit from the presence of ray tracing cores become more popular, I'm guessing these GPU manufacturers will include more of them. There's not much incentive to include them if they go unused most of the time. That was AMD's philosophy until RDNA 2, and unsurprisingly, ray tracing performance on RDNA 2 generation GPUs is inferior to NVIDIA's Ampere generation GPUs.

It's not like Jensen or Dr. Su can wave a magic wand and add 5x RT cores to a die for free.

Adding more ray tracing cores to a given GPU die means subtracting other transistors elsewhere. That might mean less raster performance in exchange for better ray tracing performance. Is that something you'd be interested in? To put rough (entirely invented) numbers on that trade-off:
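
```python
# Crude zero-sum die budget sketch. The split below is invented to illustrate
# the trade-off; it is not based on any real die-shot analysis.

die_area_mm2 = 400.0

# Hypothetical baseline allocation of that area.
alloc = {"raster/shader": 0.60, "ray tracing": 0.08,
         "ML/tensor": 0.07, "memory + misc": 0.25}

def rebalance(alloc, extra_rt=0.10):
    """Give ray tracing more of the budget, taken straight from the shader share."""
    new = dict(alloc)
    new["ray tracing"] += extra_rt
    new["raster/shader"] -= extra_rt
    return new

for name, share in rebalance(alloc).items():
    print(f"{name:>14}: {share * die_area_mm2:6.1f} mm^2")

# More than doubling the RT share here costs ~40 mm^2 of shader area:
# the extra RT cores are paid for with raster throughput.
```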
 
Joined
Jun 12, 2022
Messages
64 (0.09/day)
What about a new GPU architecture doctrine where raster and ray tracing cores are modular, so designers could add more ray tracing cores without affecting raster performance?
 
Joined
Jun 28, 2022
Messages
355 (0.54/day)
System Name EA-ZEN
Processor AMD Ryzen 7 5800X3D with -50mW UV
Motherboard Asus X570
Cooling Big Air
Memory 2x16 GB DDR4 3600 CL16
Video Card(s) Asus RTX 2080 Ti Strix highly OC’ed
Storage 1 TB NVME, 500 GB SSD etc
Display(s) 2x 27”, main: curved 144Hz SVA with BLS and HDR
Case Full Tower
Audio Device(s) Z906 5.1 and Audeze Headphones, Shure SM7B mic
Power Supply Enough
Mouse Old but gold
Keyboard Mechanical Cherry Brown
Software Windows 10 Pro
Benchmark Scores A lot
That’s way too soon. First we need efficient code to deal with RT in games and software; then all GPUs can transition to full (or more) RT hardware, and you can get what you want. Essentially the long-term goal is full RT, but we’re still far away from it.
 
Joined
Sep 17, 2014
Messages
20,898 (5.97/day)
Location
The Washing Machine
Processor i7 8700k 4.6Ghz @ 1.24V
Motherboard AsRock Fatal1ty K6 Z370
Cooling beQuiet! Dark Rock Pro 3
Memory 16GB Corsair Vengeance LPX 3200/C16
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Samsung 850 EVO 1TB + Samsung 830 256GB + Crucial BX100 250GB + Toshiba 1TB HDD
Display(s) Gigabyte G34QWC (3440x1440)
Case Fractal Design Define R5
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse XTRFY M42
Keyboard Lenovo Thinkpad Trackpoint II
Software W10 x64
What about a new GPU architecture doctrine where raster and ray tracing cores are modular, so designers could add more ray tracing cores without affecting raster performance?
They're radically different kinds of calculations.

If you look at the papers on Ampere and Turing, you will see that dedicated INT (integer) pipelines were added and the new cores carry those functions, because they're built to handle them faster. Efficiency in GPUs is often obtained by reducing the functionality of cores, making them more single-purpose. An example of that is how Nvidia doesn't offer high-precision floating point on GeForce; it did on Titan, but eventually killed that too.

But in the history of GPU development there is always a shift back and forth, as new functionality is added, and refined, and at some point becomes a known quantity. There's always a phase where the resources a GPU gets are not perfectly aligned with what games want. That's why there are differences in performance between engines/GPU families/games.

Pascal to Turing: less performance per shader, lower perf per clock, too, AND lower clocks, but the functionality was expanded. Ampere iterated on that with a further refinement of shader count, and the sacrifice was made in TDP to remain competitive in raster performance.

Either way, yes, I do agree the ideal situation is one where you don't waste die space on cores that are going to be idle at any point in time. We have yet to see if that is feasible.
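As a very rough sketch of that "idle cores are wasted area" point (the workload mixes below are assumptions, not measurements):

```python
# Rough utilization sketch: dedicated units only earn their die area when
# frames actually use them. The workload mixes are assumptions, not data,
# and the model treats the work as strictly serial for simplicity.

def frame_time(work_ms, throughput):
    """Milliseconds per frame, given each unit's relative throughput."""
    return sum(work_ms[k] / throughput[k] for k in work_ms)

throughput = {"raster": 1.0, "rt": 0.5}       # RT units at half the raster rate

workloads = {
    "raster-only game": {"raster": 12.0, "rt": 0.0},
    "hybrid RT game":   {"raster":  9.0, "rt": 4.0},
}

for name, work in workloads.items():
    t = frame_time(work, throughput)
    rt_busy = (work["rt"] / throughput["rt"]) / t
    print(f"{name:>16}: {t:.1f} ms/frame, RT units busy {100 * rt_busy:.0f}% of it")

# In the raster-only game the RT units sit idle the whole frame -- pure area
# cost. The more frames look like the hybrid case, the more sense it makes to
# spend transistors on RT hardware (same logic as Turing's separate INT pipes).
```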
 
Joined
Jun 28, 2022
Messages
355 (0.54/day)
System Name EA-ZEN
Processor AMD Ryzen 7 5800X3D with -50mW UV
Motherboard Asus X570
Cooling Big Air
Memory 2x16 GB DDR4 3600 CL16
Video Card(s) Asus RTX 2080 Ti Strix highly OC’ed
Storage 1 TB NVME, 500 GB SSD etc
Display(s) 2x 27”, main: curved 144Hz SVA with BLS and HDR
Case Full Tower
Audio Device(s) Z906 5.1 and Audeze Headphones, Shure SM7B mic
Power Supply Enough
Mouse Old but gold
Keyboard Mechanical Cherry Brown
Software Windows 10 Pro
Benchmark Scores A lot
Pascal to Turing: less performance per shader, lower perf per clock, too, AND lower clocks, but the functionality was expanded. Ampere iterated on that with a further refinement of shader count, and the sacrifice was made in TDP to remain competitive in raster performance.
I’m curious how you came to the conclusion that Nvidia made the shaders worse with a new architecture. Turing has higher IPC than Pascal in every way, and the clocks are more or less the same. The only GPU with straight-up lower clocks is the 2080 Ti, because it was a huge GPU (still the biggest gaming GPU ever) and had to make do with a “low” TDP of just 280 W relative to its size.
 
Joined
Jun 12, 2022
Messages
64 (0.09/day)
Is there a future GPU architecture where we ditch raster altogether and use only ray tracing cores? If not the RTX 5000 series, then maybe the RTX 6000 series?

If that happens, would the complete shader system/DirectX/Vulkan have to be rewritten?
 
Joined
May 31, 2016
Messages
4,324 (1.50/day)
Location
Currently Norway
System Name Bro2
Processor Ryzen 5800X
Motherboard Gigabyte X570 Aorus Elite
Cooling Corsair h115i pro rgb
Memory 16GB G.Skill Flare X 3200 CL14 @3800Mhz CL16
Video Card(s) Powercolor 6900 XT Red Devil 1.1v@2400Mhz
Storage M.2 Samsung 970 Evo Plus 500MB/ Samsung 860 Evo 1TB
Display(s) LG 27UD69 UHD / LG 27GN950
Case Fractal Design G
Audio Device(s) Realtec 5.1
Power Supply Seasonic 750W GOLD
Mouse Logitech G402
Keyboard Logitech slim
Software Windows 10 64 bit
I’m curious how you came to the conclusion that Nvidia made the shaders worse with a new architecture. Turing has higher IPC than Pascal in every way, and the clocks are more or less the same. The only GPU with straight-up lower clocks is the 2080 Ti, because it was a huge GPU (still the biggest gaming GPU ever) and had to make do with a “low” TDP of just 280 W relative to its size.
If you look at the 2080 Ti and the 3060 Ti: the 3060 Ti has higher clocks and more cores, and yet it's around 15% slower. Obviously the 2080 Ti is bigger, since it's made on a 12 nm node while the 3060 Ti is on 8 nm, so the 3060 Ti is smaller. Going by cores and clocks it should have been faster, but it isn't. Maybe the memory plays a role here, but I doubt it. So there must be some sort of limitation in the architecture, I suppose.
Same goes for Pascal? For what it's worth, the paper specs already hint at where the gap comes from (clocks and core counts below are approximate reference figures):
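
```python
# Paper-spec comparison for the cards above. Core counts and boost clocks are
# approximate reference figures; real-world boost varies per card.

def fp32_tflops(cores, boost_mhz):
    # cores * clock (MHz) * 2 ops per FMA, scaled from MFLOPS to TFLOPS
    return cores * boost_mhz * 2 / 1e6

cards = {
    "RTX 2080 Ti (Turing, 12 nm)": (4352, 1545),
    "RTX 3060 Ti (Ampere, 8 nm)":  (4864, 1665),
}

for name, (cores, mhz) in cards.items():
    print(f"{name}: ~{fp32_tflops(cores, mhz):.1f} TFLOPS FP32 on paper")

# On paper the 3060 Ti is ahead, yet in games it doesn't actually pull ahead
# of the 2080 Ti. Part of that is how Ampere counts cores: each SM has two
# FP32 datapaths, but one is shared with INT work, so the doubled core count
# rarely turns into doubled shader throughput in real frames.
```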
 
Joined
Sep 17, 2014
Messages
20,898 (5.97/day)
Location
The Washing Machine
Processor i7 8700k 4.6Ghz @ 1.24V
Motherboard AsRock Fatal1ty K6 Z370
Cooling beQuiet! Dark Rock Pro 3
Memory 16GB Corsair Vengeance LPX 3200/C16
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Samsung 850 EVO 1TB + Samsung 830 256GB + Crucial BX100 250GB + Toshiba 1TB HDD
Display(s) Gigabyte G34QWC (3440x1440)
Case Fractal Design Define R5
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse XTRFY M42
Keyboard Lenovo Thinkpad Trackpoint II
Software W10 x64
I’m curious how you came to the conclusion that Nvidia made the shaders worse with a new architecture. Turing has higher IPC than Pascal in every way, and the clocks are more or less the same. The only GPU with straight-up lower clocks is the 2080 Ti, because it was a huge GPU (still the biggest gaming GPU ever) and had to make do with a “low” TDP of just 280 W relative to its size.
You are correct, I mixed things up. Pascal clocks higher overall; a good 100 MHz boost advantage is common, often more.
 