
AMD Radeon RX 6000 Series "Big Navi" GPU Features 320 W TGP, 16 Gbps GDDR6 Memory

Joined
Jun 3, 2010
Messages
2,540 (0.50/day)
Currently, RDNA (and GCN) split instructions into two categories: scalar and vector.
There are also the semi-permanent vector operations (vector-packed scalars, afaik), which are all the buzz.
Shaders run instructions. I'm not entirely sure what you mean by this.
Frontend and backend are different. The GPU has to decode first, then the shaders run the instructions. For the initial period, the shaders don't do much. The graphics command processor & workload managers (four, one per rasterizer) fetch the instructions that the shaders will consume.
 
Joined
Mar 10, 2010
Messages
11,878 (2.29/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R5 5900X/ Intel 8750H
Motherboard Crosshair hero8 impact/Asus
Cooling 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in four sticks./16Gb/16GB
Video Card(s) Powercolour RX7900XT Reference/Rtx 2060
Storage Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme
Display(s) Samsung UAE28"850R 4k freesync.dell shiter
Case Lianli 011 dynamic/strix scar2
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi/Asus stock
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
VR HMD Oculus rift
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
There are also the semi-permanent vector operations (vector-packed scalars, afaik), which are all the buzz.

Frontend and backend are different. The GPU has to decode first, then the shaders run the instructions. For the initial period, the shaders don't do much. The graphics command processor & workload managers (four, one per rasterizer) fetch the instructions that the shaders will consume.
Wouldn't there be a flow through the shaders while the decoders work on the next batch and the batch before is returned to memory, except on startup?
I thought GPUs were made to stream data in and out, not do one job at a time.
The command processor and scheduling keep the flow going.
 
Joined
Apr 24, 2020
Messages
2,569 (1.73/day)
There are also the semi-permanent vector operations (vector-packed scalars, afaik), which are all the buzz.

Those are just vector ops from the perspective of the assembly language.

Frontend and backend are different. The GPU has to decode first, then the shaders run the instructions. For the initial period, the shaders don't do much. The graphics command processor & workload managers (four, one per rasterizer) fetch the instructions that the shaders will consume.

What I'm talking about is in the compute units themselves. See page 12: https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf

[Image: compute unit block diagram from page 12 of the Vega Shader ISA document]


The sALU processes scalar instructions (loops, branching, booleans); the sGPRs hold mostly booleans, but also function pointers, the call stack, and things of that nature.

vALUs process vector instructions, which include those "packed" instructions. If we wanted to get more specific, there are also LDS, load/store, and DPP instructions going to different units. But by and large, the two instruction classes that make up the bulk of what an AMD GPU executes are vector and scalar.

You're right in that the fixed-function pipeline (not shown in the above diagram), in particular rasterization ("ROPs"), constitutes a significant portion of the modern GPU. But you can see that the command-processor is very far away from the vALUs / sALUs inside of the compute units.

Wouldn't there be a flow through the shaders while the decoders work on the next batch and the batch before is returned to memory, except on startup?
I thought GPUs were made to stream data in and out, not do one job at a time.
The command processor and scheduling keep the flow going.

AMD's command processors are poorly documented. I can't find anything that describes their operation very well. (Well... I could read the ROCm source code, but I'm not THAT curious...)

But from my understanding: the command processor simply launches wavefronts. That is: it sets up the initial sGPRs for a workgroup (x, y, and z coordinate of the block), as well as VGPR0, VGPR1, and VGPR2 (for the x, y, and z coordinate of the thread). Additional parameters go into sGPRs (shared between all threads). Then it issues a command that jumps (or function-calls) the compute unit to a location in memory. AMD command processors have a significant amount of hardware scheduling logic for events and ordering of wavefronts: priorities and the like.

But the shader has already been converted into machine code by the OpenCL or Vulkan or DirectX driver, and loaded somewhere. The command processor only has to set up the parameters, and issue a jump command to get a compute unit to that code (once all synchronization functions, such as OpenCL Events, have proven that this particular wavefront is ready to run).
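
To make that concrete, here's a minimal HIP sketch of my own (the kernel name and sizes are just placeholders, not anything from AMD's docs): the block and thread coordinates the kernel reads are exactly the values the command processor seeds into the sGPRs and VGPR0-2 before the wave jumps into the pre-compiled machine code.

Code:
#include <hip/hip_runtime.h>
#include <cstdio>

// Hypothetical kernel: blockIdx/threadIdx come from the registers the
// command processor initializes (workgroup id -> sGPRs, thread id -> VGPR0..2)
// before the wavefront starts executing this already-compiled code.
__global__ void fill(float* out, int n)
{
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n)
        out[gid] = static_cast<float>(gid);
}

int main()
{
    const int n = 1 << 20;
    float* buf = nullptr;
    hipMalloc(reinterpret_cast<void**>(&buf), n * sizeof(float));

    // "Launching" is just the runtime queuing a packet that the command
    // processor turns into wavefront dispatches; the kernel binary was
    // compiled by the driver long before this call.
    hipLaunchKernelGGL(fill, dim3((n + 255) / 256), dim3(256), 0, 0, buf, n);
    hipDeviceSynchronize();

    hipFree(buf);
    printf("done\n");
    return 0;
}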
 

Cheeseball

Not a Potato
Supporter
Joined
Jan 2, 2009
Messages
1,859 (0.33/day)
Location
Pittsburgh, PA
System Name Titan
Processor AMD Ryzen™ 7 7950X3D
Motherboard ASUS ROG Strix X670E-I Gaming WiFi
Cooling ID-COOLING SE-207-XT Slim Snow
Memory TEAMGROUP T-Force Delta RGB 2x16GB DDR5-6000 CL30
Video Card(s) ASRock Radeon RX 7900 XTX 24 GB GDDR6 (MBA)
Storage 2TB Samsung 990 Pro NVMe
Display(s) AOpen Fire Legend 24" (25XV2Q), Dough Spectrum One 27" (Glossy), LG C4 42" (OLED42C4PUA)
Case ASUS Prime AP201 33L White
Audio Device(s) Kanto Audio YU2 and SUB8 Desktop Speakers and Subwoofer, Cloud Alpha Wireless
Power Supply Corsair SF1000L
Mouse Logitech Pro Superlight (White), G303 Shroud Edition
Keyboard Wooting 60HE / NuPhy Air75 v2
VR HMD Occulus Quest 2 128GB
Software Windows 11 Pro 64-bit 23H2 Build 22631.3447
I'll be trying out the RDNA2 cards for the exact same reason as you: 250 W max in my HTPC. But the reason I'm back to Nvidia in the HTPC at the moment is the AMD HDMI audio driver cutting out with Navi cards. It didn't happen when I swapped to an RX 480 or a 2060S, but when I tried a vanilla 5700 the exact same bug reappeared. A Microsoft update was the trigger, but AMD hasn't put out a fix yet, and after 3 months I got bored of watching the thread of people complaining on AMD's forum grow longer without acknowledgement, and moved on.

The new 20.10.1 driver seems to address the HDMI audio issue with AV receivers. I have not tested this on the RX 5700 XT and Onkyo yet.
 
Joined
Sep 3, 2019
Messages
3,020 (1.75/day)
Location
Thessaloniki, Greece
System Name PC on since Aug 2019, 1st CPU R5 3600 + ASUS ROG RX580 8GB >> MSI Gaming X RX5700XT (Jan 2020)
Processor Ryzen 9 5900X (July 2022), 150W PPT limit, 79C temp limit, CO -9~14
Motherboard Gigabyte X570 Aorus Pro (Rev1.0), BIOS F37h, AGESA V2 1.2.0.B
Cooling Arctic Liquid Freezer II 420mm Rev7 with off center mount for Ryzen, TIM: Kryonaut
Memory 2x16GB G.Skill Trident Z Neo GTZN (July 2022) 3600MHz 1.42V CL16-16-16-16-32-48 1T, tRFC:288, B-die
Video Card(s) Sapphire Nitro+ RX 7900XTX (Dec 2023) 314~465W (390W current) PowerLimit, 1060mV, Adrenalin v24.5.1
Storage Samsung NVMe: 980Pro 1TB(OS 2022), 970Pro 512GB(2019) / SATA-III: 850Pro 1TB(2015) 860Evo 1TB(2020)
Display(s) Dell Alienware AW3423DW 34" QD-OLED curved (1800R), 3440x1440 144Hz (max 175Hz) HDR1000, VRR on
Case None... naked on desk
Audio Device(s) Astro A50 headset
Power Supply Corsair HX750i, 80+ Platinum, 93% (250~700W), modular, single/dual rail (switch)
Mouse Logitech MX Master (Gen1)
Keyboard Logitech G15 (Gen2) w/ LCDSirReal applet
Software Windows 11 Home 64bit (v23H2, OSB 22631.3155)
Performance per watt did go up on Ampere, but that's to be expected given that Nvidia moved from TSMC's 12 nm to Samsung's 8nm 8LPP, a 10nm-class extension node. What is not impressive is only a 10% performance-per-watt increase over Turing while being built on a ~25% denser node. The RDNA2 arch being on 7nm+ looks to be even worse efficiency-wise, given that the density of 7nm+ is much higher, but let's wait for the actual benchmarks.
You can't actually compare nodes, nor estimate performance/W gains by node shrink alone. Look at Zen 3: on the exact same node as Zen 2, it has ~20% better perf/W just from architectural improvements. Don't confuse this with higher IPC at the same clock speed. If you increase IPC alone without perf/W improvements, the power consumption goes up. It's physics; it's not only clock speed that draws power.

RDNA2 is on a better 7 nm node than RDNA1 (the 7NP DUV node, not 7nm+ EUV) that, by rumor, offers 10-15% higher density; combined with the improvements in the RDNA2 architecture, it is "said" to have +50% better perf/W.

If true, where exactly that places the 6900 against Ampere is yet to be seen.

It seems like Ampere so far, and the 5700 XT, are clocked out of their efficiency curves... and it seems the same with RDNA2 from the rumors so far... I don't think any of the AMD fanatics saw similar power envelopes coming (they are awfully quiet here... go figure), and here we are.
I was expecting it... the 300~320 W TBP. It couldn't be anything else in order to offer similar 3080 perf. Fewer watts didn't add up, and why shouldn't AMD use as many watts as Ampere does? Again, my thoughts.

—————————————

Personally I don't care about a GPU drawing 350 or 400 W. I used to have an R9 390X OC model with a 2.5-slot cooler and it was just fine. That was rated 375 W TBP.
The 5700 XT now has more than 2x the perf with 240 W peaks and 220 W average power draw.

Every flagship GPU is set to work (when maxed out) outside its efficiency curve, unless there is no competition.

AMD drivers, apart from the power and performance sliders, also offer the "Chill" function. You can set a min/max FPS target. In most games, if I use this feature to cap FPS at min/max 40/60, the average draw of the card is less than 100 W.
My monitor is 60 Hz, and that's the target while moving; if you stop moving in game, the FPS drops to 40. I can set it to 60/60 if I like.
My monitor is a 13.5-year-old 1920x1200 16:10, and I was planning to switch to ultrawide 6 months ago, but the human malware changed that, along with other aspects of my (our) life (lives).

There is no point in me complaining about the amount of power GPUs are drawing; buy a lower-tier model. And perf/W is a continuously improving matter. We just can't always use flagship models as examples.
 
Joined
Apr 24, 2020
Messages
2,569 (1.73/day)
The new 20.10.1 driver seems to address the HDMI audio issue with AV receivers. I have not tested this on the RX 5700 XT and Onkyo yet.

My real issue with the RX 5700 XT series is the lack of ROCm support.

Unofficial ROCm support is beginning to happen in ROCm 3.7 (released in August 2020). But there's been over a year where compute fans were unable to use ROCm at all on Navi. To be fair: AMD never promised ROCm support on all of their cards. But it really knocks the wind from people's sails when they're unable to "play" with their cards. Even older cards like the RX 550 never really got ROCm support (only the RX 580 got official support).

For now, my recommendation for AMD GPU-compute fans is to read the documentation carefully before buying. Wait for a Radeon Instinct ("Machine Intelligence", MI) card, like the MI25 (aka Vega 64), to come out before buying that model. AMD ROCm is clearly aimed at their MI platform and not really their consumer cards. The MI6 (roughly an RX 480/580-class Polaris) and MI8 (roughly an R9 Fury-class Fiji) have good support, but not necessarily other cards.
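
If you want to check what ROCm actually sees before committing to a card, a quick HIP device query goes something like this (a rough sketch; the exact hipDeviceProp_t field names, e.g. gcnArchName, vary a bit between ROCm versions):

Code:
#include <hip/hip_runtime.h>
#include <cstdio>

int main()
{
    int count = 0;
    if (hipGetDeviceCount(&count) != hipSuccess || count == 0) {
        printf("ROCm/HIP sees no usable GPUs\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        hipDeviceProp_t prop;
        hipGetDeviceProperties(&prop, i);
        // gcnArchName reports the ISA target (e.g. gfx906, gfx1010);
        // on older ROCm releases this field may be named differently.
        printf("GPU %d: %s (%s), %zu MB VRAM\n",
               i, prop.name, prop.gcnArchName,
               prop.totalGlobalMem / (1024 * 1024));
    }
    return 0;
}

The rocminfo tool that ships with ROCm prints much the same information from the runtime's point of view.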

---------

ROCm suddenly getting support for NAVI in 3.7 suggests that this new NAVI 2x series might have a MI-card in the works, and therefore might be compatible with ROCm.
 
Joined
Jan 14, 2019
Messages
10,062 (5.15/day)
Location
Midlands, UK
System Name Holiday Season Budget Computer (HSBC)
Processor AMD Ryzen 7 7700X
Motherboard MSi PRO B650M-A WiFi
Cooling be quiet! Dark Rock 4
Memory 2x 16 GB Corsair Vengeance EXPO DDR5-6000
Video Card(s) Sapphire Pulse Radeon RX 6500 XT 4 GB
Storage 2 TB Corsair MP600 GS, 2 TB Corsair MP600 R2, 4 + 8 TB Seagate Barracuda 3.5"
Display(s) Dell S3422DWG, 7" Waveshare touchscreen
Case Kolink Citadel Mesh black
Audio Device(s) Logitech Z333 2.1 speakers, AKG Y50 headphones
Power Supply Seasonic Prime GX-750
Mouse Logitech MX Master 2S
Keyboard Logitech G413 SE
Software Windows 10 Pro
It's a concern for me, UK power isn't cheap. Having said that, as you say, power use depends on load, and few cards run flat out much of the day; even Folding@home or mining doesn't max out a card's power use in reality.
Still, some games are going to cook people while gaming; warm winter perhaps. Hopefully that lotto ticket's not as shit as all my last ones.
I just made my calculations a few posts above yours. If you pay 20p per kWh, then 2 hours of gaming (or folding, or whatever) every day on a computer that eats 300 W more than the one you currently own will increase your bills by £3.65 a month! If you fold 24/7, fair enough, but other than that, I wouldn't worry too much.
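
(For anyone who wants to plug in their own tariff, the sum is just extra kWh times price; a quick sketch with the same 300 W / 2 hours / 20p numbers:)

Code:
#include <cstdio>

int main()
{
    const double extra_watts   = 300.0;  // extra draw vs. the old PC
    const double hours_per_day = 2.0;    // gaming/folding time
    const double price_per_kwh = 0.20;   // 20p per kWh
    const double days          = 30.4;   // average month

    double kwh  = extra_watts / 1000.0 * hours_per_day * days;
    double cost = kwh * price_per_kwh;
    printf("%.1f kWh -> GBP %.2f per month\n", kwh, cost);  // ~18.2 kWh -> ~3.65
    return 0;
}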
 
Joined
Sep 17, 2014
Messages
21,099 (5.97/day)
Location
The Washing Machine
Processor i7 8700k 4.6Ghz @ 1.24V
Motherboard AsRock Fatal1ty K6 Z370
Cooling beQuiet! Dark Rock Pro 3
Memory 16GB Corsair Vengeance LPX 3200/C16
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Samsung 850 EVO 1TB + Samsung 830 256GB + Crucial BX100 250GB + Toshiba 1TB HDD
Display(s) Gigabyte G34QWC (3440x1440)
Case Fractal Design Define R5
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse XTRFY M42
Keyboard Lenovo Thinkpad Trackpoint II
Software W10 x64
I just made my calculations a few posts above yours. If you pay 20p per kWh, then 2 hours of gaming (or folding, or whatever) every day with a computer that eats 300 W more than the one you currently own will increase your bills by £3.65 a month! If you fold 24/7, fair enough, but other than that, I wouldn't worry too much.

£3.50 is a few pints in the pub you can't go to. Definitely worth considering, I say.
 
Joined
Mar 10, 2010
Messages
11,878 (2.29/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R5 5900X/ Intel 8750H
Motherboard Crosshair hero8 impact/Asus
Cooling 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in four sticks./16Gb/16GB
Video Card(s) Powercolour RX7900XT Reference/Rtx 2060
Storage Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme
Display(s) Samsung UAE28"850R 4k freesync.dell shiter
Case Lianli 011 dynamic/strix scar2
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi/Asus stock
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
VR HMD Oculus rift
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
Those are just vector ops from the perspective of the assembly language.



What I'm talking about is in the compute units themselves. See page 12: https://developer.amd.com/wp-content/resources/Vega_Shader_ISA_28July2017.pdf

[Image: compute unit block diagram from page 12 of the Vega Shader ISA document]

The sALU processes scalar instructions (loops, branching, booleans); the sGPRs hold mostly booleans, but also function pointers, the call stack, and things of that nature.

vALUs process vector instructions, which include those "packed" instructions. If we wanted to get more specific, there are also LDS, load/store, and DPP instructions going to different units. But by and large, the two instruction classes that make up the bulk of what an AMD GPU executes are vector and scalar.

You're right in that the fixed-function pipeline (not shown in the above diagram), in particular rasterization ("ROPs"), constitutes a significant portion of the modern GPU. But you can see that the command-processor is very far away from the vALUs / sALUs inside of the compute units.



AMD's command processors are poorly documented. I can't find anything that describes their operation very well. (Well... I could read the ROCm source code, but I'm not THAT curious...)

But from my understanding: the command processor simply launches wavefronts. That is: it sets up the initial sGPRs for a workgroup (x, y, and z coordinate of the block), as well as VGPR0, VGPR1, and VGPR2 (for the x, y, and z coordinate of the thread). Additional parameters go into sGPRs (shared between all threads). Then it issues a command that jumps (or function-calls) the compute unit to a location in memory. AMD command processors have a significant amount of hardware scheduling logic for events and ordering of wavefronts: priorities and the like.

But the shader has already been converted into machine code by the OpenCL or Vulkan or DirectX driver, and loaded somewhere. The command processor only has to set up the parameters, and issue a jump command to get a compute unit to that code (once all synchronization functions, such as OpenCL Events, have proven that this particular wavefront is ready to run).
Sooo, work Does flow through then? Lol Ty.
 
Joined
Dec 29, 2010
Messages
3,522 (0.72/day)
Processor AMD 5900x
Motherboard Asus x570 Strix-E
Cooling Hardware Labs
Memory G.Skill 4000c17 2x16gb
Video Card(s) RTX 3090
Storage Sabrent
Display(s) Samsung G9
Case Phanteks 719
Audio Device(s) Fiio K5 Pro
Power Supply EVGA 1000 P2
Mouse Logitech G600
Keyboard Corsair K95
Does anybody not care about electricity bills anymore, or do most not have the responsibility of paying the bills? Who would buy these cards?

That's ironic...

It is expected and rightly deserved by AMD. I expect this time around both their CPU & GPU are fully matured, and we won't see the BIOS and software issues that happened last year, at least not on that scale. However, when it comes to power consumption I do not believe the 65 W. It will consume just as much as the i5-10600K.

The same people who buy 10600K CPUs and think they only use 125 W?? The same people who think they're saving the world by consuming less power but are actually running at PL2 all the time, thus consuming way more power?
 
Joined
Mar 10, 2010
Messages
11,878 (2.29/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R5 5900X/ Intel 8750H
Motherboard Crosshair hero8 impact/Asus
Cooling 360EK extreme rad+ 360$EK slim all push, cpu ek suprim Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in four sticks./16Gb/16GB
Video Card(s) Powercolour RX7900XT Reference/Rtx 2060
Storage Silicon power 2TB nvme/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd/1Tb nvme
Display(s) Samsung UAE28"850R 4k freesync.dell shiter
Case Lianli 011 dynamic/strix scar2
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi/Asus stock
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
VR HMD Oculus rift
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
I just made my calculations a few posts above yours. If you pay 20p per kWh, then 2 hours of gaming (or folding, or whatever) every day on a computer that eats 300 W more than the one you currently own will increase your bills by £3.65 a month! If you fold 24/7, fair enough, but other than that, I wouldn't worry too much.
Technically I don't directly pay the bill ;) she does ;) damn electric company :D.
 
Joined
Jan 3, 2015
Messages
2,902 (0.85/day)
System Name The beast and the little runt.
Processor Ryzen 5 5600X - Ryzen 9 5950X
Motherboard ASUS ROG STRIX B550-I GAMING - ASUS ROG Crosshair VIII Dark Hero X570
Cooling Noctua NH-L9x65 SE-AM4a - NH-D15 chromax.black with IPPC Industrial 3000 RPM 120/140 MM fans.
Memory G.SKILL TRIDENT Z ROYAL GOLD/SILVER 32 GB (2 x 16 GB and 4 x 8 GB) 3600 MHz CL14-15-15-35 1.45 volts
Video Card(s) GIGABYTE RTX 4060 OC LOW PROFILE - GIGABYTE RTX 4090 GAMING OC
Storage Samsung 980 PRO 1 TB + 2 TB - Samsung 870 EVO 4 TB - 2 x WD RED PRO 16 GB + WD ULTRASTAR 22 TB
Display(s) Asus 27" TUF VG27AQL1A and a Dell 24" for dual setup
Case Phanteks Enthoo 719/LUXE 2 BLACK
Audio Device(s) Onboard on both boards
Power Supply Phanteks Revolt X 1200W
Mouse Logitech G903 Lightspeed Wireless Gaming Mouse
Keyboard Logitech G910 Orion Spectrum
Software WINDOWS 10 PRO 64 BITS on both systems
Benchmark Scores Se more about my 2 in 1 system here: kortlink.dk/2ca4x
320 watts is a lot, whether it's Nvidia or AMD. But this is for those who might not want their card to consume 320 W+ all the time, and I am one of them.

There are ways to keep consumption down. You can limit FPS in some games, you can activate v-sync so you run at 60 FPS and keep GPU load down, or you can download, for example, MSI Afterburner. There is this little slider called power target; with it you can limit the maximum power the card is allowed to use. From what I know, the RTX 3080 can be limited all the way down to only 100 watts. Undervolting can also save you some watts; again, it seems the RTX 3080 can be good for up to a 100-watt saving just by limiting the maximum voltage to the GPU, without too much performance loss. I have used the power target slider for years to dial in a fitting power consumption.
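
(If you'd rather script that power-target idea than drag the Afterburner slider, Nvidia exposes roughly the same control through NVML; a rough sketch, assuming a supported card and admin/root rights, with the 220 W cap as a placeholder value:)

Code:
#include <nvml.h>
#include <cstdio>

int main()
{
    if (nvmlInit() != NVML_SUCCESS) {
        printf("NVML init failed\n");
        return 1;
    }

    nvmlDevice_t gpu;
    nvmlDeviceGetHandleByIndex(0, &gpu);

    unsigned int minLimit = 0, maxLimit = 0;  // reported in milliwatts
    nvmlDeviceGetPowerManagementLimitConstraints(gpu, &minLimit, &maxLimit);
    printf("allowed power limit: %u..%u W\n", minLimit / 1000, maxLimit / 1000);

    // Cap the card at 220 W (placeholder) -- same effect as dragging the
    // "power target" slider down in Afterburner.
    if (nvmlDeviceSetPowerManagementLimit(gpu, 220000) == NVML_SUCCESS)
        printf("power limit set to 220 W\n");
    else
        printf("could not set limit (needs admin rights / supported GPU)\n");

    nvmlShutdown();
    return 0;
}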

I am not expecting to get an RDNA2-based card, but I do hope AMD can still provide a good amount of resistance to the RTX 3080, because, as we all know, competition is good for consumer pricing.
 
Joined
Jun 3, 2010
Messages
2,540 (0.50/day)
But you can see that the command-processor is very far away from the vALUs / sALUs inside of the compute units.
Well, far or near, on the time scale they are placed consecutively; one precedes the other, which puts the pressure on the GCN frontend.
vALUs process vector instructions, which include those "packed" instructions.
Semi-persistent stuff are scalar-timed vector ops which save on critical timing. They consume vector memory in a scalar fashion, which saves on decode latency since it follows the developer's instruction and allows for lane intrinsics and full memory utilization.
But from my understanding: the command processor simply launches wavefronts.
Yes. There are up to 2,560 work-items in flight per CU (40 wavefronts of 64 lanes), and there are 64 CUs per command processor. It takes 64 cycles for each CU to get one workgroup issued, and thereafter 64 cycles for every wave per CU. It takes a lot of time until the shaders are fully occupied.
The command processor only has to issue a jump command to get a compute unit to that code.
[Image attachment: photo of a slide from Timothy Lottes' "engine optimization hot lap" presentation]
 
Joined
Dec 31, 2009
Messages
19,366 (3.69/day)
Benchmark Scores Faster than yours... I'd bet on it. :)
You pay for a bigger and wider GPU. Effectively, you're paying for the silicon (and for the size of a successful die: the larger the die, the harder it is to produce and, naturally, the more expensive it is).

Whether you run it at maximum power, or minimum power, is up to you. Laptop chips, such as the Laptop RTX 2070 Super, are effectively underclocked versions of the desktop chip. The same thing, just running at lower power (and greater energy efficiency) for portability reasons. Similarly, a mini-PC user may have a harder time cooling down their computer, or maybe a silent-build wants to reduce the fan noise.

A wider GPU (ex: 3090) will still provide more power-efficiency than a narrower GPU (ex: 3070), even if you downclock a 3090 to 3080 or 3070 levels. More performance at the same levels of power, that's the main benefit of "more silicon".

--------

Power consumption scales with something like the voltage cubed (!!!). If you reduce voltage by 10%, you get something like 30% less power draw. Dropping 10% of your voltage costs roughly 10% of frequency, but power usage drops by a far greater amount.
We can think of situations where it could be worthwhile. I'm not talking about shoehorning these things into tiny boxes, etc. We can all think of exceptions. ;)
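
(Rough numbers behind the voltage-cubed rule of thumb quoted above, assuming dynamic power ~ C·V²·f and frequency scaling roughly linearly with voltage; a back-of-the-envelope sketch, not measured data:)

Code:
#include <cstdio>

int main()
{
    // Dynamic power ~ C * V^2 * f; assume f scales roughly linearly with V,
    // so P ~ V^3. Purely illustrative numbers.
    const double v_scale = 0.90;                        // -10% voltage
    const double f_scale = v_scale;                     // assume -10% frequency too
    const double p_scale = v_scale * v_scale * f_scale;

    printf("frequency: %.0f%% of stock\n", f_scale * 100.0);  // ~90%
    printf("power:     %.0f%% of stock\n", p_scale * 100.0);  // ~73%, i.e. ~27% less
    return 0;
}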
 
Joined
Jan 14, 2019
Messages
10,062 (5.15/day)
Location
Midlands, UK
System Name Holiday Season Budget Computer (HSBC)
Processor AMD Ryzen 7 7700X
Motherboard MSi PRO B650M-A WiFi
Cooling be quiet! Dark Rock 4
Memory 2x 16 GB Corsair Vengeance EXPO DDR5-6000
Video Card(s) Sapphire Pulse Radeon RX 6500 XT 4 GB
Storage 2 TB Corsair MP600 GS, 2 TB Corsair MP600 R2, 4 + 8 TB Seagate Barracuda 3.5"
Display(s) Dell S3422DWG, 7" Waveshare touchscreen
Case Kolink Citadel Mesh black
Audio Device(s) Logitech Z333 2.1 speakers, AKG Y50 headphones
Power Supply Seasonic Prime GX-750
Mouse Logitech MX Master 2S
Keyboard Logitech G413 SE
Software Windows 10 Pro
Technically I don't directly pay the bill ;) she does ;) damn electric company :D.
And in my case, she shares the costs of living, so with my calculations, I would only pay £1.82 more per month. :laugh: I really don't understand why some of you guys are so scared of power-hungry PC components (unless you run your PCs on full load 24/7).
 
Joined
Oct 10, 2018
Messages
943 (0.46/day)
I don't think I've ever once thought of the electricity bill when it comes to computers, except when it comes to convincing the wife that upgrading will actually SAVE us money. "Honey, it literally pays for itself!"

We have an 18k BTU mini-split that runs in our 1300 sq. ft. garage virtually 24/7. The 3-5 PC's in the home that run at any given moment are NOTHING compared to that.

When I buy a new system, I have to pay wifey tax. Which in the end costs me the same as a new system and in many cases more! But at least, everyone is happy.
 
Joined
Apr 24, 2020
Messages
2,569 (1.73/day)
Yes. There are up to 2,560 work-items in flight per CU (40 wavefronts of 64 lanes), and there are 64 CUs per command processor. It takes 64 cycles for each CU to get one workgroup issued, and thereafter 64 cycles for every wave per CU. It takes a lot of time until the shaders are fully occupied.

By my tests, it takes 750 clock cycles to read a single wavefront's worth of data from VRAM (64x 32-bit reads). So on the timescales of computations, 64 cycles isn't very much. It's certainly non-negligible, but I expect that the typical shader will at least read one value from memory, then write one value to memory (or take 1500 clocks), plus all of the math operations it has to do. If you're doing heavy math, that will only increase the number of cycles per shader.

If you are shader-launch constrained, it isn't a big deal to have a for(int i=0; i<16; i++){} statement wrapping your shader code. Just loop your shader 16 times before returning.
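
A minimal HIP sketch of that idea (my own illustration; the kernel, the x2 scale factor, and the factor of 16 are just placeholders): each thread handles 16 elements before returning, so the launch overhead is paid once per 16 items instead of once per item.

Code:
#include <hip/hip_runtime.h>

// Hypothetical kernel: each thread loops over WORK_PER_THREAD elements,
// amortizing the wavefront-launch cost over more useful work.
constexpr int WORK_PER_THREAD = 16;

__global__ void scale(float* data, int n, float k)
{
    int base = (blockIdx.x * blockDim.x + threadIdx.x) * WORK_PER_THREAD;
    for (int i = 0; i < WORK_PER_THREAD; ++i) {  // the "wrap it in a loop" trick
        int idx = base + i;
        if (idx < n)
            data[idx] *= k;
    }
}

int main()
{
    const int n = 1 << 20;
    float* d = nullptr;
    hipMalloc(reinterpret_cast<void**>(&d), n * sizeof(float));

    const int threads = 256;
    const int blocks  = (n + threads * WORK_PER_THREAD - 1) / (threads * WORK_PER_THREAD);
    // Far fewer wavefronts than a one-item-per-thread launch would need.
    hipLaunchKernelGGL(scale, dim3(blocks), dim3(threads), 0, 0, d, n, 2.0f);
    hipDeviceSynchronize();

    hipFree(d);
    return 0;
}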


Yeah, I remember seeing the slide but I couldn't remember where to find it. Thanks for the reminder. You'd think something like that would be in the ISA. Really, AMD needs to put out a new optimization guide that contains information like this (they haven't written one since the 7950 series).
 
Joined
Feb 21, 2006
Messages
2,007 (0.30/day)
Location
Toronto, Ontario
System Name The Expanse
Processor AMD Ryzen 7 5800X3D
Motherboard Asus Prime X570-Pro BIOS 5013 AM4 AGESA V2 PI 1.2.0.Ca.
Cooling Corsair H150i Pro
Memory 32GB GSkill Trident RGB DDR4-3200 14-14-14-34-1T (B-Die)
Video Card(s) AMD Radeon RX 7900 XTX 24GB (24.5.1)
Storage WD SN850X 2TB / Corsair MP600 1TB / Samsung 860Evo 1TB x2 Raid 0 / Asus NAS AS1004T V2 14TB
Display(s) LG 34GP83A-B 34 Inch 21: 9 UltraGear Curved QHD (3440 x 1440) 1ms Nano IPS 160Hz
Case Fractal Design Meshify S2
Audio Device(s) Creative X-Fi + Logitech Z-5500 + HS80 Wireless
Power Supply Corsair AX850 Titanium
Mouse Corsair Dark Core RGB SE
Keyboard Corsair K100
Software Windows 10 Pro x64 22H2
Benchmark Scores 3800X https://valid.x86.fr/1zr4a5 5800X https://valid.x86.fr/2dey9c
Looks like this gen of GPUs are all power-hungry. Efficiency is out of the window!

Pretty much to be expected; everyone is trying to make 4K playable with the new hardware, and that was not going to happen on a low power budget.

Does anybody not care about electricity bills anymore, or do most not have the responsibility of paying the bills? Who would buy these cards?

You assume they cannot afford it and that everyone pays the same price for power.
 
Joined
Oct 10, 2018
Messages
943 (0.46/day)
That's ironic...



The same people who buy 10600K CPUs and think they only use 125 W?? The same people who think they're saving the world by consuming less power but are actually running at PL2 all the time, thus consuming way more power?
Flattering to see you search my posts in other topics, but I still do not understand what you tried to say here. Perhaps I just had a very boring day at work and wasn't focused enough...
 
Joined
Jun 3, 2010
Messages
2,540 (0.50/day)
Yeah, I remember seeing the slide but I couldn't remember where to find it.
It is the 'engine optimization hot lap' by Timothy Lottes.
By my tests, it takes 750 clock cycles to read a single wavefront's worth of data from VRAM (64x 32-bit reads). So on the timescales of computations, 64 cycles isn't very much. It's certainly non-negligible, but I expect that the typical shader will at least read one value from memory, then write one value to memory (or take 1500 clocks), plus all of the math operations it has to do. If you're doing heavy math, that will only increase the number of cycles per shader.
[Slide images: slide_8.jpg and slide_13.jpg]


There is also, "AMD GPU Hardware Basics".
Basically, a hodge-podge of reasons why we cannot keep GPUs busy. Pretty funny stuff; an engineered list of excuses for why their hardware doesn't work.
 
Joined
Dec 29, 2010
Messages
3,522 (0.72/day)
Processor AMD 5900x
Motherboard Asus x570 Strix-E
Cooling Hardware Labs
Memory G.Skill 4000c17 2x16gb
Video Card(s) RTX 3090
Storage Sabrent
Display(s) Samsung G9
Case Phanteks 719
Audio Device(s) Fiio K5 Pro
Power Supply EVGA 1000 P2
Mouse Logitech G600
Keyboard Corsair K95
Flattering to see you search my posts in other topics, but I still do not understand what you tried to say here. Perhaps I just had a very boring day at work and wasn't focused enough...

Nah, I was just reading the other thread, flabbergasted at how misinformed you are, and it was ironic to see it here.

Pretty much to be expected; everyone is trying to make 4K playable with the new hardware, and that was not going to happen on a low power budget.

Good point, especially considering the pixel count is fourfold at 4K vs 1080p. These new GPUs' power draws have not risen by the same factor as the pixel count.
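
(Quick check of the fourfold figure:)

Code:
#include <cstdio>

int main()
{
    const long uhd = 3840L * 2160L;  // 8,294,400 pixels at 4K
    const long fhd = 1920L * 1080L;  // 2,073,600 pixels at 1080p
    printf("4K has %.1fx the pixels of 1080p\n", static_cast<double>(uhd) / fhd);  // 4.0x
    return 0;
}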
 
Joined
Oct 10, 2018
Messages
943 (0.46/day)
Pretty much to be expected; everyone is trying to make 4K playable with the new hardware, and that was not going to happen on a low power budget.

I cannot speak about the new Radeons yet, but with Ampere, I would have settled for a 25% improvement in performance whilst keeping the same power envelope.
Of course, there will be many people happy with the current situation as well. So I can see and understand both sides of the argument.
 
Joined
Sep 17, 2019
Messages
452 (0.26/day)
Well, it looks like I'll be keeping my 5700 for a while. My entire load when running with this card is 247 watts max from the wall outlet. Do you think I'm going to buy a video card that is going to add another 200 watts without a 150% increase in performance???

Absolutely not. I bitched about Nvidia and their wattage vs performance, and when this card comes out I will bitch about that one too.
This has never been, nor ever will be, Nvidia vs AMD. This is, and always will be, about the best bang for the buck.
 
Joined
Feb 18, 2017
Messages
688 (0.26/day)
Here it's 255W.

You get more framerate with the 3080 despite the insane power draw; some say it's the most power-per-frame-efficient GPU out there.

But, I agree with you... I wish they could have made something less hungry. Imagine how amazing it would be if we could get a <200 W card that can beat the 2080 Ti.

Of course it's the most power-per-frame-efficient GPU, but compare it against the 1080-over-980 jump: the 1080 gained nearly as much performance as the 3080 did (a bit more for the 1080, actually), yet the efficiency gain there was more than 3x bigger (18% vs. 59%).
 
Joined
Dec 29, 2010
Messages
3,522 (0.72/day)
Processor AMD 5900x
Motherboard Asus x570 Strix-E
Cooling Hardware Labs
Memory G.Skill 4000c17 2x16gb
Video Card(s) RTX 3090
Storage Sabrent
Display(s) Samsung G9
Case Phanteks 719
Audio Device(s) Fiio K5 Pro
Power Supply EVGA 1000 P2
Mouse Logitech G600
Keyboard Corsair K95

Interesting. Igor is calculating what they expect it to be. The tweet source just lists TGP, not TBP, which is what Igor's revised list shows. Both are, generally speaking, in line with each other.

Again, the power draw numbers are not real, so relax until we have actual real numbers. But don't be surprised if they end up in the same range as Nvidia's, because it will take MORE POWER to run realistic framerates at 4K; the pixel count there is really steep!
 