Monday, May 13th 2024
AMD RDNA 5 a "Clean Sheet" Graphics Architecture, RDNA 4 Merely Corrects a Bug Over RDNA 3
AMD's future RDNA 5 graphics architecture will bear a "clean sheet" design, and may not even carry the RDNA branding, says WJM47196, a source of AMD leaks on ChipHell. Two generations ahead of the current RDNA 3 architecture powering the Radeon RX 7000 series discrete GPUs, RDNA 5 could see AMD reimagine the GPU and its key components, much in the same way RDNA did over the older "Vega" architecture, bringing a significant performance/watt jump that AMD then built upon with its successful RDNA 2-powered Radeon RX 6000 series.
Performance per watt is the biggest metric on which a generation of GPUs can be assessed, and analysts believe that RDNA 3 missed the mark on generational performance/watt gains despite the switch from the 7 nm DUV process to the more advanced 5 nm EUV process. AMD's decision to disaggregate the GPU, with some of its components built on the older 6 nm node, may also have impacted the performance/watt curve. The leaker also makes the sensational claim that "Navi 31" was originally supposed to feature 192 MB of Infinity Cache, which would have meant 32 MB per memory cache die (MCD). The company instead went with 16 MB per MCD, or just 96 MB per GPU, a figure that shrinks further as AMD segmented the RX 7900 XT and RX 7900 GRE by disabling one or two MCDs. The upcoming RDNA 4 architecture will correct some of the glaring component-level problems causing the performance/watt curve to waver on RDNA 3, and the top RDNA 4 part could end up with performance comparable to the current RX 7900 series, while sitting a segment lower and being a smaller GPU overall. In case you missed it, AMD will not make a big GPU to succeed "Navi 31" and "Navi 21" for the RDNA 4 generation, but will rather focus on the performance segment, offering more bang for the buck well under the $800 mark, so it can claw back some market share from NVIDIA in the performance, mid-range, and mainstream product segments. While it remains to be seen whether RDNA 5 will get AMD back into the enthusiast segment, it is expected to bring a significant gain in performance due to the re-architected design.
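For reference, the cache arithmetic in the leak works out as follows (a quick sketch; Navi 31's six-MCD layout and per-SKU MCD counts are public specifications, while the 192 MB figure is the leak's claim):

```python
# Navi 31 carries 6 memory cache dies (MCDs).
MCDS = 6
RUMORED_PER_MCD = 32   # MB, per the leak's claimed original design
SHIPPED_PER_MCD = 16   # MB, as actually shipped

print(MCDS * RUMORED_PER_MCD)      # 192 MB -- the rumored total
print(MCDS * SHIPPED_PER_MCD)      # 96 MB  -- RX 7900 XTX (all 6 MCDs)
print((MCDS - 1) * SHIPPED_PER_MCD)  # 80 MB -- RX 7900 XT (1 MCD disabled)
print((MCDS - 2) * SHIPPED_PER_MCD)  # 64 MB -- RX 7900 GRE (2 MCDs disabled)
```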
One rumored aspect of RDNA 4 that even this source agrees with is that AMD is working to significantly improve its ray tracing performance by redesigning its hardware. While RDNA 3 builds on the Ray Accelerator component AMD introduced with RDNA 2, with certain optimizations yielding a 50% generational improvement in ray testing and intersection performance, RDNA 4 could see AMD put more of the ray tracing workload through fixed-function accelerators, unburdening the shader engines. This significant improvement in ray tracing performance, performance/watt improvements at an architectural level, and the switch to a newer foundry node such as 4 nm or 3 nm are how AMD ends up with a new generation on its hands.
AMD is expected to unveil RDNA 4 this year, and if we're lucky, we might see a teaser at Computex 2024 next month.
Sources:
wjm47196 (ChipHell), VideoCardz
150 Comments on AMD RDNA 5 a "Clean Sheet" Graphics Architecture, RDNA 4 Merely Corrects a Bug Over RDNA 3
Also, they can compensate by using faster architectures on older/cheaper processes?
I still don't know why they haven't released a pipecleaner, 150 mm^2 GPU built on the newer TSMC N4 or TSMC N3 processes?
Bleeding edge manufacturing nodes, and the price-bidding war to win allocation means that N4 and N3 are an order of magnitude more expensive than the interposer/assembly costs. Those are rapidly becoming irrelevant, too - since AMD have been doing it for so long that it's a solved problem with plenty of experience and a relatively smooth/effortless process now.
They won't release lower-end parts on N4 and N3 simply because the profit margins for those lower end, smaller dies don't actually merit the high cost AMD pays TSMC for the flagship nodes.
You see, AMD took the approach of designing the top architecture and then using it for derivative product ranges. This is the VAG of silicon. So EPYC/Instinct is MAN/Bugatti, while Ryzen and Radeon are the A6, Passat, Golf and Polo; and Threadripper/Radeon Pro sit somewhere between Rolls-Royce and Crafter. And this is a brilliant strategy, to be honest. And this is why AMD holds on to it so much, because it brought them a fortune they never had before. That's why, I think, AMD isn't going to cut the MCM/MCD design for consumer grade cards (with the possibility of lower-tier chips joining the MCM design as well), but will improve it instead, akin to how Intel perseveres with Arc. Because it's much cheaper and easier to keep the same approach for all products and just rectify the issues, rather than dedicate budget to developing a separate architecture.
So I guess that although the CDNA and RDNA architectures are different, the ideas, technology, design, and execution might have much in common, sans video output.
Thus the problem might be specifically with maintaining it for "multipurpose"/gaming use, where the frequencies are higher and the load is variable. So the strain on the hardware is not constant and can easily exceed the chip/link capabilities during load spikes. These are just layman's speculations, but I hope you got the point.
GPU market is in such a sad state in the last ~5 years...
The cost overhead of chiplets is vastly outweighed by the cost savings. By splitting a design into smaller chiplets you increase the number of good chips yielded per wafer. The exact increase depends on the defect density of the node, and the higher the defect density, the greater the benefit chiplets bring. Even at TSMC 5 nm's defect density of 0.1 per square cm, the number of additional chips yielded is significant, let alone on 3 nm, where TSMC is currently having yield issues. This goes triple for GPUs, which are huge dies that disproportionately benefit from the disaggregation that chiplets bring.
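To put some numbers on that, here's a rough sketch using the simple Poisson defect model (yield = e^(-area x defect density)); the 600 mm² monolithic die and 100 mm² chiplet sizes are illustrative round numbers, not AMD's actual figures:

```python
import math

def poisson_yield(area_mm2, d0_per_cm2):
    """Fraction of defect-free dies under a simple Poisson defect model."""
    return math.exp(-(area_mm2 / 100.0) * d0_per_cm2)

D0 = 0.1  # defects per cm^2, the TSMC 5 nm figure cited above

mono = poisson_yield(600, D0)     # one big 600 mm^2 monolithic die
chiplet = poisson_yield(100, D0)  # one 100 mm^2 chiplet

print(f"600 mm^2 monolithic die yield: {mono:.1%}")   # ~55%
print(f"100 mm^2 chiplet yield:        {chiplet:.1%}")  # ~90%

# A defect kills only the 100 mm^2 chiplet it lands on, not the
# whole 600 mm^2 design, so wafer area burned per good GPU drops:
print(f"wafer mm^2 per good GPU, monolithic: {600 / mono:.0f}")
print(f"wafer mm^2 per good GPU, 6 chiplets: {6 * 100 / chiplet:.0f}")
```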
In essence you are weighing the cost of wasted silicon compared to the added cost of a silicon interposer. I managed to find an estimate from 2018 which places the cost between $30 (for a medium sized chip) and $100 (for an interposer a multiple of reticle size stitched together): www.frost.com/frost-perspectives/overview-interposer-technology-packaging-applications/
AMD's desktop CPUs fall below that lower figure, and flagship GPUs (600 mm²+) likely sit above the middle, at $70-80. I would not be surprised if those costs have gone down for dumb interposers since that was published (not CoWoS though, which is in high demand).
Also consider that chiplets allow you to use older processes for certain parts of the chip for additional savings and you only have to design just the chiplets that will then be used in every SKU in your lineup, both of which will influence total cost to manufacture in a positive way.
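Putting the two posts together, a back-of-the-envelope comparison (the $17,000 per 5 nm wafer figure is my own assumption based on commonly cited estimates, not a number from either post; die sizes are the same illustrative 600 mm² vs six 100 mm² chiplets):

```python
import math

WAFER_COST = 17_000               # USD per 5 nm wafer -- assumed estimate
WAFER_AREA = math.pi * 150 ** 2   # mm^2 of a 300 mm wafer, ignoring edge loss
D0 = 0.1                          # defects per cm^2

def silicon_per_good_gpu(die_mm2, dies_needed=1):
    """Wafer area consumed per good GPU, Poisson yield model."""
    y = math.exp(-(die_mm2 / 100.0) * D0)
    return dies_needed * die_mm2 / y

mono = silicon_per_good_gpu(600)        # one big die
split = silicon_per_good_gpu(100, 6)    # six small chiplets
cost_per_mm2 = WAFER_COST / WAFER_AREA

saving = (mono - split) * cost_per_mm2
print(f"yield saving per good GPU: ${saving:.0f}")  # ~$100, vs the ~$70-80 interposer
```

And this still understates the chiplet advantage, since it ignores the extra savings from fabbing the MCDs on cheaper 6 nm.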
It's about making decisions.
Maybe AMD should take profit margins off the table and start thinking about the gamers instead?
GPUs for high-fps gaming are extremely latency-sensitive, so the latency penalty of a chiplet MCM is 100% a dealbreaker.
AMD hasn't solved/evolved the inter-chiplet latency well enough for them to be suitable for a real-time graphics pipeline yet, but that doesn't mean they won't.
$NVDA is priced as is because they provide both the hardware and software tools for AI companies to develop their products. OpenAI for example is a private corporation (similar to Valve), and AI is widely considered to be in its infancy. It's the one lesson not to mock a solid ecosystem.
It's simply a question of cost because low end parts need to be cheap, which means using expensive nodes for them makes absolutely zero sense.
I can confidently say that it's not happened in the entire history of AMD graphics cards, going back to the early ATi Mach cards, 35 years ago!
en.wikipedia.org/wiki/List_of_AMD_graphics_processing_units#Desktop_GPUs
Look at the column for the manufacturing node: the low end of each generation is always last year's product rebranded, or, if it's actually a new product rather than a rebrand, it's always on an older process node to save money.
So yes, please drop it. I don't know how I can explain it any more clearly to you. Low end parts don't get made on top-tier, expensive, flagship manufacturing nodes, because it's simply not economically viable. Companies aiming to make a profit will not waste their limited quantity of flagship node wafer allocations on low-end shit - that would be corporate suicide!
If Pirelli came across a super-rare, super-expensive, extra-sticky rubber but there was a limited quantity of the stuff - they could use it to make 1000 of the best Formula 1 racing tyres ever seen and give their brand a huge marketing boost and recognition, OR they could waste it making 5000 more boring, cheap, everyday tyres for commuter workhorse cars like your grandma's Honda Civic.
Saying never means you must have an alternative in mind? What is it? Making the RX 7600 on 6 nm for 20 more years?
www.anandtech.com/show/21371/tsmc-preps-lower-cost-4nm-n4c-process-for-2025
www.linkedin.com/pulse/tech-news-mature-process-node-wafer-foundry-prices-set-drop-tiroc
Nvidia left the door wide open this generation for AMD, and they are like, nah, we love being stuck at an insignificant % of the market. It's really the total opposite of how AMD handled Zen.
We need both these companies pushing each other to make better products, but if RDNA 5 is a bust like RDNA 3, I'm not sure CDNA can save the whole GPU side at AMD... Maybe we are just seeing the ceiling for an AMD-branded GPU regardless of how good a product AMD makes.
Who knows, maybe Nvidia will open the door even wider next generation. I've been hearing $1,200-ish for a 5080 that only offers 4090 performance with less VRAM, which would be a pretty terrible product.
Sure, that'll eventually happen. That's where N6 is right now - but it's not relevant to this discussion, is it?
Is it fine for AMD to get so few purchases from gamers? If so, then it's fine.
But it would mean that the niche market will not hold for many more years. No reason to upgrade.
It's so easy for AMD and Nvidia to figure out their competitor's minimum prices, given that they share the same chip manufacturer (TSMC), the same GDDR manufacturer (Samsung), and the same board partners (AIBs).
Who knows, perhaps Nvidia will charge higher margins next gen, just so Radeon can improve their terrible margins.