
AMD to Build Next-Gen I/O Dies on Samsung 4nm, Not TSMC N4P

This won't save you from link training times on AM5.


No, but I can believe they'd compete with clientside Intel which is spanking them in this area. It's not like I am asking for more cores, threads, pcie lanes or memory channels.

I'm not complaining. Honestly AM5 still does what I need. But I do see spots for improvement.
Maybe, hopefully with Zen 6 at least. I'm not getting AM5 anytime soon, and the way AMD is doing things these days they'll probably cede a lot of ground to Apple. The reason people still go with AMD is of course "PC" and the ability to upgrade with substantial improvements in, say, two generations. But if it's that much of a headache, then going with fruity loops is not a bad option. AMD is trashing a lot of the goodwill they garnered post-Zen rather quickly. Then there's QC and maybe Nvidia, which could compete with them on desktops in a year or two, so the window for AMD to eff things up is also insanely large.
 
This is not the kind of chip that needs a lot of cache, so the density will be very close to 4nm TSMC. Also, the data on 6nm is terribly wrong.
Just to illustrate how flawed the data in the article is:

7600 (6nm, 32MB cache) = 65MTr/mm²
4060 Ti (5nm custom, 32MB cache) = 121.8MTr/mm²
The transistor density of the 6nm client I/O die from "Raphael/Zen5" is approximately 27.87MTr/mm²

Samsung's 4nm process is already in its third or fourth iteration and is now very close to the competition in both yields and density. I see this as a positive move; Samsung is likely offering a significant discount, and AMD will have the opportunity to address one of Zen 5's biggest shortcomings, the IOD and Memory controller.
 
I wonder why AMD doesn't hire the bunch of random experts here to tell its engineers why they shouldn't use Samsung's node for its next IO die.

It's quite obvious both companies have a partnership, and this move would end up benefitting both whilst freeing up capacity at TSMC for the more advanced nodes.
 
AMD could also save tons of (marketing) money by not making so many BS claims & having eggs on their collective faces, eggs which they can ill afford right now :rolleyes:

Maybe the crowd here isn't as bad as you think!
 
The main goal of having a separate die for the I/O section is to save on cost for things that don't scale well with newer nodes. If AMD can get a new design for their I/O die and also produce it for cheaper, I see it as a win for them.

As for memory support, performance and all that, the key point will probably be whether they use the new fanout interconnect from Strix Halo (or what they use on RDNA3), or continue to use Infinity Fabric + SerDes like the current lineup. Just moving to the new fanout approach would remove the latency of serializing and deserializing the data over Infinity Fabric.

But as much as the hardware enthusiast in me would like to have the best I/O die possible that would scale to infinity and all that, in reality most people probably overstate the importance of the I/O die for real-world performance. The "crappy" Zen 4/5 I/O die is still in the best desktop CPU on the market.

Doing a design is always about tradeoffs. You can put in a bit more of this and that, but that will bring more of this and that. You win when you are able to get the best compromise. Performance is one thing, but production cost is another. And there are plenty of other tradeoffs to consider.

We will see, but I am not worried. Still intrigued about the CCD. I think that is what really matters anyway. Having a good core is crucial.
 
Another thing to consider: the futures of Samsung Foundry and Intel Foundry are both uncertain. If both fold, TSMC will have a monopoly. That's bad for AMD. Building an I/O die on an "inferior" Samsung node will come with almost no downsides, and it's a small deposit toward keeping a second foundry option available in the future.
 
All I want is more than 24 PCIe lanes. If AMD gives us that in the consumer space, they can fab their IOD from moon dust for all I care. You democratised cores with Zen AMD, now democratise PCIe lanes with Zen 6.
Zen6 IOD will most likely have standard upgrades:

1. new, faster Infinity Fabric PHY. It will be either a classic upgrade to 'GMI4' or a new solution used on the Strix Halo die
2. maybe more PCIe lanes, hopefully one extra x4, in total 32 lanes, just like the IOD on Zen2/3 had (not all lanes were enabled)
3. new and faster IMC, possibly ~6800/7200 MT/s
4. new iGPU with either 2CUs or 4CUs; RDNA4?
5. new VCN media engine with better encoders and codec support
6. new DCN display engine; native UHBR20 support for DP80 Gbps
7. integrated USB4 PHY (like in Strix Point) to free up x4 link for another SSD (Intel integrated two TB4 ports into Arrow Lake silicon)
8. NPU is unlikely on desktop IOD, but who knows

So, the best-case scenario for your PCIe needs might be an extra 4 or 8 lanes: a new x4 PHY and another x4 freed by integrating USB4 on the IOD. Quite decent.
They are not going to add more before transitioning to AM6, as the 1718-pin socket is already pretty crowded.

It is motherboard vendors that can distribute current lanes more smartly by using PCIe switch chips. x16 Gen5 lanes on GPU can be easily split into x8 on primary, x4 on secondary and x4 for another Gen5 SSD. This solution is available on X870E ProArt Creator from Asus. There is another solution that allows you to enable or disable USB4 in BIOS, allowing another x4 slot for SSD. Those are temporary solutions, but it's good to be aware of options for those who need it.
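
A minimal sketch of the lane arithmetic above, in Python. The function name and the layout labels are made up for illustration; the lane counts (an x16 link split into x8 + x4 + x4, and a best case of one new x4 PHY plus one x4 freed by on-die USB4) come from this post.

```python
def remaining_lanes(available: int, layout: dict[str, int]) -> int:
    """Return how many lanes are left after a proposed split.

    Raises ValueError if the layout asks for more lanes than exist.
    """
    used = sum(layout.values())
    if used > available:
        raise ValueError(f"layout needs {used} lanes, only {available} available")
    return available - used


# The x16 Gen5 GPU link split described in the post.
gpu_link = 16
layout = {"primary x8 slot": 8, "secondary x4 slot": 4, "Gen5 SSD x4": 4}
print(remaining_lanes(gpu_link, layout))  # lanes left over after the split

# Best case for extra lanes on a new IOD, per the post:
# one new x4 PHY plus one x4 freed by integrating USB4.
extra_best_case = 4 + 4
print(extra_best_case)
```

The split uses the whole x16 link, so nothing is left over; any additional device would have to come from the extra lanes a new IOD might bring.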
 
Here's a good one: fixed function and analog logic gates and their inability to benefit greatly from process density scaling.
Exactly this
 
Zen6 IOD will most likely have standard upgrades:
3. new and faster IMC, possibly ~6800/7200 MT/s

Should be 7500/8000 MT/s as lowest.

4. new iGPU with either 2CUs or 4CUs; RDNA4?
5. new VCN media engine with better encoders and codec support
6. new DCN display engine; native UHBR20 support for DP80 Gbps

Won't work. You need 8 or 16 CUs with the new architecture post-RDNA (a.k.a. "RDNA 5"). And a new VCN/DCN with 16K@120Hz support.
 
AMD and skimping on IO die. Name a better duo.
For consumer chips you mean? Because AMD Turin's IO die on TSMC N4 does 12 channels, all populated with registered DIMMs running at 6400 MT/s. It's probably the greatest IO die in the world.

Their consumer IO die comparatively sucks, and it's old by now. But the focus shouldn't be on density or what nm it's fabbed in; the analog circuitry and fixed-function units don't scale down well at all. Performance-wise, I'm pretty sure there's not much difference going from TSMC N6 to N4 for an IO die. It's just the nature of these chips. Sure, the GPU is in there as well, but it's tiny and inconsequential in these desktop chips.

They were never going to use TSMC 4nm for IO dies; it's a colossal waste and they need N4 fab capacity for Instinct parts. But yeah, since the performance of an IO die doesn't really depend much on the process node, this shouldn't be much of an issue if they design it right. Turin gives some hope. Pretty sure they did performance comparisons between Samsung 4nm and TSMC N6, and (hopefully) chose the better option performance-wise.
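
For scale, here is a rough sketch of the theoretical peak bandwidth behind the "12 channels at 6400" figure. It assumes a 64-bit (8-byte) data path per DDR5 channel and ignores ECC bits and real-world efficiency; the function name is made up for illustration.

```python
def peak_bandwidth_gbs(mt_per_s: int, channels: int, bytes_per_transfer: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    mt_per_s is the transfer rate in MT/s; each transfer is assumed to move
    bytes_per_transfer bytes per channel (64-bit data path, an assumption).
    """
    return mt_per_s * bytes_per_transfer * channels / 1000


# 12 channels at 6400 MT/s, as quoted in the post.
print(peak_bandwidth_gbs(6400, 12))

# A dual-channel desktop at 6400 MT/s, for comparison.
print(peak_bandwidth_gbs(6400, 2))
```

The server part gets its headroom from channel count rather than per-channel speed, which fits the point that node density is not what matters most for an IO die.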
 
Should be 7500/8000 MT/s as lowest.
Not at all. Official support will not be that high. This never happens. With EXPO, a sweet spot might be towards ~8000.
Won't work. You need 8 or 16 CUs with the new architecture post-RDNA (a.k.a. "RDNA 5").
This is unlikely to happen, unless they want to scrap the lowest-tier GPUs and give desktop users an iGPU for basic 1080p gaming. In all other cases, a desktop iGPU does not need as many CUs. G SKUs will have a powerful iGPU, X CPUs a more basic one.
And a new VCN/DCN with 16K@120Hz support.
I said UHBR20. Research it and learn what that means.
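
A back-of-the-envelope sketch of why UHBR20 and 16K@120Hz are not the same thing. Assumptions, none of which are from the thread: "16K" means 15360x8640, colour depth is 24 bits per pixel, blanking and link encoding overhead are ignored, and UHBR20 is taken as 4 lanes at 20 Gbit/s each (the 80 Gbps figure mentioned earlier). The function name is made up for illustration.

```python
def uncompressed_gbps(width: int, height: int, refresh_hz: int, bits_per_pixel: int) -> float:
    """Raw pixel data rate in Gbit/s, ignoring blanking intervals."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


# UHBR20 raw link rate: 4 lanes x 20 Gbit/s.
uhbr20_raw_gbps = 4 * 20

# Assumed 16K resolution at 120 Hz, 8 bits per colour channel.
needed = uncompressed_gbps(15360, 8640, 120, 24)

print(uhbr20_raw_gbps)
print(round(needed, 1))
print(round(needed / uhbr20_raw_gbps, 2))  # how many times over the raw link rate
```

Under these assumptions the uncompressed stream is several times larger than the raw link rate, so such a mode would depend on compression rather than on the display engine supporting UHBR20 alone.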
 
Not at all. Official support will not be that high. This never happens. With EXPO, a sweet spot might be towards ~8000.

You research the AMD DDR5 support in 2024. I will give you a hint:

[Attached image: 1739572973757.png]



This is unlikely to happen, unless they want to scrap the lowest-tier GPUs and give desktop users an iGPU for basic 1080p gaming. In all other cases, a desktop iGPU does not need as many CUs. G SKUs will have a powerful iGPU, X CPUs a more basic one.

That's the goal. AMD needs to become competitive. Innovation and better products are the way to achieve this.
 
Zen 6 IOD is supposed to be Ryzen beast mode.

Main reason why I grabbed a cheap AM5 board while I could :)
 
You research the AMD DDR5 support in 2024. I will give you a hint:
You don't need to give me any hint. Strix APUs support LPDDR5X up to 8000 MT/s, but this is in a different mobility package.
Official support on Zen5 desktop is 5600 and EXPO sweet spot is 6000. EPYCs are validated for 6000. Nothing more to say about this.
AMD needs to become competitive. Innovation and better products are the way to achieve this.
They are already way more competitive than before.

Here is an approximation of alleged Zen6 IOD and brand new 12-core chiplets from a fresh leak.
It's said Zen6 is 100% on AM5.

[Attached images: AMD Zen6 Medusa Ridge.png, AMD Zen6 Medusa Point.png]

 
Just to illustrate how flawed the data in the article is:

7600 (6nm, 32MB cache) = 65MTr/mm²
4060 Ti (5nm custom, 32MB cache) = 121.8MTr/mm²
The transistor density of the 6nm client I/O die from "Raphael/Zen5" is approximately 27.87MTr/mm²
Density is not a fixed metric; the one in the article is an approximation for a high-density library, aka the best-case scenario.
GPUs use high-performance libraries, which trade density for performance.

Since N6 is usually used for budget chips, with a quick search I couldn't find a smartphone chip with a known transistor count and die size, so I'll use N7 instead.

Apple A12 = TSMC N7 = 6.9B transistors @ 83.27mm² = 82.9 MTr/mm².
TSMC says that N6 is ~18% more dense than N7, so the value in the article is not unrealistic.
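
The arithmetic above can be checked with a short sketch. The A12 figures and the ~18% scaling factor are the ones quoted in this post, and the 65 and 27.87 MTr/mm² values are the ones quoted earlier in the thread; the function name is made up for illustration.

```python
def density_mtr_per_mm2(transistors: float, die_area_mm2: float) -> float:
    """Transistor density in millions of transistors per mm^2."""
    return transistors / 1e6 / die_area_mm2


# Apple A12 on TSMC N7, figures as quoted in the post.
a12_density = density_mtr_per_mm2(6.9e9, 83.27)

# Scale by the ~18% density gain quoted for N6 over N7.
n6_estimate = a12_density * 1.18

print(round(a12_density, 1))
print(round(n6_estimate, 1))

# Densities quoted earlier in the thread, as a fraction of that estimate.
for name, quoted in {"RX 7600": 65.0, "Raphael client IOD": 27.87}.items():
    print(name, round(quoted / n6_estimate, 2))
```

The gap between the estimate and the quoted real-world chips shows how much the result depends on cell library and on how much of the die is analog or I/O rather than dense logic.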
 
As I exemplified above with the comparison of comparable chips in terms of logic and cache, the data is way out of line in this context.

We know that Samsung's Exynos 2400 (4nm LPP+) is very similar in size to its competitor SD8Gen2, which uses N4P TSMC.
 
TSMC 6nm is not used for any existing hardware anymore, the last produced hardware is part of RX7000 and Intel's Alchemist, but both are no longer manufactured
Well, they can't just stay at N6 can they? That's the point. What's next to N6?
 