Tuesday, September 22nd 2020

AMD Radeon "Navy Flounder" Features 40CU, 192-bit GDDR6 Memory

AMD uses offbeat codenames such as the "Great Horned Owl," "Sienna Cichlid" and "Navy Flounder" to identify sources of leaks internally. One such upcoming product, codenamed "Navy Flounder," is shaping up to be a possible successor to the RX 5500 XT, the company's 1080p segment-leading product. According to ROCm compute code fished out by stblr on Reddit, this GPU is configured with 40 compute units, a step up from 22 on the RX 5500 XT, and has a 192-bit wide GDDR6 memory interface.

Assuming the RDNA2 compute unit on next-gen Radeon RX graphics processors has the same number of stream processors per CU, we're looking at 2,560 stream processors for the "Navy Flounder," compared to 5,120 across the 80 CUs of "Sienna Cichlid." The 192-bit wide memory interface allows a high degree of segmentation for AMD's product managers for graphics cards under the $250-mark.
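
The stream processor estimate above is simple multiplication, sketched here under the article's own assumption that RDNA2 keeps 64 stream processors per compute unit, as RDNA1 does. The function name is illustrative, and the 80 CU figure for "Sienna Cichlid" is the rumored count, not a confirmed specification.

```python
# Assumption (from the article): RDNA2 keeps RDNA1's 64 stream processors per CU.
STREAM_PROCESSORS_PER_CU = 64


def stream_processors(compute_units, per_cu=STREAM_PROCESSORS_PER_CU):
    """Total stream processors for a GPU with the given compute unit count."""
    return compute_units * per_cu


if __name__ == "__main__":
    # CU counts as reported in the leak; neither is an official specification.
    for name, cus in (("Navy Flounder", 40), ("Sienna Cichlid", 80)):
        print(f"{name}: {cus} CUs -> {stream_processors(cus):,} stream processors")
```
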
Sources: VideoCardz, stblr (Reddit)

135 Comments on AMD Radeon "Navy Flounder" Features 40CU, 192-bit GDDR6 Memory

#51
Vayra86
You made me change my avatar. This fits just right I think
#52
ratirt
Vayra86
You made me change my avatar. This fits just right I think
Add cumin in the pick and it's a TRIO :P
#53
-The_Mask-
Valantar
That depends on pricing as well as performance. If this is indeed a chip for upper entry level and lower midrange GPUs, 12GB might simply be too expensive. Even with cheaper, lower clocked RAM chips another 6GB is an additional $50-80 BOM cost. That eats a lot of margin for a <$300 GPU. So in the end, the total amount of memory is also dependent on pricing. It might be that the highest end SKU based off this chip will have 12GB, but 6GB for lower end versions is going to represent a significant savings - and likely not hold the card back at all at the types of resolutions it will mostly be used for. Of course there could also be cut-down SKUs with 4GB and 8GB lower down the stack.
The price difference won't be that big, in both cases you need six GDDR6 chips. With 6GB you can use six 8Gb chips, with 12GB you need six 16Gb chips. This also isn't a lower midrange graphics card. The performance should be a bit higher than half of an RTX 3080 and a bit above an RX 5700 XT, that's not really low midrange. I believe that a graphics card with specifications like this is probably not sub 300 dollars, but most likely 300 dollars.
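
The chip count and capacities in this post follow from the bus width. A minimal sketch, assuming one GDDR6 package per 32 bits of bus (no clamshell mode); the function name is illustrative:

```python
BITS_PER_GDDR6_CHIP = 32  # each GDDR6 package exposes a 32-bit interface


def memory_config(bus_width_bits, chip_density_gbit):
    """Return (chip count, capacity in GB) for a given bus width and chip density.

    Assumes one chip per 32 bits of bus, i.e. no clamshell configuration.
    """
    chips = bus_width_bits // BITS_PER_GDDR6_CHIP
    capacity_gbyte = chips * chip_density_gbit / 8  # 8 Gbit = 1 GB
    return chips, capacity_gbyte


if __name__ == "__main__":
    for density in (8, 16):
        chips, capacity = memory_config(192, density)
        print(f"192-bit bus, {density}Gb chips: {chips} chips, {capacity:g}GB")
```

Either way a 192-bit bus needs six packages; only the density per package changes between the 6GB and 12GB configurations.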
#54
Xex360
Camm
AMD's cache designs must be fan fucking tastic to think a non-x, GDDR6, limited width bus card is going to compete against the 3000 stack well.
Radeon VII has higher bandwidth than even the 3090 (using the "fastest graphic memory"), yet the 5700 XT with much less bandwidth is close to it or beats it.
That being said, AMD could launch something really expensive for prosumers with HBM2; on the plus side, they could make fun of Micron's false claims about GDDR6X.
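
For reference, the bandwidth comparison can be reproduced from bus width and per-pin data rate. This is a rough sketch using the published memory specifications of the three cards; the function name is illustrative:

```python
def bandwidth_gbyte_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bus width (bits) x per-pin rate (Gbps) / 8."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, per-pin data rate in Gbps), from published card specifications
CARDS = {
    "Radeon VII (HBM2)": (4096, 2.0),
    "RTX 3090 (GDDR6X)": (384, 19.5),
    "RX 5700 XT (GDDR6)": (256, 14.0),
}

if __name__ == "__main__":
    for name, (width, rate) in CARDS.items():
        print(f"{name}: {bandwidth_gbyte_s(width, rate):g} GB/s")
```
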
#55
dragontamer5788
Xex360
Camm
AMD's cache designs must be fan fucking tastic to think a non-x, GDDR6, limited width bus card is going to compete against the 3000 stack well.
Radeon VII has higher bandwidth than even the 3090 (using the "fastest graphic memory"), yet the 5700 XT with much less bandwidth is close to it or beats it.
While the RDNA L0 / L1 / L2 cache structure is clearly superior to the Vega cache architecture (L1 / L2)... I don't think that's what @Camm is talking about here.

I think Camm is talking about the rumored "cache chip", a L4 128MB or 64MB SRAM cache that's rumored to be on NAVI. I don't know where that rumor came from... but the general gist is that everyone's looking at 192-bit bus or 128-bit bus designs thinking AMD has an "ace in the hole" or something, to compete against 3080 or 3070 chips.

With regards to the L4 cache rumors: a fast SRAM may be good for framebuffer operations, but most operations would be going to VRAM anyway (because 4k or 8k textures are so damn big... as is the vertex data). I can't imagine a 64MB or 128MB cache really helping these intense games. But I'm willing to be proven wrong on the issue. I'll think about the issue more if the "super-large cache" rumors are confirmed.
#56
Xex360
dragontamer5788
While the RDNA L0 / L1 / L2 cache structure is clearly superior to the Vega cache architecture (L1 / L2)... I don't think that's what @Camm is talking about here.

I think Camm is talking about the rumored "cache chip", a L4 128MB or 64MB SRAM cache that's rumored to be on NAVI. I don't know where that rumor came from... but the general gist is that everyone's looking at 192-bit bus or 128-bit bus designs thinking AMD has an "ace in the hole" or something, to compete against 3080 or 3070 chips.

With regards to the L4 cache rumors: a fast SRAM may be good for framebuffer operations, but most operations would be going to VRAM anyway (because 4k or 8k textures are so damn big... as is the vertex data). I can't imagine a 64MB or 128MB cache really helping these intense games. But I'm willing to be proven wrong on the issue. I'll think about the issue more if the "super-large cache" rumors are confirmed.
Blind me didn't read cache...
#57
BoboOOZ
dragontamer5788
I think Camm is talking about the rumored "cache chip", a L4 128MB or 64MB SRAM cache that's rumored to be on NAVI. I don't know where that rumor came from... but the general gist is that everyone's looking at 192-bit bus or 128-bit bus designs thinking AMD has an "ace in the hole" or something, to compete against 3080 or 3070 chips.
RedGamingTech and Moore's Law Is Dead leaked about the cache:

[embedded videos]

dragontamer5788
With regards to the L4 cache rumors: a fast SRAM may be good for framebuffer operations, but most operations would be going to VRAM anyway (because 4k or 8k textures are so damn big... as is the vertex data). I can't imagine a 64MB or 128MB cache really helping these intense games. But I'm willing to be proven wrong on the issue. I'll think about the issue more if the "super-large cache" rumors are confirmed.
Textures are not required for intensive calculations; what a cache is used for is storing values that need to be accessed multiple times for multiple calculations.
It might be misinformation from AMD, but it's not completely crazy, especially if you listen to Mark Cerny's diatribe in the second video.
#58
Valantar
-The_Mask-
The price difference won't be that big, in both cases you need six GDDR6 chips. With 6GB you can use six 8Gb chips, with 12GB you need six 16Gb chips. This also isn't a lower midrange graphics card. The performance should be a bit higher than half of an RTX 3080 and a bit above an RX 5700 XT, that's not really low midrange. I believe that a graphics card with specifications like this is probably not sub 300 dollars, but most likely 300 dollars.
... double density chips have silicon dice inside that are essentially twice the size. There's no other way of getting 2x the bit capacity, after all. So the pricing difference between two chips or one double density chip is trivial. The only savings is that you don't need to design a new PCB, which I never said they did either.

And, as I said on the previous page, we don't know anything about the actual configurations of retail SKUs here. While I also expect a top-end 40CU card with no power or clock limits to exceed the 5700 XT - it's the same CU count on an updated arch and node, after all - that doesn't tell us anything at all about cut down versions, power targets, clock speeds, etc. And given the massive perf/$ jump we've (finally!) seen with Ampere, pricing for a 5700 XT equivalent necessarily moves downwards by quite a bit. Lower midrange - below $300, hopefully closer to $200 - sounds about right. That would certainly represent a return to the previous norm of performance/$ actually increasing each generation, rather than the stagnation we've seen in later years. So it's entirely possible that AMD chooses to present that 40CU GPU as a successor to the 5600 XT, as their "high fps 1080p gaming" card (despite it being perfectly capable of mid-to-high settings 1440p) and thus make 6GB of VRAM a completely reasonable amount (barring any cache shenanigans, of course).
#59
-The_Mask-
Valantar
Lower midrange - below $300, hopefully closer to $200 - sounds about right. That would certainly represent a return to the previous norm of performance/$ actually increasing each generation, rather than the stagnation we've seen in later years.
The performance/dollar of an RX 5700 XT and RTX 3080 is quite similar, it's just a bit better for the RTX 3080. That means you're actually expecting AMD to double the performance/dollar compared to an RX 5700 XT and RTX 3080. That sounds more like a wish than something realistic.
#60
sergionography
Valantar
Where are you getting your baseline perf/W number from? The 5700 XT, 5700, 5600 XT or 5500 XT? Depending on which you pick, the results of your calculation will vary wildly - the 5700 XT is around 1070 Ti levels, with the 2080 and 2080 Ti ~17% ahead, while the 5600 XT is 3-4% ahead of all of those. "The efficiency of RDNA" is in other words highly dependent on its implementation.

Where did you get that TDP number from? Has it been published or confirmed anywhere? The only number I've seen is the whole-system power rating from the official spec sheet, which is 350/340W for the whole system (BD/discless). Even assuming that is peak draw numbers including PSU losses, having the SoC only account for 50% of that number seems low.
350w power supply /= TDP
Remember power supplies don't run on 100% efficiency.
also here's a link from techpowerup
www.techpowerup.com/gpu-specs/playstation-5-gpu.c3480

And in case you believe that one isn't confirmed then take a look at this link with ps4 pro
www.techpowerup.com/gpu-specs/playstation-4-pro-gpu.c2876

ps4 pro has a 310watt power supply and its SoC is rated at 150w tdp

as far as rumors go, I've been hearing that sony is keeping power usage around the same as last gen, so if anything it could be as low as 150watt, though I think it's more likely to be around the 175w mark
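
The comparison being drawn here is the ratio of SoC TDP to power supply rating. A small sketch using only the figures quoted in this thread (the 175 W PS5 value is the rumored number under discussion, not a confirmed specification):

```python
def soc_share_of_psu(soc_tdp_w, psu_rating_w):
    """Fraction of the power supply's rating accounted for by the SoC's TDP."""
    return soc_tdp_w / psu_rating_w


if __name__ == "__main__":
    # (SoC TDP in W, PSU rating in W) as quoted in the thread.
    consoles = {
        "PS4 Pro": (150, 310),
        "PS5 (rumored TDP)": (175, 350),
    }
    for name, (tdp, psu) in consoles.items():
        print(f"{name}: SoC is {soc_share_of_psu(tdp, psu):.0%} of the PSU rating")
```

The two ratios come out close to each other, which shows the rumored figure is consistent with last generation but does not confirm it.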
#61
Xex360
Valantar
And given the massive perf/$ jump we've (finally!) seen with Ampere
Where? The cards don't cost MSRP except in the US; elsewhere they cost much more. Plus, Ampere is more expensive than Pascal: the 1070 was faster and had more memory than the 980 Ti for less than $400, while now the 3070 is supposedly faster than the 2080 Ti (I wouldn't be surprised if it's only in some cases), with less memory.
nVidia managed to fool people by comparing prices to Turing, and let's not forget their lies about MSRP.
#62
Camm
Coming back to the actual cards themselves, Random twitter user dropped that the cards are back on HBM2e.

[MEDIA=twitter]1308699315804798976[/MEDIA]
Now, a lot of leaks come from random people with dumpster accounts rather than messaging one of the 'leaker' channels, but this one has fuck all provenance.

Buuuuuut. There could be some truth to the rumour. Namely, by adjusting your stack size you could theoretically cut down on memory price, whilst still achieving better thermals, latency and even bandwidth. The better thermal argument also might make sense considering how costly those ampere coolers likely are (making a switch to HBM2e possibly BOM neutral or advantageous).
HBM2e is rated up to twelve stack, but I do believe the highest product is still 8 stacks atm.

192 Channel Bit = 6 stack HBM2e, this would give up to 12GB of memory and 345 GB/s per stack

256 Channel Bit = 8 Stack HBM2e, giving up to 16GB and 460 GB/s per stack.
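
One way to sanity-check these figures, as a rough sketch only: it assumes "6 stack" and "8 stack" mean stack heights in dies, 16Gb dies, and a 3.6 Gbps pin rate on the standard 1024-bit HBM stack interface. None of these assumptions come from the tweet itself.

```python
HBM_STACK_INTERFACE_BITS = 1024  # interface width of a single HBM2/HBM2e stack
DIE_DENSITY_GBIT = 16            # assumption: 16Gb DRAM dies
PIN_RATE_GBPS = 3.6              # assumption: fastest announced HBM2e pin rate


def stack_capacity_gbyte(dies_high, die_density_gbit=DIE_DENSITY_GBIT):
    """Capacity in GB of one stack that is `dies_high` dies tall."""
    return dies_high * die_density_gbit / 8


def stack_bandwidth_gbyte_s(pin_rate_gbps=PIN_RATE_GBPS,
                            interface_bits=HBM_STACK_INTERFACE_BITS):
    """Peak bandwidth in GB/s of one stack."""
    return interface_bits * pin_rate_gbps / 8


if __name__ == "__main__":
    for height in (6, 8):
        print(f"{height}-high stack: {stack_capacity_gbyte(height):g}GB")
    print(f"Per-stack bandwidth: {stack_bandwidth_gbyte_s():.1f} GB/s")
```

Under those assumptions the 12GB, 16GB and 460 GB/s numbers line up. The 345 GB/s figure is three quarters of 460 GB/s; how the tweet arrives at it is not stated.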

Now, do you believe this, or that AMD has magical cache technology that lets it compete at 3080 levels with a significantly more narrow bus?
#63
BoboOOZ
Camm
Now, do you believe this, or that AMD has magical cache technology that lets it compete at 3080 levels with a significantly more narrow bus?
Magic always wins in my book.

Seriously, I'm pretty sure AMD is playing us big time, and all the leaks are either directly from them or from fooled AIB partners.
In any case, I'm pretty sure they are much smarter than the average forumite, and they know they cannot disrupt the market with 256 bit bus gddr6.
#64
Vayra86
Camm
Coming back to the actual cards themselves, Random twitter user dropped that the cards are back on HBM2e.

[MEDIA=twitter]1308699315804798976[/MEDIA]
Now, a lot of leaks come from random people with dumpster accounts rather than messaging one of the 'leaker' channels, but this one has fuck all provenance.

Buuuuuut. There could be some truth to the rumour. Namely, by adjusting your stack size you could theoretically cut down on memory price, whilst still achieving better thermals, latency and even bandwidth. The better thermal argument also might make sense considering how costly those ampere coolers likely are (making a switch to HBM2e possibly BOM neutral or advantageous).
HBM2e is rated up to twelve stack, but I do believe the highest product is still 8 stacks atm.

192 Channel Bit = 6 stack HBM2e, this would give up to 12GB of memory and 345 GB/s per stack

256 Channel Bit = 8 Stack HBM2e, giving up to 16GB and 460 GB/s per stack.

Now, do you believe this, or that AMD has magical cache technology that lets it compete at 3080 levels with a significantly more narrow bus?
If it has a waifu pic you can safely ignore it and move on. This one is textbook, it even has broken english, a cliffhanger for the next tweet, and poses a question it can't answer. Oh my the suspense :)

These days asking a difficult question is enough for attention it seems. In the land of the blind and those with no life... And in the meantime, everyone can just flee into whatever they like to believe best. Wishful thinking...
#65
Valantar
-The_Mask-
The performance/dollar of an RX 5700 XT and RTX 3080 is quite similar, it's just a bit better for the RTX 3080. That means you're actually expecting AMD to double the performance/dollar compared to an RX 5700 XT and RTX 3080. That sounds more like a wish than something realistic.
You sound like you're arguing that perf/$ scales linearly across GPU lineups. This has never been the case. Except for the very low end, where value is always terrible, you always get far more bang for your buck in the $150-400 mid-range than anything above. I mean, just look at the data:

Results are similar at other resolutions, though more expensive GPUs "improve" at 4k. As for value increasing as you drop in price: just look at the 5600 XT compared to the 5700 XT in those charts. Same architecture, same generation, same die, yet the cheaper card delivers significantly higher perf/$ than the more expensive one.

As for your comparison: you're comparing across generations, so any value comparison is inherently skewed to the point of being invalid. If it weren't the case that you got more bang for your buck in a new generation, something would be very wrong. As it admittedly has been for a couple of generations now, and the 3080 hasn't fixed it either, just brought us back closer to where things should be. It is absolutely to be expected that all future GPUs from both manufacturers at lower points in their product stacks will deliver significantly better perf/$ than the 3080. That's the nature of premium products: you pay more for the privilege of having the best. $700 GPUs have never been a good value proposition.
Xex360
Where? The cards don't cost MSRP except in the US; elsewhere they cost much more. Plus, Ampere is more expensive than Pascal: the 1070 was faster and had more memory than the 980 Ti for less than $400, while now the 3070 is supposedly faster than the 2080 Ti (I wouldn't be surprised if it's only in some cases), with less memory.
nVidia managed to fool people by comparing prices to Turing, and let's not forget their lies about MSRP.
We're literally a week post launch. Demand has been crazy, scalpers with bots have run rampant, and everything is sold out. Give it a while to normalize before you comment on "real-world" prices. And while Nvidia's previous "here's the MSRP, here's the somehow premium, but also base-line FE card at a premium price" shenanigans and the near-nonexistence of cards at MSRP were real problems, to be fair to them they seem to have stopped doing that this time around.
sergionography
350w power supply /= TDP
Right. As if I ever said that. Maybe actually read my post? Nice straw man you've got there.
sergionography
Remember power supplies don't run on 100% efficiency.
Again: accounted for, if you had bothered to actually read my post.

Let me refresh your memory:
Valantar
Where did you get that TDP number from? Has it been published or confirmed anywhere? The only number I've seen is the whole-system power rating from the official spec sheet, which is 350/340W for the whole system (BD/discless). Even assuming that is peak draw numbers including PSU losses, having the SoC only account for 50% of that number seems low.
What I'm saying here is that the SoC TDP only accounting for 50% of the PSU's rating, which might include PSU losses due to efficiency sounds a bit low. I'm asking you to source a number that you're stating as fact. Remember, you said:
sergionography
And since you mentioned ps5 think of this, ps5 tdp is 175w
No source, no mention that this is speculation or anything other than established fact. Which is what I was asking you to provide.
sergionography
also here's a link from techpowerup
www.techpowerup.com/gpu-specs/playstation-5-gpu.c3480
Unsourced, based on rumors and speculation. The page says as much.
sergionography
And in case you believe that one isn't confirmed then take a look at this link with ps4 pro
www.techpowerup.com/gpu-specs/playstation-4-pro-gpu.c2876

ps4 pro has a 310watt power supply and its SoC is rated at 150w tdp
That is at least a correlation, but correlation does not mean that the rumor you are quoting as fact is actually true. This is what you call speculation.
sergionography
as far as rumors go, I've been hearing that sony is keeping power usage around the same as last gen, so if anything it could be as low as 150watt, though I think it's more likely to be around the 175w mark
And here is the core of the matter: you are repeating rumors and "what you think is likely" as if it is indisputable fact. It is of course entirely possible that the PS5 SoC has a TDP somewhere around 175W - but you don't have any actual proof of this. So please stop repeating rumors as if they are facts. That is a really bad habit.
#66
InVasMani
Camm
Coming back to the actual cards themselves, Random twitter user dropped that the cards are back on HBM2e.

[MEDIA=twitter]1308699315804798976[/MEDIA]
Now, a lot of leaks come from random people with dumpster accounts rather than messaging one of the 'leaker' channels, but this one has fuck all provenance.

Buuuuuut. There could be some truth to the rumour. Namely, by adjusting your stack size you could theoretically cut down on memory price, whilst still achieving better thermals, latency and even bandwidth. The better thermal argument also might make sense considering how costly those ampere coolers likely are (making a switch to HBM2e possibly BOM neutral or advantageous).
HBM2e is rated up to twelve stack, but I do believe the highest product is still 8 stacks atm.

192 Channel Bit = 6 stack HBM2e, this would give up to 12GB of memory and 345 GB/s per stack

256 Channel Bit = 8 Stack HBM2e, giving up to 16GB and 460 GB/s per stack.

Now, do you believe this, or that AMD has magical cache technology that lets it compete at 3080 levels with a significantly more narrow bus?
If that's the case, and AMD does indeed have three tiers of chips like I theorized is a possibility, say it segmented them into tiers of 192/256/384-bit memory buses as well. Let's say they did use HBM2e: the 384-bit bus version, if it used 12 stacks, would have 24GB and 690GB/s per stack, which is wild to even think about. I think that's really a bit overkill on the bandwidth though, and not the greatest cost compromise either. I actually think 192-bit with 3 stacks for 6GB, 256-bit with 6 stacks for 12GB, and 384-bit with 9 stacks for 18GB would be more realistic. The lowest end chip even with 6GB wouldn't be bad at all if it was HBM2e based on a 192-bit bus. It would be more cost effective and better on thermals, both of which would help keep costs and TDP lower. If they utilized HBM2e I think that makes sense, but if they use GDDR6 or GDDR6X things obviously change in terms of what's practical in cost, performance, and TDP compromises. Really, with HBM they can get away with less density due to the increased bandwidth, especially if they also supplemented it with HBCC and perhaps some flash storage.
#67
-The_Mask-
Valantar
You sound like you're arguing that perf/$ scales linearly across GPU lineups.
Obviously not, as the RX 5700 XT is literally in your quote. Like I said, wishful thinking. There is no reason to expect double the performance/dollar on a similar production process, with similar specs. That's only possible if the RX 5700 XT was really overpriced; if that isn't the case, it just isn't realistic.
#68
Caring1
Camm
Random twitter user dropped that the cards are back on HBM2e.
Lol
A Twitter user is on par or lower than a Reddit user.
I like my facts to be real.
#69
Aquinus
Resident Wat-man
btarunr
this GPU is configured with 40 compute units,
Sounds like a full power version of the Radeon Pro 5600M, but with GDDR6 instead of HBM2.
#70
Xex360
Valantar
We're literally a week post launch. Demand has been crazy, scalpers with bots have run rampant, and everything is sold out. Give it a while to normalize before you comment on "real-world" prices. And while Nvidia's previous "here's the MSRP, here's the somehow premium, but also base-line FE card at a premium price" shenanigans and the near-nonexistence of cards at MSRP were real problems, to be fair to them they seem to have stopped doing that this time around.
I'm not talking about official prices around the world, the card costs between 40 to 300$ more in Europe and Asia for no reason.
#71
Camm
Caring1
Lol
A Twitter user is on par or lower than a Reddit user.
I like my facts to be real.
Point me to where the facts are when dealing with prerelease products?
#72
Valantar
-The_Mask-
Obviously not, as the RX 5700 XT is literally in your quote. Like I said, wishful thinking. There is no reason to expect double the performance/dollar on a similar production process, with similar specs. That's only possible if the RX 5700 XT was really overpriced; if that isn't the case, it just isn't realistic.
The 5700 XT is absolutely priced quite high for what you get - just as all GPUs have been for a few years now, so expecting its equivalent in performance to drop very noticeably in price now that there's a new generation incoming with seemingly significant performance increases is entirely reasonable. If the $699 3080 delivers more performance than the $1200 2080 Ti, and the $499 3070 delivers the same or slightly less, perhaps more like the $699 2080 Super, what do you think will happen at $300-400? There will obviously be a notable drop here as well. As large of one? No, simply because the lower priced options were already higher on the perf/$ curve. But if $499 suddenly gets you $699-1200 performance, then $400 obviously can't be half of that. Even $250 delivering half the performance of $500 would be poor value when compared to historical GPU pricing. So it's very clear that any GPU close to the 5700 XT in performance launching this generation needs to be significantly cheaper than it is - otherwise, nobody would buy it.

Besides, as I said, moving down from the 5700 XT already got you significant gains in perf/$, meaning every performance segment needs to shift further downwards. All of these products are competing in the same market, after all.
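
To make the value argument concrete, here is a rough sketch of the perf/$ arithmetic. The prices are the ones quoted in this post; treating the 3070 and 2080 Ti as equal in performance is the post's premise, not a benchmark result, and the function names are illustrative.

```python
def perf_per_dollar(relative_perf, price_usd):
    """Relative performance points per dollar."""
    return relative_perf / price_usd


def price_for_same_value(target_perf, reference_perf, reference_price_usd):
    """Price at which a slower card merely matches the reference card's perf/$."""
    return target_perf * reference_price_usd / reference_perf


if __name__ == "__main__":
    # Premise from the post: a $499 RTX 3070 roughly matches a $1200 RTX 2080 Ti.
    # The value 100 is an arbitrary index, not measured data.
    old = perf_per_dollar(100, 1200)
    new = perf_per_dollar(100, 499)
    print(f"Perf/$ improvement at equal performance: {new / old:.2f}x")

    # Half the performance at a linearly scaled price is only break-even value.
    print(f"Break-even price for half the performance: "
          f"${price_for_same_value(50, 100, 499):.2f}")
```

The second figure is the point of the "$250 for half of $500" remark: a cheaper card priced at exactly linear scaling offers no value advantage, while mid-range cards have historically offered more.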
Xex360
I'm not talking about official prices around the world, the card costs between 40 to 300$ more in Europe and Asia for no reason.
Did my post indicate I was only talking about official prices? As I said: everything is sold out except for scalpers, demand is crazy high, and scalpers are using bots to buy everything they can. Of course prices are artificially inflated. But if we wait a bit for supply to improve, those prices will inevitably come down. You're arguing on the basis of a short-term fluke and making broad, sweeping points based on it. The average price of a 3080 in the future obviously won't be $699 - that's the baseline price, after all - but there will be cards available at and/or close to that price point once supply overtakes demand.
#73
-The_Mask-
You should look up the difference in cost compared to a couple of years ago. Over the last 5 years we've already seen a quadrupled price per wafer at TSMC.
#74
Xex360
Valantar
Did my post indicate I was only talking about official prices? As I said: everything is sold out except for scalpers, demand is crazy high, and scalpers are using bots to buy everything they can. Of course prices are artificially inflated. But if we wait a bit for supply to improve, those prices will inevitably come down. You're arguing on the basis of a short-term fluke and making broad, sweeping points based on it. The average price of a 3080 in the future obviously won't be $699 - that's the baseline price, after all - but there will be cards available at and/or close to that price point once supply overtakes demand
You still didn't understand my point. I'm talking about the MSRP, not the current prices due to low supply: the official price is set higher by nVidia for all markets compared to the US and Canada, sometimes by stupid margins, even in Taiwan, so the price/performance of the card is very poor.
#75
Valantar
-The_Mask-
You should look up the difference in cost compared to a couple of years ago. Over the last 5 years we've already seen a quadrupled price per wafer at TSMC.
Yes, that is true. At the same time chip densities have increased dramatically, allowing for the same performance level at much smaller die sizes. Besides, more performance also necessitates a bigger die, so it's not like this disproportionately hurts the value of low performance GPUs. The same value scaling should apply regardless of wafer costs, though absolute pricing might of course change due to this.