Monday, March 1st 2021

AMD "Zen 4" Microarchitecture to Support AVX-512

The next-generation "Zen 4" CPU microarchitecture powering AMD's 4th Gen EPYC "Genoa" enterprise processors will support 512-bit AVX instruction sets, according to an alleged company slide leaked to the web on the ChipHell forums. The slide references "AVX3-512" support in addition to BFloat16 and "other ISA extensions." This would make "Zen 4" the first AMD microarchitecture to support AVX-512. It remains to be seen which specific instructions the architecture supports, and whether all of them are available to both the enterprise and client implementations of "Zen 4," or whether AMD would take an approach similar to Intel's, enabling only certain "relevant" instructions on the client parts. The slide also mentions core counts being "greater than 64," corresponding with our story from earlier today.
Sources: ChipHell Forums, via VideoCardz
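
Because AVX-512 is a family of feature subsets and not a single switch, the question of "which specific instructions" can be checked per machine once hardware exists. A minimal sketch, assuming a Linux system whose /proc/cpuinfo lists feature flags such as avx512f or avx512_bf16; the helper names avx512_subsets and read_cpu_flags are hypothetical and only illustrative:

```python
def avx512_subsets(flags_line: str) -> list[str]:
    """Return the sorted AVX-512 feature flags found in a cpuinfo-style flags line."""
    # Drop an optional "flags :" prefix, then split the rest on whitespace.
    _, _, tail = flags_line.rpartition(":")
    return sorted({flag for flag in tail.split() if flag.startswith("avx512")})


def read_cpu_flags(path: str = "/proc/cpuinfo") -> str:
    """Return the first 'flags' line of a Linux cpuinfo file, or '' if unavailable."""
    try:
        with open(path) as handle:
            for line in handle:
                if line.startswith("flags"):
                    return line
    except OSError:
        pass  # Not Linux, or the file is not readable.
    return ""


if __name__ == "__main__":
    found = avx512_subsets(read_cpu_flags())
    print(found if found else "no AVX-512 flags reported")
```

On a CPU without AVX-512, or on a non-Linux system, the script simply reports that no flags were found.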

43 Comments on AMD "Zen 4" Microarchitecture to Support AVX-512

#1
Vya Domus
Like I said many times, wide SIMD is kind of stupid in CPUs, I just hope AVX512 doesn't ruin the power consumption of these chips.
#2
P4-630
Would be cool if it had all core overclocking possibilities :D :D

64 cores over 4 GHz...:pimp:
#3
watzupken
It seems that AMD may be slowly encroaching into areas where Intel likes to toot its horn. Currently, Intel has a significant advantage with AVX-512 workloads, and also in the area of AI. So I will not be surprised if AMD starts attacking these "strongholds". But I do feel AVX-512 is quite unlikely to be used by most folks.
#4
tabascosauz
Part of me hopes that AVX-512 will come to Zen 4 to prove that Zen isn't "behind", but stay on the EPYC platform. Clocks remain low on EPYC, which is perfect for avoiding the increasing clockspeed penalties associated with every successive generation of AVX.

It's a good feature family - it's just not conceptually aligned with Intel and AMD's perpetual pursuit of high clock speeds on consumer mobile/desktop platforms. Hell, look at AVX2 downclocking on Ryzen 3000/5000 stock boost algorithms. You can have your AVX - but you'd better be prepared to drop those clocks lest you want to double your power draw or burn your chip, and that applies to both Intel and AMD.
#5
voltage
lmao. catching up to Intel. it sure took amd long enough. DECADES in fact, with of course the Iranian bail out (massive stock buy in when it was 3 dollars per share) to save them.
#6
DeathtoGnomes
Vya Domus: Like I said many times, wide SIMD is kind of stupid in CPUs, I just hope AVX512 doesn't ruin the power consumption of these chips.
It will, but it won't be a drastic amount.
#7
stimpy88
Ahhh, poor Intel, the nasty AMD took away your one and only benchmark winner...
voltage: lmao. catching up to Intel. it sure took amd long enough. DECADES in fact, with of course the Iranian bail out (massive stock buy in when it was 3 dollars per share) to save them.
Salty? They have already overtaken Intel in many ways, and given their ongoing momentum, Intel is the underdog now.
#8
londiste
I have to wonder why it is AVX3-512. Is AMD also implementing a subset of the operations, or perhaps their own different set?
Pretty likely that they implement it the same way they did AVX2 at first - using two narrower units for the actual execution part.
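
The "two narrower units" idea can be modeled in a few lines. This is only a toy sketch of the data flow, assuming 32-bit integer lanes with wraparound; it says nothing about how AMD would actually build the hardware, and the function names add_256 and add_512_split are hypothetical:

```python
LANE_BITS = 32
LANES_512 = 512 // LANE_BITS  # 16 lanes of 32 bits
LANES_256 = 256 // LANE_BITS  # 8 lanes of 32 bits
MASK = (1 << LANE_BITS) - 1


def add_256(a, b):
    """Model one 256-bit unit: lane-wise 32-bit add with wraparound."""
    assert len(a) == len(b) == LANES_256
    return [(x + y) & MASK for x, y in zip(a, b)]


def add_512_split(a, b):
    """Model a 512-bit add executed as two passes through a 256-bit unit."""
    assert len(a) == len(b) == LANES_512
    low = add_256(a[:LANES_256], b[:LANES_256])    # lower 8 lanes
    high = add_256(a[LANES_256:], b[LANES_256:])   # upper 8 lanes
    return low + high


if __name__ == "__main__":
    print(add_512_split(list(range(16)), [1] * 16))
```

The caller sees one 512-bit result; the split into halves is invisible to software, which is the point of such a design.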
#9
the54thvoid
voltage: lmao. catching up to Intel. it sure took amd long enough. DECADES in fact, with of course the Iranian bail out (massive stock buy in when it was 3 dollars per share) to save them.
Was that not Abu Dhabi in 2007? Irrelevant anyway.
#10
Aquinus
Resident Wat-man
voltage: lmao. catching up to Intel. it sure took amd long enough. DECADES in fact, with of course the Iranian bail out (massive stock buy in when it was 3 dollars per share) to save them.
When buying a CPU, AVX-512 isn't very high on my list of things that I want. Also, it wasn't the Iranians, it was the UAE.
#11
qcmadness
Without knowing the throughput, I don't know whether this is competitive at all.
#12
Max(IT)
What is AVX512 good for? I mean, which real-world applications?
#13
billEST
you NEED this !!!! price + 25 %

Before long, every AMD Ryzen 5 will cost as much as a Xeon processor.
#14
TheinsanegamerN
Cool, but what consumer applications actually USE AVX 512? Actually, what consumer applications use AVX to begin with? Games certainly don't, they use SSE4 at most. I guess there are some production applications? But it doesn't seem to help at all given AMD crushes Intel in every production market out there.

It'd be nice to see wider adoption. These instruction sets are great but seemingly nobody uses them....
#15
Selaya
londiste: I have to wonder why it is AVX3-512. Is AMD also implementing a subset of the operations, or perhaps their own different set?
[ ... ]
Possibly alluding to the fact that there is AVX(1), then AVX2 (256); AVX-512 would be AVX3 in that case.
#16
kapone32
Does AMD move too fast? I know that B550 was a turtle but why don't they let their market enjoy the current crop and save the new for next year (2022). Work on supplying chips to more vendors of your current crop. I like the Lenovo initiative and the return of older "series" cards, but we need more. I know you have to keep the competition on the back foot but AVX512 is nice for Streamers that use certain programs but not a reason (for me) to get excited.
#17
BorisDG
TheinsanegamerN: Cool, but what consumer applications actually USE AVX 512? Actually, what consumer applications use AVX to begin with? Games certainly don't, they use SSE4 at most. I guess there are some production applications? But it doesn't seem to help at all given AMD crushes Intel in every production market out there.

It'd be nice to see wider adoption. These instruction sets are great but seemingly nobody uses them....
Don't a few games use AVX? I remember some of the Codemasters games having different executables. Also, Cyberpunk's AVX was disabled, I think (at the moment), because of some crashing issues on the last-gen consoles.
#18
TheLostSwede
BorisDG: Don't a few games use AVX? I remember some of the Codemasters games having different executables. Also, Cyberpunk's AVX was disabled, I think (at the moment), because of some crashing issues on the last-gen consoles.
AVX ≠ AVX512.
AMD supports AVX and AVX2, in fact, even VIA does...
en.wikipedia.org/wiki/Advanced_Vector_Extensions
#19
Makaveli
BorisDG: Don't a few games use AVX? I remember some of the Codemasters games having different executables. Also, Cyberpunk's AVX was disabled, I think (at the moment), because of some crashing issues on the last-gen consoles.
Yes, there are games now that use AVX.

Some examples:
Death Stranding, Horizon, The Crew 2, GRID 2, Path of Exile, and Project CARS

And here is a list of software that does, which includes Microsoft Teams (AVX2), which I use every day for work.

en.wikipedia.org/wiki/Advanced_Vector_Extensions#Software
#20
windwhirl
TheinsanegamerN: Cool, but what consumer applications actually USE AVX 512? Actually, what consumer applications use AVX to begin with? Games certainly don't, they use SSE4 at most. I guess there are some production applications? But it doesn't seem to help at all given AMD crushes Intel in every production market out there.

It'd be nice to see wider adoption. These instruction sets are great but seemingly nobody uses them....
The only mention I've seen of AVX-512 outside of super-computing/scientific applications/enterprise has been in RPCS3 (the PS3 emulator). They support AVX-512 and have implemented a couple of things with it, but not much more, and for now it seems to be limited to the Ice Lake feature set of AVX-512.
github.com/RPCS3/rpcs3/pull/8700
github.com/RPCS3/rpcs3/pull/8712
kapone32: Does AMD move too fast? I know that B550 was a turtle but why don't they let their market enjoy the current crop and save the new for next year (2022). Work on supplying chips to more vendors of your current crop. I like the Lenovo initiative and the return of older "series" cards, but we need more. I know you have to keep the competition on the back foot but AVX512 is nice for Streamers that use certain programs but not a reason (for me) to get excited.
It's the same CPU arch that goes for both enterprise and consumer products. I think this announcement is more interesting for enterprise users in general.
#21
AlB80
I hope they didn't put in a 512-bit vector ALU.
#23
Vya Domus
BorisDG: He talked also just for AVX. You can see it. I know AVX is not AVX512. No need to inform me.
Technically everything implemented in previous versions of AVX can easily be rewritten to use AVX512. The main problem is that you're getting diminishing returns.
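
One possible reading of "diminishing returns" is Amdahl's law: only the vectorized share of a program speeds up when the vectors get wider. That reading is an interpretation, not something spelled out above, and the sketch assumes perfect scaling of the vector part and no clock-speed penalty. The function name is hypothetical:

```python
def speedup_from_wider_vectors(vector_fraction: float, width_ratio: float) -> float:
    """Amdahl-style estimate of overall speedup when only the vectorized
    fraction of the runtime gets faster by width_ratio (e.g. 2.0 for 256 -> 512 bits)."""
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be between 0 and 1")
    if width_ratio <= 0.0:
        raise ValueError("width_ratio must be positive")
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / width_ratio)


if __name__ == "__main__":
    # Hypothetical workloads: share of runtime spent in vector code.
    for fraction in (0.25, 0.5, 0.9):
        print(fraction, round(speedup_from_wider_vectors(fraction, 2.0), 3))
```

Under these assumptions, a program that spends half its time in vector code gains only about a third from doubling the vector width, well short of 2x.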
#24
Kohl Baas
TheinsanegamerN: Cool, but what consumer applications actually USE AVX 512? Actually, what consumer applications use AVX to begin with? Games certainly don't, they use SSE4 at most. I guess there are some production applications? But it doesn't seem to help at all given AMD crushes Intel in every production market out there.

It'd be nice to see wider adoption. These instruction sets are great but seemingly nobody uses them....
My friend and I were unable to try Star Citizen a week ago because his CPU lacks AVX, which the game needs... Not a big deal, every CPU since 2012 has it, but he uses a first-gen i7, which unfortunately does not...
#25
dragontamer5788
Vya Domus: Like I said many times, wide SIMD is kind of stupid in CPUs, I just hope AVX512 doesn't ruin the power consumption of these chips.
It no longer has a major power-effect on Intel Rocket Lake chips, and AVX512 doesn't have any power-consumption increases on the Centaur CNS (an x86 with AVX512).

At a minimum, AVX512 allows 256-bit cores to get issued 2-uops per instruction (doubling your throughput of the decoder, which is beginning to look like a problem!! Remember: Apple M1 is 8-instructions / clock tick, and AMD Zen is only 4-instructions/clock when decoding, 6-when in the uop cache). More "work" per instruction, so to speak, which was the design of the original Crays from the 1970s.

Intel is going with a native 512-bit implementation, but Centaur CNS (and probably AMD) will probably stick with 256-bit native execution behind 512-bit instructions. This grossly reduces power in the decoder and allows more instructions to fit in L1 cache (because it would normally take two 256-bit AVX instructions to do one 512-bit operation, whereas one 512-bit instruction does 2x 256-bit native work). Honestly, there are a ton of advantages to supporting 512-bit, especially when you consider all the possible designs AMD can do here. There's really no reason NOT to support 512-bit.
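
The decoder argument comes down to simple counting. A rough sketch, assuming a hypothetical 4096-bit block of data and counting only the vector instructions themselves (loads, stores, and loop overhead are ignored); the 4-instructions-per-clock figure is the Zen decode width quoted above, and the function names are illustrative:

```python
def vector_instructions(total_bits: int, instruction_width_bits: int) -> int:
    """Number of vector instructions needed to cover total_bits of data."""
    return -(-total_bits // instruction_width_bits)  # ceiling division


def decode_cycles(instruction_count: int, decode_width: int) -> int:
    """Clock cycles needed to decode the instructions at decode_width per clock."""
    return -(-instruction_count // decode_width)  # ceiling division


if __name__ == "__main__":
    block_bits = 4096  # hypothetical amount of data to process
    for width in (256, 512):
        count = vector_instructions(block_bits, width)
        print(width, count, decode_cycles(count, 4))
```

With these numbers, the 256-bit encoding needs 16 instructions and 4 decode cycles, while the 512-bit encoding needs 8 instructions and 2 decode cycles, even if the execution units behind the decoder are still 256 bits wide.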
Copyright © 2004-2021 www.techpowerup.com. All rights reserved.
All trademarks used are properties of their respective owners.