Sunday, October 14th 2018

AMD "Zen" Does Support FMA4, Just Not Exposed

With its "Zen" CPU microarchitecture, AMD removed support for the FMA4 instruction-set, on paper. This, while retaining FMA3. Level1Techs discovered that "Zen" CPUs do support FMA4 instructions, even through the instruction-set is not exposed to the operating system. FMA, or fused multiply add, is an efficient way to compute linear algebra. FMA3 and FMA4 are not generations of the instruction-set (unlike SSE3 and SSE4), but rather the digit denotes the number of operands per instruction. Support for both were introduced by AMD in 2012 with its FX-series processors, while Intel added FMA3 support in 2013 with "Haswell."

The exact reasons why AMD deprecated FMA4 with "Zen" are unknown, but some developers speculate it's because AMD's implementation of FMA4 is buggy, even though it's more efficient (33% more throughput). Intel's adoption of FMA3 made it more popular, and hence more stable over the years. Level1Techs used an OpenBLAS FMA4 test-program to confirm that feeding "Zen" processors with FMA4 instructions won't just return a "illegal instruction" error, but also the processor will go ahead and complete the operation. This is interesting because FMA4 isn't exposed as a CPUID bit, and the operating system has no idea the processor even supports the instruction. For linear algebra, FMA4 has proven more efficient than AVX in both single- and double-precision.
Show 10 Comments