Thursday, May 7th 2009

AMD SSE5 Gets an Instruction-Set Expansion, Coins XOP (eXtended Operations)

AMD kept up with the SIMD processing standards Intel set by licensing its popular CPU instruction sets such as MMX, SSE, SSE2, and SSE3. AMD used these as is, but chose not to conform completely with Supplemental SSE3, SSE4 and its revisions (SSE4.1, SSE4.2). Instead, the company devised the SSE4A instruction set for its K10 micro-architecture. SSE4A is a lighter set that features LZCNT (leading zero count), POPCNT (bit population count), EXTRQ/INSERTQ, and MOVNTSD/MOVNTSS (scalar streaming store instructions). What's more, the company decided back in 2007 that it would come up with SSE5, whose development Intel then chose to leave with AMD.
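
For readers unfamiliar with the two counting instructions named above, here is a minimal C sketch of what LZCNT and POPCNT compute for a 32-bit operand. These are plain software loops, not the hardware instructions, and the function names `popcount32` and `lzcount32` are made up for this illustration.

```c
#include <stdint.h>

/* Number of set bits in x: the value POPCNT produces for a 32-bit operand. */
unsigned popcount32(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Number of zero bits above the highest set bit.
   Like LZCNT, this returns the operand width (32) when x is zero. */
unsigned lzcount32(uint32_t x)
{
    unsigned count = 0;
    if (x == 0) {
        return 32;
    }
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        count++;
    }
    return count;
}
```

The hardware versions do the same work in a single instruction, which is the whole appeal.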

In due course, Intel started development of AVX (Advanced Vector eXtensions), which enhances processing of FPU-intensive workloads. AMD took an interest in this technology, and is looking to make it compatible with the originally conceived SSE5. The instructions that remain in the superset once AVX is excluded are now referred to by AMD as XOP (eXtended OPerations). In addition, AMD will include FMA4 (floating-point vector multiply-accumulate). The new instruction sets will debut with AMD's next-generation Bulldozer micro-architecture, slated for 2011. Meanwhile, Intel's AVX will debut with the Sandy Bridge micro-architecture, slated for 2010 to 2011. AMD published the Programmer’s Manual on 128-Bit and 256-Bit XOP, FMA4 and CVT16 Instructions, which can be read here (PDF).
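
As a rough illustration of the multiply-accumulate operation behind FMA4, the C sketch below computes d = a * b + c element by element, with a destination separate from the three sources. It is ordinary scalar code, not the FMA4 instructions themselves, and the function name `multiply_accumulate` is an assumption made for this example.

```c
#include <stddef.h>

/* Element-wise multiply-accumulate: d[i] = a[i] * b[i] + c[i].
   The separate destination array mirrors a four-operand form
   (one destination, three sources). A fused hardware instruction
   rounds once; this plain expression may round twice. */
void multiply_accumulate(float *d, const float *a, const float *b,
                         const float *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        d[i] = a[i] * b[i] + c[i];
    }
}
```

A vector unit would process several of these elements per instruction instead of looping over them one at a time.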

12 Comments on AMD SSE5 Gets an Instruction-Set Expansion, Coins XOP (eXtended Operations)

#1
Imsochobo
Sooo, there is a lot happening.

Jeez, many "new" terms here, haven't seen those companies use those in years (actually mentioning them).

Like, now SSE4 is supported, and that was it. Wonder what all this gives us, probably just have to wait and see :)
#2
btarunr
Editor & Senior Moderator
Wikipedia just got updated. Here are some quick references for XOP, AVX (article has been there for a while), and AMD SSE5.
#3
KieranD
This is heavy reading, confusing if you don't read over it properly.

It's the instruction sets being updated. AMD and Intel used to share them but split, that's what I read anyway.
#4
btarunr
Editor & Senior Moderator
by: KieranD
AMD and Intel used to share them but split, that's what I read anyway.
The other way.

- SSE5 was conceptualized as a standard for both Intel and AMD (circa 2007).
- Intel came up with AVX in 2008, and broke away from the SIMD design plan. AVX and the original SSE5 are mutually incompatible.
- AMD included AVX in its set and made it compatible with SSE5 (May 2009).
- The AMD-exclusive instructions are referred to as XOP.
#5
a111087
by: btarunr
and AMD SSE5.
Good read, they did say that Bulldozer will be good even with single-threaded apps.
#6
KieranD
Okay, so Intel broke off? I know that it was supposed to be a standard of sorts.
#7
Valdez
....such as MMX, SSE, SSE2, and SSE3.

and


Supplemental SSE3, not Supplimentary SSE3


:)
#8
OnBoard
by: KieranD
This is heavy reading, confusing if you don't read over it properly.
Got my head spinning :)

They should just make SSEx that includes all previous SSE instructions. The list is getting silly long with new processors on what they support.
#9
LittleLizard
Let's hope that XOP makes things run really fast.
#11
mamisano
by: OnBoard
Got my head spinning :)

They should just make SSEx that includes all previous SSE instructions. The list is getting silly long with new processors on what they support.
AMD has done that, in essence, by creating something called SSEPlus. The open-source project allows developers to code once using SSEPlus. Basically, SSEPlus determines whether a CPU supports a given SSE instruction. If it does, the instruction is called normally; if it doesn't, the program emulates the instruction.
  • Developers no longer have to redevelop their algorithms to write for multiple SSE revisions
  • Simplified CPUID checking
  • Simplified maintenance of code that targets different SSE instruction mixes
  • SSEPlus provides containers to hold instructions that are desirable in hardware (e.g., 32-bit integer divide)
  • Helps developers use and implement instructions that match their own algorithms
  • Optimize code once for target hardware while at the same time ensuring that generated code conforms to the target hardware
http://developer.amd.com/cpu/Libraries/sseplus/Pages/default.aspx
http://sseplus.sourceforge.net/
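
The "use the instruction if the CPU has it, otherwise emulate" idea described above can be sketched in C roughly as follows. This is not the SSEPlus API: every name here (`popcount_fn`, `cpu_has_popcnt`, `select_popcount` and the rest) is hypothetical, and the feature probe assumes GCC or Clang on x86, where `__builtin_cpu_supports` is available. On other compilers or architectures the sketch simply falls back to the emulated path.

```c
#include <stdint.h>

typedef unsigned (*popcount_fn)(uint32_t);

/* Software fallback: counts set bits without any special instruction. */
static unsigned popcount_emulated(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* "Native" path. With GCC/Clang this builtin becomes a POPCNT instruction
   only when the compiler is allowed to emit it (e.g. -mpopcnt);
   otherwise the compiler substitutes its own software routine. */
static unsigned popcount_native(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_popcount(x);
#else
    return popcount_emulated(x);
#endif
}

/* Hypothetical feature probe: nonzero if the running CPU reports POPCNT. */
static int cpu_has_popcnt(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("popcnt");
#else
    return 0;   /* unknown platform: assume the instruction is absent */
#endif
}

/* Pick the implementation once; callers use the returned pointer. */
popcount_fn select_popcount(void)
{
    return cpu_has_popcnt() ? popcount_native : popcount_emulated;
}
```

A caller would do something like `popcount_fn count_bits = select_popcount();` at startup and then call `count_bits(value)` everywhere, so the CPUID check happens in one place rather than being scattered through the code.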
#12
WarEagleAU
Bird of Prey
Very nice, go AMD. Was wondering why they didn't really go with SSE4.