Thursday, October 19th 2017

Intel "Cannon Lake" Could Bring AVX-512 Instruction-set to the Mainstream

Intel's next-generation "Cannon Lake" CPU micro-architecture could see the introduction of the AVX-512 instruction-set to the mainstream segments (MSDT, or mainstream desktop, and mobile). AVX-512 is currently available on the company's Core X "Skylake-X" HEDT processors, the Xeon W "Skylake-W" and Xeon Scalable "Skylake-SP" processors, and in a limited form on the Xeon Phi "Knights Landing" and "Knights Mill" many-core compute chips.

The upcoming "Cannon Lake" mainstream silicon is expected to feature the AVX512F, AVX512CD, AVX512DQ, AVX512BW, and AVX512VL instruction subsets, and to add support for AVX512_IFMA and AVX512_VBMI instructions, making it a slightly broader implementation of AVX-512 than the "Skylake-SP" silicon. AVX-512 can substantially improve the performance of compute-intensive applications that take advantage of it. It will also be a key component of future security standards.
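For readers who want to check which of these subsets a given machine advertises, here is a minimal Python sketch. It assumes a Linux-style `/proc/cpuinfo` "flags" line and the lowercase flag names Linux uses there; the helper names (`parse_flags`, `avx512_report`) and the sample text are hypothetical and only for illustration.

```python
# Illustrative sketch: report which AVX-512 subsets a CPU advertises,
# based on a Linux /proc/cpuinfo-style "flags" line.

# Subsets named in the article, mapped to the flag names assumed to
# appear in /proc/cpuinfo on Linux.
CANNON_LAKE_SUBSETS = {
    "AVX512F": "avx512f",
    "AVX512CD": "avx512cd",
    "AVX512DQ": "avx512dq",
    "AVX512BW": "avx512bw",
    "AVX512VL": "avx512vl",
    "AVX512_IFMA": "avx512ifma",
    "AVX512_VBMI": "avx512vbmi",
}


def parse_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line found."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def avx512_report(cpuinfo_text):
    """Map each subset name to True/False depending on whether its flag is present."""
    flags = parse_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in CANNON_LAKE_SUBSETS.items()}


# Hypothetical sample text (not taken from a real machine): a CPU that
# advertises F, CD, DQ, BW and VL, but not IFMA or VBMI.
SAMPLE = (
    "processor\t: 0\n"
    "flags\t\t: fpu sse2 avx avx2 avx512f avx512cd avx512dq avx512bw avx512vl\n"
)

if __name__ == "__main__":
    for name, present in avx512_report(SAMPLE).items():
        print(f"{name:12} {'yes' if present else 'no'}")
```

On a real Linux system the same function could be fed the contents of `/proc/cpuinfo` instead of `SAMPLE`.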
Source: AnandTech

52 Comments on Intel "Cannon Lake" Could Bring AVX-512 Instruction-set to the Mainstream

#26
TheGuruStud
EarthDog
There's always something better around the corner. Take this advice and nobody will ever buy.
You mean Intel sidegrades are always around the corner :D
#27
bug
TheGuruStud
You mean Intel sidegrades are always around the corner :D
Not taking the bait, but a refined Zen that supposedly closes the clock speed gap is also expected in Q1 2018, followed by Zen 2, probably in Q4.
#28
efikkan
There is no point in postponing a purchase for a Ryzen with 200 MHz extra.
#29
OSdevr
efikkan
What I find most interesting is the "Fast Short REP MOV". Those of you with assembly experience know that a CPU spends a lot of cycles not only moving data from memory to CPU registers, but also shuffling data between registers to be able to execute the next ALU or FPU instruction. A single ALU/FPU operation may require up to 3-4 MOV operations. It may seem very wasteful to spend clock cycles just moving a few bits instead of spending them calculating stuff, so anything which helps reduce these "wasteful" operations will help throughput without increasing computational resources.
REP MOV is for string and memory copying/zeroing; it's not useful for moving registers around. MOV reg, reg is also perhaps the cheapest operation an x86 CPU can perform. I think it's done entirely through register renaming nowadays (though it still takes some fraction of a cycle on average). I will agree, though, that all those MOVs take up extra space and can be annoying when programming in assembly :)

I wonder if Intel has done this by adding a basic DMA engine inside the core. That would be very significant.
#30
Vya Domus
Camm
AVX512 - where either the vector unit runs, or your CPU runs (as thermally, the vector unit throttles the shit out of the CPU).

Intel needs to solve that before I'll get excited about AVX512 (as, let's be honest, it's only generally useful for the 1% of stuff I can't send to the GPU in the first place).
There is nothing to be fixed; it's a disadvantage that cannot be avoided when you want to use very wide SIMD instructions on a high-clocked x86 CPU.
#31
Prima.Vera
EarthDog
There's always something better around the corner. Take this advice and nobody will ever buy.
Sorry, but for CPUs especially, this is not valid. From the 3K series up to the 7K series (so 4 generations), I see absolutely no valid or rational reason to upgrade the CPU, and the mobo/RAM along with it. The performance increase just isn't there. Even those new 6-core CPUs are not worth it at all, considering their underwhelming gaming performance. Next year, the new 8-core parts might bring something new to the table (one can only hope), but even so, I am not very optimistic about it.
#32
bug
Prima.Vera
Sorry, but for CPUs especially, this is not valid. From the 3K series up to the 7K series (so 4 generations), I see absolutely no valid or rational reason to upgrade the CPU, and the mobo/RAM along with it. The performance increase just isn't there. Even those new 6-core CPUs are not worth it at all, considering their underwhelming gaming performance. Next year, the new 8-core parts might bring something new to the table (one can only hope), but even so, I am not very optimistic about it.
USB 3.1, Thunderbolt, M.2, NVMe are all reasons to upgrade. Yes, it's a little messed up when the reason for upgrade is not the CPU, but it is what it is.
Also, do I need to remind you that back in the Sandy Bridge or Ivy Bridge days we were lucky to get 3 hours' worth of battery life out of our laptops?
I'm not saying desktop CPUs haven't stood still (because clearly they have), but that doesn't mean progress didn't happen in other areas.
#33
EarthDog
Prima.Vera
Sorry, but for CPUs especially, this is not valid. From the 3K series up to the 7K series (so 4 generations), I see absolutely no valid or rational reason to upgrade the CPU, and the mobo/RAM along with it. The performance increase just isn't there. Even those new 6-core CPUs are not worth it at all, considering their underwhelming gaming performance. Next year, the new 8-core parts might bring something new to the table (one can only hope), but even so, I am not very optimistic about it.
It's plenty valid. If the user doesn't need AVX-512 at this level, there is no point in waiting...especially considering when CL is supposed to drop... ;)

Also, so few people NEED 8 cores.. 4/6 is plenty and will be for years to come. Waiting for more cores is less of a valid and rational reason to me.
#34
Vya Domus
EarthDog
Also, so few people NEED 8 cores.. 4/6 is plenty and will be for years to come.
I couldn't disagree more. Don't fool yourself into thinking it's "enough", because it's not. The need is there without a doubt; it's just that quad cores are all we get for the mainstream market, and unfortunately the software is tailored for that. It's a case of "you don't have a choice" rather than "it's enough".
#35
EarthDog
It's enough until there is a choice or the user already has a need.

We've been waiting a decade for multi-core to take over. It hasn't. I'm sure you remember that ever since the Q6600 hit the scene, that has been the talk. IMO, it's really only this year that, budget permitting, the suggestion has been to go 4c/8t.

Both AMD and Intel have had CPUs with more than 4c/8t for years...for professionals who use them and drive a significant part of the software market. Software still needs to catch up. I'm just not holding my breath that it happens anytime soon. ;)
#36
Vya Domus
EarthDog
It's enough until there is a choice or the user already has a need.

We've been waiting a decade for multi-core to take over. It hasn't. I'm sure you remember that ever since the Q6600 hit the scene, that has been the talk. IMO, it's really only this year that, budget permitting, the suggestion has been to go 4c/8t.

Both AMD and Intel have had CPUs with more than 4c/8t for years...for professionals who use them and drive a significant part of the software market. Software still needs to catch up. I'm just not holding my breath that it happens anytime soon. ;)
They didn't take over because both Intel and AMD would like to sell you the same product for as long as possible. And people going around saying over and over that it is enough helped them do that. There is no incentive for software developers to put in the effort to make use of hardware that is only available to a single-digit percentage of the user base. There are pretty old games such as Crysis 3 which scale even beyond 8 threads, proving that software already caught up long ago.

So no, I recommend getting the highest core/thread count CPU you can afford. Do yourself a favor and help the industry move out of this vicious circle, for Christ's sake.
#37
EarthDog
Yeah, there are plenty of reasons for software stagnation. Lack of need, AMD and Intel, etc...I do agree we will start seeing an increase, but it won't happen overnight, or quickly in general.

But 4c/8t and 6c/12t CPUs will be just fine for the vast majority of the market for the next few years. Buy now.. no need to wait for CL unless AVX-512 is a need and you are buying a mainstream CPU to get it.

Edit to your edit: always get the best you can afford. But if it comes down to a gaming machine and a 1070 or a 1080 (or a GPU jump) over a quad with HT or hex with HT, I'd still go quad with HT and the better GPU. It just depends on the specific use case. I'm not pushing people in any direction but toward their needs. :)
#38
Aquinus
Resident Wat-man
Still trying to catch up to GPU floating point performance, huh? :rolleyes:
#39
2901BitSlice
EarthDog
Both AMD and Intel have had CPUs with more than 4c/8t for years...for professionals who use them and drive a significant part of the software market. Software still needs to catch up. I'm just not holding my breath that it happens anytime soon. ;)
Yes but Intel in the past has always been able to up clock speed to gain extra performance. With that dynamic in play Software developers had little incentive to write code for multi cores. But now Intel, as well as the entire Microprocessor Industry is coming to the end of the line. 2018 will bring 12nm for AMD and 10nm for Intel. AMD's 7nm is still a few years away. After that it will get very tough and advances in Silicon will come very slowly. When advances do come they will be very expensive. This will put pressure on the Software Developers to change the way they write code.

Intel killed the Quad Cores when they Paper Released the 8700k. They knew doing so would torpedo the 7700k but they were willing to accept that loss to slow down Ryzen 1600/1600X sales for a quarter or two. Hex Core and above allow the user to multitask with gaming being one component. This was the race that AMD had hoped to provoke when they released Bulldozer many moons ago.

For right now Quad Core CPUs have been reduced to an entry level device and as software development utilizes more and more cores I would expect Quads to fade away. Eight Cores could easily become a standard Consumer Power User configuration in a very short time. AMD Users possess the compatibility factor that Intel doesn't have. So a 1200 User can upgrade to a 1700 or 1800x for the price of the chip and perform that upgrade in a matter of minutes. Drop in the chip, power up your machine, one reboot and you're done.
#40
EarthDog
@2901BitSlice - Again, plenty of reasons for software stagnation.

Not holding my breath... nobody is going to complain about a 4c/8t part for a couple (2-3) of years. Nobody is going to complain about a 6c/12t part for longer (4-5). As for the statement I originally responded to much earlier in the thread: there is no reason to wait for 8c/16t Cannon Lake unless you need AVX-512 on mainstream. Ryzen and Intel have plenty of powerful CPUs to avoid waiting several months for whatever CL offers.

8c/16t cpus will become 'common' in a couple generations. It's not soon, nor will it be until we see a lot more momentum on the software front. It's going to take time for the current gen cpus to make a significant dent in market share. Software, for the most part, just isn't there (yet), and IMO will take more time than people feel...those two together tell me we aren't moving as fast as many people want and feel.
#41
OSdevr
EarthDog
As for the statement I originally responded to much earlier in the thread: there is no reason to wait for 8c/16t Cannon Lake unless you need AVX-512 on mainstream.
I think the number of people needing AVX512 is quite a bit smaller than the number needing extra cores. The only people I think would benefit from it are those who compile their own software (e.g. Gentoo users), because it will be years before most software has it compiled in for you.
#42
EarthDog
I agree.

On a side note, the more cores consumers have available, the more likely it is for software to start taking advantage of them. We are on the cusp of software taking advantage of them, but not there yet, and it will take time. It's not like anything close to done in the pipeline can add it without pushing back its release. Software that can easily add it isn't close to being readily available either. Considering how slow the adoption is going to be... I mean, look at Steam, where 2 cores and 4 cores rule.... by far, and quads have been out for 10 years.... it's going to take a few years yet.
#43
Aquinus
Resident Wat-man
OSdevr
I think the number of people needing AVX512 is quite a bit smaller than the number needing extra cores. The only people I think would benefit from it are those who compile their own software (e.g. Gentoo users), because it will be years before most software has it compiled in for you.
...and the benefit is very application-specific. It would really only apply to software doing a lot of floating-point math and, even more than that, math that doesn't depend on earlier calculations, so instruction-level parallelism and its constraints determine how much benefit AVX would bring. Basically, the only workload that would benefit from this is high-volume floating-point math designed to stream values through something that does several independent floating-point ops in a row. As I said earlier, the people who care about this are likely doing machine learning or statistical analysis, if they're not already using a GPU.
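To make the dependency point concrete, here is a rough Python sketch (not real SIMD code, just an illustration of the shape of the work). It assumes single-precision values, so a 512-bit register would hold 16 of them; the function names are made up for this example.

```python
# Conceptual illustration only: plain Python does not use AVX-512 here.
# It contrasts work that can be split into independent "lanes" with
# work where each step needs the previous result.

LANES = 16  # 512 bits / 32 bits per single-precision float


def scale_and_add(xs, ys, a):
    """Independent work: out[i] depends only on xs[i] and ys[i].

    Each chunk of LANES elements could, in principle, be handled by
    one wide vector operation, because no element waits on another.
    """
    out = []
    for start in range(0, len(xs), LANES):
        chunk_x = xs[start:start + LANES]
        chunk_y = ys[start:start + LANES]
        out.extend(a * x + y for x, y in zip(chunk_x, chunk_y))
    return out


def running_total(xs):
    """Dependent work: every result needs the one before it.

    As written, step i cannot start until step i-1 has finished,
    so wide vector units have much less to offer.
    """
    out = []
    total = 0.0
    for x in xs:
        total += x
        out.append(total)
    return out


if __name__ == "__main__":
    print(scale_and_add([1.0] * 20, [2.0] * 20, 3.0)[:4])
    print(running_total([1.0, 2.0, 3.0]))
```

The first function is the "stream values through independent ops" kind of workload; the second is the kind where earlier calculations gate later ones.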

AVX is cute because...
Aquinus
Still trying to catch up to GPU floating point performance, huh? :rolleyes:
#44
bug
EarthDog
I agree.

On a side note, the more cores consumers have available, the more likely it is for software to start taking advantage of them. We are on the cusp of software taking advantage of them, but not there yet, and it will take time. It's not like anything close to done in the pipeline can add it without pushing back its release. Software that can easily add it isn't close to being readily available either. Considering how slow the adoption is going to be... I mean, look at Steam, where 2 cores and 4 cores rule.... by far, and quads have been out for 10 years.... it's going to take a few years yet.
You people need to stop equating threads and core count. Just take a look at the task manager and see how many (hundreds of) threads run on your current CPU.
It's only when all those threads need to be active and crunching at once that more hardware cores are needed. That's easy to do on a server, where the workload is inherently parallel, but on the desktop the usage pattern is quite different. On the desktop, the slowest element is often between the screen and the chair, and all those threads have nothing to do but wait for input.
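A quick way to see the difference between thread count and core count is a small Python sketch like the one below. It is only an illustration: `spawn_idle_threads` is a made-up helper, and the thread count of 100 is arbitrary. The threads all block on an event, much like desktop threads waiting for input, so they exist without needing a core each.

```python
import os
import threading


def spawn_idle_threads(count):
    """Start `count` threads that just block on an event.

    Blocked threads use almost no CPU time, so there can be far
    more of them than there are hardware threads.
    """
    release = threading.Event()
    threads = [
        threading.Thread(target=release.wait, daemon=True)
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    return release, threads


if __name__ == "__main__":
    release, threads = spawn_idle_threads(100)
    print("hardware threads reported by the OS:", os.cpu_count())
    print("live threads in this process:", threading.active_count())
    release.set()  # let every waiting thread finish
    for t in threads:
        t.join()
```

More cores only start to matter once many of those threads have real work to do at the same time.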

That being said, I will keep buying the best CPU for the job today, "futureproofing" be damned. Imagine how I'd kick myself in the nuts now if I had bought the first 8-thread CPU that came out, hoping to put it to good use when software actually took advantage of 8 cores.

Also, it goes without saying that certain tasks (rendering, multimedia processing) can and do use more threads than we currently have on the desktop. If you happen to fall into that category, then yes, get as many cores as you can.
#45
EarthDog
bug
You people need to stop equating threads and core count. Just take a look at the task manager and see how many (hundreds of) threads run on your current CPU.
Not sure that was ever a question here... thanks though?
#46
Frick
Fishfaced Nincompoop
TheinsanegamerN
Also going to add- emulation. Emulating more complex game consoles depends on high FPU calculations. PCSX2 got a nice boost from AVX.

Things like AVX-512 will probably be a near requirement for x360/PS3 emulation, if we ever get there.
Transistor level emulation is the only true emulation. :D
#47
bug
EarthDog
Not sure that was ever a question here... thanks though?
I was just trying to add to your post, not answer a question. It's the forum's fault ;)
#48
cdawall
where the hell are my stars
EarthDog
It's plenty valid. If the user doesn't need AVX-512 at this level, there is no point in waiting...especially considering when CL is supposed to drop... ;)

Also, so few people NEED 8 cores.. 4/6 is plenty and will be for years to come. Waiting for more cores is less of a valid and rational reason to me.
I need my 16 core chip to be a 22 core one. Some of us exist.
#49
EarthDog
And here is a cookie... for you hundredth percent of 1%ers... :p
#50
cdawall
where the hell are my stars
EarthDog
And here is a cookie... for you hundredth percent of 1%ers... :p
I NEEED it lol. My Plex server is getting a bit popular with the family, and it's starting to tax that poor little 16-core a bit too much. Might need to bump the RAM over 32 as well.