
Intel "Cannon Lake" Could Bring AVX-512 Instruction-set to the Mainstream

Here is a screenshot of a leaked table of potential Intel CPUs. It came out of China and has spelling errors: in 'cores/treads' the H got lost.

View attachment 93292
This is not a leak, just someone creating a table of guesses. This is certainly not anything from Intel.

Core configurations are usually decided during tapeout, and clocks and model names closer to launch. Even Intel doesn't know yet what the models will look like.
And I like the socket names; old socket +10 :p

-----

The source from AnandTech is actually quite an interesting read. It also provides some early indications of what Ice Lake will bring, both in terms of new AVX features and other instructions.

What I find most interesting is the "Fast Short REP MOV". Those of you with assembly experience know that a CPU spends a lot of cycles not only moving data from memory to CPU registers, but also shuffling registers around to be able to execute the next ALU or FPU instruction. A single ALU/FPU operation may require up to 3-4 MOV operations. It may seem very wasteful to spend clock cycles just moving a few bits instead of spending them calculating stuff, so anything that helps reduce these "wasteful" operations will help throughput without increasing computational resources.

Additionally, Cannon Lake will add support for SHA-NI, which brings acceleration of SHA and MD5. Surely this will bring something like a 100× acceleration for such algorithms, but I'm a firm believer that algorithm-specific instructions don't belong in a general-purpose CPU. Whether it's algorithms for cryptography or compression, these algorithms keep evolving, which quickly makes the acceleration outdated. SHA and MD5 are already outdated in cryptography, so these are surely added just to show gains in some specific benchmarks for enterprise customers. For general-purpose use, this acceleration is mostly a waste of die space and energy. How much of your CPU time is really spent on AES, SHA, MD5, etc.? Probably less than 1%, unless you run some kind of web server, which is why I believe these features belong in specialized processors for such workloads. Back in the 80s, Intel made specialized co-processors for math (8087, etc.); I think they should have used this approach for special enterprise features.
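For anyone who wants a rough number rather than a guess, here is a minimal sketch that measures hashing throughput from Python. The function name `sha256_throughput` and its parameters are made up for illustration. It times whatever SHA-256 implementation the local Python build links against, and whether that implementation uses any SHA instructions depends on the CPU and the library, so treat the result as a ballpark only.

```python
import hashlib
import time


def sha256_throughput(total_mib: int = 16, chunk_kib: int = 64) -> float:
    """Hash `total_mib` MiB of zero bytes and return MiB per second.

    Hypothetical helper: the buffer contents and sizes are arbitrary.
    """
    chunk = bytes(chunk_kib * 1024)            # one reusable zero-filled chunk
    rounds = (total_mib * 1024) // chunk_kib   # number of chunks to feed in
    h = hashlib.sha256()
    start = time.perf_counter()
    for _ in range(rounds):
        h.update(chunk)
    elapsed = time.perf_counter() - start
    return total_mib / elapsed


if __name__ == "__main__":
    print(f"SHA-256: {sha256_throughput():.0f} MiB/s")
```

Comparing that figure against how many megabytes your machine actually hashes per second in normal use gives a feel for how much CPU time hashing takes on a desktop.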
 
There's always something better around the corner. Take this advice and nobody will ever buy.

You mean Intel sidegrades are always around the corner :D
 
You mean Intel sidegrades are always around the corner :D
Not taking the bait, but a refined Zen that supposedly closes the clock speed gap is also expected in Q1 '18, followed by Zen 2, probably in Q4.
 
There is no point in postponing a purchase for a Ryzen with 200 MHz extra.
 
What I find most interesting is the "Fast Short REP MOV". Those of you with assembly experience know that a CPU spends a lot of cycles not only moving data from memory to CPU registers, but also shuffling registers around to be able to execute the next ALU or FPU instruction. A single ALU/FPU operation may require up to 3-4 MOV operations. It may seem very wasteful to spend clock cycles just moving a few bits instead of spending them calculating stuff, so anything that helps reduce these "wasteful" operations will help throughput without increasing computational resources.

REP MOV is for string and memory copying/zeroing; it's not useful for moving registers around. MOV reg, reg is also perhaps the cheapest operation an x86 CPU can perform. I think it's done entirely through register renaming nowadays (though it still takes some fraction of a cycle on average). I will agree though that all those MOVs take up extra space and can be annoying when programming in assembly :)
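To make the distinction concrete, here is a small Python model of what the REP-prefixed string instructions do architecturally (direction flag clear). It is only a sketch of the semantics: `rep_movsb` and `rep_stosb` are illustrative function names, the `bytearray` stands in for memory, and real CPUs run these through microcode that may move much larger chunks per step. Note that plain zeroing/filling is normally REP STOS rather than REP MOVS.

```python
def rep_movsb(mem: bytearray, dst: int, src: int, count: int) -> None:
    """Model of REP MOVSB: memory-to-memory copy of `count` bytes.

    src/dst/count play the roles of RSI/RDI/RCX.
    """
    while count > 0:
        mem[dst] = mem[src]   # MOVSB: byte at [RSI] -> byte at [RDI]
        src += 1              # RSI += 1
        dst += 1              # RDI += 1
        count -= 1            # RCX -= 1, repeat until zero


def rep_stosb(mem: bytearray, dst: int, value: int, count: int) -> None:
    """Model of REP STOSB: store the byte `value` (AL) `count` times."""
    while count > 0:
        mem[dst] = value      # STOSB: AL -> byte at [RDI]
        dst += 1              # RDI += 1
        count -= 1            # RCX -= 1


if __name__ == "__main__":
    mem = bytearray(b"hello world.....")
    rep_movsb(mem, dst=11, src=0, count=5)   # copy "hello" over the dots
    print(mem)
    rep_stosb(mem, dst=0, value=0, count=5)  # zero the first five bytes
    print(mem)
```

Both operate on memory operands only, which is why a faster short REP MOV helps small copies (think short memcpy calls) and does nothing for register-to-register moves.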

I wonder if Intel has done this by adding a basic DMA engine inside the core. That would be very significant.
 
AVX512 - where either the vector unit runs, or your CPU runs (as thermally, the vector unit throttles the shit out of the CPU).

Intel needs to solve that before I'll get excited about AVX512 (as, let's be honest, it's only generally useful for the 1% of stuff I can't send to the GPU in the first place).

There is nothing to be fixed; it's a disadvantage that cannot be avoided when you want to use very wide SIMD instructions on a high-clocked x86 CPU.
 
There's always something better around the corner. Take this advice and nobody will ever buy.
Sorry, but for CPUs especially, this is not valid. From the 3K series up to the 7K series (so 4 generations), I see absolutely no valid or rational reason to upgrade the CPU, and the mobo/RAM along with it. The performance increase just isn't there. Even those new 6-core CPUs are not worth it at all, considering their underwhelming gaming performance. Next year, the new 8-core parts might bring something new to the table (one can only hope), but even so, I am not very optimistic about it.
 
Sorry, but for CPUs especially, this is not valid. From the 3K series up to the 7K series (so 4 generations), I see absolutely no valid or rational reason to upgrade the CPU, and the mobo/RAM along with it. The performance increase just isn't there. Even those new 6-core CPUs are not worth it at all, considering their underwhelming gaming performance. Next year, the new 8-core parts might bring something new to the table (one can only hope), but even so, I am not very optimistic about it.
USB 3.1, Thunderbolt, M.2, NVMe are all reasons to upgrade. Yes, it's a little messed up when the reason for upgrade is not the CPU, but it is what it is.
Also, do I need to remind you that back in the Sandy Bridge or Ivy Bridge days we were lucky to get 3 hours' worth of battery life out of our laptops?
I'm not saying desktop CPUs haven't stood still (because clearly they have), but that doesn't mean progress didn't happen in other areas.
 
Sorry, but for CPUs especially, this is not valid. From the 3K series up to the 7K series (so 4 generations), I see absolutely no valid or rational reason to upgrade the CPU, and the mobo/RAM along with it. The performance increase just isn't there. Even those new 6-core CPUs are not worth it at all, considering their underwhelming gaming performance. Next year, the new 8-core parts might bring something new to the table (one can only hope), but even so, I am not very optimistic about it.
It's plenty valid. If the user doesn't need AVX-512 at this level, there is no point in waiting... especially given when CL is supposed to drop... ;)

Also, so few people NEED 8 cores.. 4/6 is plenty and will be for years to come. Waiting for more cores is less of a valid and rational reason to me.
 
Also, so few people NEED 8 cores.. 4/6 is plenty and will be for years to come.

I couldn't disagree more. Don't fool yourself into thinking it's "enough", because it's not. The need is there without a doubt; it's just that quad cores are all we get for the mainstream market and unfortunately the software is tailored for that. It's a case of "you don't have a choice" rather than "it's enough".
 
It's enough until there is a choice or the user already has a need.

We've been waiting for a decade so far for multi-cores to take over. They haven't. I'm sure you remember that ever since the Q6600 hit the scene, that has always been the talk. IMO, it's really only been since this year that, when the budget allows, it's been suggested to go 4c/8t.

Both AMD and Intel have had CPUs with more than 4c/8t for years... for professionals who use them and drive a significant part of the software market. Software still needs to catch up. I'm just not holding my breath that it's anytime soon. ;)
 
It's enough until there is a choice or the user already has a need.

We've been waiting for a decade so far for multi-cores to take over. They haven't. I'm sure you remember that ever since the Q6600 hit the scene, that has always been the talk. IMO, it's really only been since this year that, when the budget allows, it's been suggested to go 4c/8t.

Both AMD and Intel have had CPUs with more than 4c/8t for years... for professionals who use them and drive a significant part of the software market. Software still needs to catch up. I'm just not holding my breath that it's anytime soon. ;)

They didn't take over because both Intel and AMD would like to sell you the same product for as long as possible. And people going around saying over and over that it is enough helped them do that. There is no incentive for software developers to put in the effort to make use of hardware that is only available to a single-digit percentage of the user base. There are pretty old games such as Crysis 3 which scale even beyond 8 threads, proving that software already caught up long ago.

So no, I recommend getting the highest core/thread count CPU you can afford. Do yourself a favor and help the industry move out of this vicious circle, for Christ's sake.
 
Yeah, there are plenty of reasons for software stagnation. Lack of a need, AMD and Intel, etc... I do agree we will start seeing an increase, but it won't happen overnight, or quickly in general.

But 4c/8t and 6c/12t CPUs will be just fine for the vast majority of the market for the next few years. Buy now... no need to wait for CL unless AVX512 is a need and you're buying a mainstream CPU to get it.

Edit to your edit: always get the best you can afford. But if it comes down to a gaming machine and a 1070 or a 1080 (or a GPU jump) over a quad with HT or a hex with HT, I'd still go quad with HT and the better GPU. It just depends on the specific use case. I'm not pushing people in any direction but toward their needs. :)
 
Still trying to catch up to GPU floating point performance, huh? :rolleyes:
 
Both AMD and Intel have had CPUs with more than 4c/8t for years... for professionals who use them and drive a significant part of the software market. Software still needs to catch up. I'm just not holding my breath that it's anytime soon. ;)

Yes, but in the past Intel has always been able to raise clock speeds to gain extra performance. With that dynamic in play, software developers had little incentive to write code for multiple cores. But now Intel, as well as the entire microprocessor industry, is coming to the end of the line. 2018 will bring 12nm for AMD and 10nm for Intel. AMD's 7nm is still a few years away. After that it will get very tough, and advances in silicon will come very slowly. When advances do come, they will be very expensive. This will put pressure on software developers to change the way they write code.

Intel killed the quad cores when they paper-released the 8700K. They knew doing so would torpedo the 7700K, but they were willing to accept that loss to slow down Ryzen 1600/1600X sales for a quarter or two. Hex cores and above allow the user to multitask, with gaming being one component. This was the race that AMD had hoped to provoke when they released Bulldozer many moons ago.

For right now, quad-core CPUs have been reduced to an entry-level device, and as software development utilizes more and more cores I would expect quads to fade away. Eight cores could easily become a standard consumer power-user configuration in a very short time. AMD users have the compatibility factor that Intel users don't. So a 1200 user can upgrade to a 1700 or 1800X for the price of the chip and perform that upgrade in a matter of minutes. Drop in the chip, power up your machine, one reboot and you're done.
 
@2901BitSlice - Again, plenty of reasons for software stagnation.

Not holding my breath... nobody is going to complain about a 4c/8t part for a couple (2-3) of years. Nobody is going to complain about a 6c/12t part for even longer (4-5). As for the statement I originally responded to much earlier in the thread: there is no reason to wait for 8c/16t Cannon Lake unless they need AVX512 on mainstream. AMD and Intel have plenty of powerful CPUs, so there's no need to wait several months for whatever CL offers.

8c/16t CPUs will become 'common' in a couple of generations. It's not soon, nor will it be until we see a lot more momentum on the software front. It's going to take time for the current-gen CPUs to make a significant dent in market share. Software, for the most part, just isn't there (yet), and IMO will take more time than people think... those two together tell me we aren't moving as fast as many people want and feel.
 
The statement I originally responded to much earlier in the thread: there is no reason to wait for 8c/16t Cannon Lake unless they need AVX512 on mainstream.

I think the number of people needing AVX512 is quite a bit smaller than the number needing extra cores. The only people I think would benefit from it would be those who compile their own software (e.g. Gentoo users), because it will be years before most software has it compiled in for you.
 
I agree.


On a side note, the more cores consumers have available, the more likely it is for software to start taking advantage of them. We are on the cusp of software taking advantage of them, but not there yet, and it will take time. It's not like anything close to done in the pipeline can add that without pushing back its release. Software that can easily add it isn't close to being readily available either. Considering how long adoption is going to take... I mean, look at Steam, where 2 cores and 4 cores rule... by far, and quads have been out for 10 years... it's going to take a few years yet.
 
I think the number of people needing AVX512 is quite a bit smaller than the number needing extra cores. The only people I think would benefit from it would be those who compile their own software (e.g. Gentoo users), because it will be years before most software has it compiled in for you.
...and the benefit is very application-specific. It would really only apply to software doing a lot of floating point math and, even more than that, math that doesn't have a dependency on earlier calculations, so instruction-level parallelism and its constraints would apply to the kind of benefit AVX would have. Basically, the only workload that would benefit from this is high-volume floating point math designed to stream values through something that does several independent floating point ops in a row. As I said earlier, the people who care about this are likely doing machine learning or statistical analysis, if they're not already using a GPU.
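A rough sketch of that distinction, in plain Python rather than real intrinsics. The function names and the `LANES` constant are assumptions for illustration (8 lanes is simply 512 bits divided by a 64-bit double). The first kernel has independent elements, so it can be processed a group at a time; the second has a loop-carried dependency, so splitting it into lanes doesn't work without reformulating the math.

```python
LANES = 8  # assumption: 512-bit register / 64-bit double = 8 lanes


def axpy_scalar(a, xs, ys):
    """Reference version: a * x + y, one element at a time."""
    return [a * x + y for x, y in zip(xs, ys)]


def axpy_lanes(a, xs, ys, lanes=LANES):
    """Same result, computed in groups of `lanes` elements.

    Each output depends only on its own inputs, so every group could in
    principle be handled by one vector multiply-add.
    """
    out = []
    for i in range(0, len(xs), lanes):
        out.extend(a * x + y for x, y in zip(xs[i:i + lanes], ys[i:i + lanes]))
    return out


def recurrence(a, bs, x0=0.0):
    """x[i] = a * x[i-1] + b[i]: every step needs the previous result,
    so the elements cannot simply be handed out to separate lanes."""
    out, x = [], x0
    for b in bs:
        x = a * x + b
        out.append(x)
    return out


if __name__ == "__main__":
    xs = [float(i) for i in range(20)]
    ys = [1.0] * 20
    print(axpy_lanes(2.0, xs, ys) == axpy_scalar(2.0, xs, ys))
    print(recurrence(2.0, [1.0, 1.0, 1.0]))
```

Workloads that look like the first kernel are the ones that stand to gain from wider vectors; workloads that look like the second mostly don't.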

AVX is cute because...
Still trying to catch up to GPU floating point performance, huh? :rolleyes:
 
I agree.


On a side note, the more cores consumers have available, the more likely it is for software to start taking advantage of them. We are on the cusp of software taking advantage of them, but not there yet, and it will take time. It's not like anything close to done in the pipeline can add that without pushing back its release. Software that can easily add it isn't close to being readily available either. Considering how long adoption is going to take... I mean, look at Steam, where 2 cores and 4 cores rule... by far, and quads have been out for 10 years... it's going to take a few years yet.
You people need to stop equating threads and core count. Just take a look at Task Manager and see how many (hundreds of) threads run on your current CPU.
It's only when all those threads need to be active and crunching at once that more hardware cores are needed. That's easy to do on a server, where the workload is inherently parallel, but on the desktop the usage pattern is quite different. On the desktop, the slowest element is often between the screen and the chair, and all those threads have nothing to do but wait for input.
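A minimal sketch of that point using Python's standard `threading` module. The helper name `spawn_idle_threads` and the thread count are made up for the example. It starts far more threads than a typical desktop has cores, and since all of them just block waiting for an event (standing in for user input), they sit there alive without needing a core each.

```python
import os
import threading


def spawn_idle_threads(n: int, wake: threading.Event) -> list:
    """Start `n` threads that block until `wake` is set.

    A blocked thread is alive but not runnable, so it uses no CPU time
    while it waits. Daemon threads are used so a stuck waiter can never
    keep the process from exiting.
    """
    threads = [threading.Thread(target=wake.wait, daemon=True) for _ in range(n)]
    for t in threads:
        t.start()
    return threads


if __name__ == "__main__":
    wake = threading.Event()
    threads = spawn_idle_threads(200, wake)
    print("logical CPUs:", os.cpu_count())
    print("live threads:", threading.active_count())
    wake.set()              # the "input" arrives; every waiter returns
    for t in threads:
        t.join()
```

Extra cores only start to matter once many of those threads want to run at the same moment.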

That being said, I will keep buying the best CPU for the job today, "futureproofing" be damned. Imagine how I'd kick myself in the nuts now if I bought the first 8-threads CPU that came out, hoping to put it to good use when software actually took advantage of 8 cores.

Also, it goes without saying that certain tasks (rendering, multimedia processing) can and do use more threads than we currently have on the desktop. If you happen to fall into that category, then yes, get as many cores as you can.
 
You people need to stop equating threads and core count. Just take a look at Task Manager and see how many (hundreds of) threads run on your current CPU.
Not sure that was ever a question here... thanks though?
 
Also going to add: emulation. Emulating more complex game consoles depends on heavy FPU calculations. PCSX2 got a nice boost from AVX.

Things like AVX-512 will probably be a near requirement for x360/PS3 emulation, if we ever get there.

Transistor level emulation is the only true emulation. :D
 
Not sure that was ever a question here... thanks though?
I was just trying to add to your post, not answer a question. It's the forum's fault ;)
 
It's plenty valid. If the user doesn't need AVX-512 at this level, there is no point in waiting... especially given when CL is supposed to drop... ;)

Also, so few people NEED 8 cores.. 4/6 is plenty and will be for years to come. Waiting for more cores is less of a valid and rational reason to me.

I need my 16 core chip to be a 22 core one. Some of us exist.
 