Thursday, June 15th 2017

Intel Announces New Mesh Interconnect For Xeon Scalable, Skylake-X Processors

Intel's "Xeon Scalable" lineup is designed to compete directly with AMD's Naples platform. Naples, a core-laden, high-performance server platform that relies heavily on linking multiple core complexes together via AMD's own HyperTransport-derived Infinity Fabric interconnect, has given Intel some challenges in terms of how to structure its own high-core-count family of devices. This has led to a new mesh-based interconnect technology from Intel.
When linking multiple core complexes together, Intel has traditionally relied on its QPI (QuickPath Interconnect), but it has its own limitations. For starters, QPI is a point-to-point technology, and it is inherently unsuitable for linking cores to cores in a mesh topology, as is needed when random cores want to address each other. To work, you'd need to create a QPI link from every core to every other core on the chip, and that would be a waste of resources. Historically, to fill that gap, Intel has used a "ring bus." A ring bus functions similarly to a token ring network, if you want to think of it in simple "old networking" terms. Basically, when data is transmitted, it must be passed around the bus like a token going from one speaker to the next. Each clock cycle, the bus can shift the data one way or another, but if you want to go, for example, from one core to the farthest one possible, latency suffers. In other words: this works fine for small dies, but as Intel seeks to create some true monsters, it's no longer enough.

There are some caveats to the above explanation (Intel's ring bus can have more than one "token" at a time in the analogy, and it's bidirectional, meaning data can travel both ways), but it still doesn't change the fact that Intel is feeling this design's limits.
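To put rough numbers on the ring bus latency argument, here is a minimal sketch. It assumes one hop per clock cycle and no competing traffic, which is a simplification and not a description of Intel's actual implementation; the function name `ring_worst_case_hops` is made up for illustration.

```python
def ring_worst_case_hops(stops: int, bidirectional: bool = True) -> int:
    """Largest number of hops between any two stops on a ring (idealized model)."""
    if stops < 2:
        return 0
    # One-way ring: the farthest stop is the one just "behind" the sender.
    # Two-way ring: data takes the shorter direction, so at most halfway around.
    return stops // 2 if bidirectional else stops - 1


for stops in (4, 9, 16, 28):
    one_way = ring_worst_case_hops(stops, bidirectional=False)
    two_way = ring_worst_case_hops(stops, bidirectional=True)
    print(f"{stops} stops: one-way worst case {one_way}, two-way worst case {two_way}")
```

Either way, the worst case grows linearly with the number of stops on the ring, which is why the approach gets less attractive as core counts climb.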

Enter the Mesh Interconnect:
Technically, this new Mesh Interconnect tech from Intel was introduced with the Xeon Phi Knights Landing based products, but those are very exclusive, high-end parts, even for us mere enthusiast mortals. This launch represents the technology trickling down to a more mainstream market. To explain the tech, you simply have to forget everything you know about the ring bus chips used to use, and realize that now, in the simplest chip implementations, each core has a direct "phone line" to every other core. It's a full mesh topology with all its benefits. In more complex arrangements, the "direct line to every core" arrangement of course becomes uneconomical again, since it would require an excessive amount of resources for any chip maker to implement on massive chips on the scale of Naples and Xeon Scalable. So instead, we go back to a simpler mesh topology: rows and columns.
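A quick back-of-the-envelope sketch shows why the "direct line to every core" idea stops scaling. The grid sizes below are arbitrary examples, not real Intel die layouts, and the helper names are hypothetical.

```python
def full_mesh_links(cores: int) -> int:
    """Point-to-point links needed so every core reaches every other core directly."""
    return cores * (cores - 1) // 2


def grid_mesh_links(cols: int, rows: int) -> int:
    """Links in a rows-by-columns grid where each core only links to its neighbours."""
    horizontal = (cols - 1) * rows
    vertical = (rows - 1) * cols
    return horizontal + vertical


for cols, rows in ((2, 2), (3, 3), (6, 6)):
    cores = cols * rows
    print(f"{cores} cores: full mesh {full_mesh_links(cores)} links, "
          f"grid {grid_mesh_links(cols, rows)} links")
```

The full mesh grows with the square of the core count, while the grid grows roughly in line with it.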

Take this graphic, from Xeon Phi, to get an idea of how it works:
Basically, it's a grid with a full XY routing system. This means that rather than running around a huge bus-like circle, data can go through a much more compact grid topology and save time. In a simple 3x3 grid example, whereas it would take a total of 8 clock cycles on an equivalent one-directional ring bus for communications to reach the farthest core, the most it can take on a Mesh Interconnect based chip is 4. (-X-X+Y+Y is how the worst-case routing would look, if you've been following this in that level of detail).
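For those following at that level of detail, the 3x3 worst case can be reproduced with a small sketch of XY (dimension-ordered) routing. The coordinate convention and function names are assumptions for illustration, and one hop is treated as one clock cycle with no contention.

```python
def xy_route(src, dst):
    """Dimension-ordered routing: finish all X moves first, then all Y moves."""
    (sx, sy), (dx, dy) = src, dst
    x_step = "+X" if dx > sx else "-X"
    y_step = "+Y" if dy > sy else "-Y"
    return [x_step] * abs(dx - sx) + [y_step] * abs(dy - sy)


def mesh_worst_case_hops(cols: int, rows: int) -> int:
    """Corner-to-corner distance on a cols-by-rows grid."""
    return (cols - 1) + (rows - 1)


# Opposite corners of a 3x3 grid, using (x, y) coordinates.
route = xy_route((2, 0), (0, 2))
print("".join(route), len(route))   # -X-X+Y+Y 4
print(mesh_worst_case_hops(3, 3))   # 4
```

Because the X leg is always completed before the Y leg, every source and destination pair has exactly one path, which keeps the routing logic simple.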

If a lot of this goes over your head, don't worry, it's pretty technical, low-level stuff. The bottom line is that Intel has an interconnect that can roughly compete with AMD's Infinity Fabric in its own right, and given Intel's larger core count on each die, AMD may be wise to keep a watchful eye on that fact.

The Mesh Interconnect is scheduled to debut in the much more "mainstream" (if you can call them that) markets of Xeon Scalable and Skylake-X, due to launch later this year.
Source: Tomshardware.com

24 Comments on Intel Announces New Mesh Interconnect For Xeon Scalable, Skylake-X Processors

#1
TheGuruStud
Intel is really going with the full on Chinese approach to business, eh? :D

Their decisions sure are interesting. They could have done this years ago, but didn't. Now, it's all, "See, we have this stuff, too, look at us."

It must only be a matter of time (new architecture) until they direct copy and start slapping together MCMs, since yields are killing them on these large core counts.
#2
cdawall
where the hell are my stars
TheGuruStud said:
Intel is really going with the full on Chinese approach to business, eh? :D

Their decisions sure are interesting. They could have done this years ago, but didn't. Now, it's all, "See, we have this stuff, too, look at us."
More like hey look at all this stuff we have been sitting on for 6 years while AMD played with an awful design.
#3
xkm1948
Inside Intel:

Boss: Shit, AMD's new processors may screw up our financial report. Any ideas?

PR: more advertising? Pay editors and tech sites to spread FUD for AMD?

Management: Cut some jobs?

Engineers: Give us some resources so we can build better stuff?


Boss: Full-on advertising mode, PR team; and management team, you can fire more engineers. Let's take some bad Xeons and call them our i9.
#4
Camm
Doesn't solve their yield problem, but will definitely help with their multi ring bus designs.

Wonder how much of the die is dedicated to cache coherency in this arrangement, then? As that's Intel's biggest killer of clocks in high-density parts.
#5
Chaitanya
One bad product release after another, now on to ripping off competitors.

#6
Hugh Mungus
Essentially they're making their cpu architecture better, but are now giving amd free bonus support and that means amd will now get even closer to intel! Also, unless intel has connected mesh speed to ram speed like amd, there is potentially less need for fast ram and low latency, possibly giving amd the edge in the memory department, which would give amd even more chance of catching up and cashing in!

WTF intel?!?!?!?!?!?!
#7
PowerPC
Intel, you are a mess.
#8
notb
Chaitanya said:
One bad product release after another, now on to ripping off competitors.
Ripping off?
Even in this article you'll find information that Intel has already used this tech in 2016.

You're such an Intel fanboy. Get a grip.
#9
PowerPC
notb said:
Ripping off?
Even in this article you'll find information that Intel has already used this tech in 2016.

You're such an Intel fanboy. Get a grip.
Intel fanboy rips on Intel... typo?

But they're obviously just parroting what AMD is doing at this point. You don't have to be a fanboy to see what's so obvious.
#10
Boosnie
This is not an industrial secret, it's a simple mesh interconnection.
They are simply doing what they should have done years ago but didn't because the money was flowing in anyway.
Now they are investing money to make their product better and more competitive (with a very obvious and cheap way to raise performance).

Let's all rejoice AMD hit good and hard with a great product.
#11
Ripper3
PowerPC said:
Intel fanboy rips on Intel... typo?

But they're obviously just parroting what AMD is doing at this point. You don't have to be a fanboy to see what's so obvious.
They've been using a mesh since the Xeon Phi Knights Landing, says so in this very article:
Technically, this new Mesh Interconnect tech from Intel was introduced with the Xeon Phi Knights Landing based products
If you're going to try something new, there's no better (worse) place to start than an expensive, targeted product used by the most exacting customers, on a highly complex core. If it works for a 72-core processor, it's probably okay on a 22-core or less processor.
#12
notb
PowerPC said:
Intel fanboy rips on Intel... typo?
No. @Chaitanya suggested that in a few months Intel designed a solution that took AMD a few years. Clearly, he must think of Intel as the god of electronics.
PowerPC said:

But they're obviously just parroting what AMD is doing at this point. You don't have to be a fanboy to see what's so obvious.
In marketing? Yes.
Mesh interconnect was already used in a 2016 Xeon Phi, but was merely a complicated technical feature, hidden deep in the architecture documentation. Now they've pushed it into the spotlight - exactly what AMD did with Infinity Matrix.

Personally, I'm not a huge fan of what's happening. I don't like the whole Ryzen marketing campaign (flashy, gaming-themed) and I really hope Intel will keep their cool, business-oriented image. But I guess this CPU segment is meant for geeks mostly (like people on TPU :) ) and they need to push some technical stuff to the surface. Enough to generate topics like this one, but not very technical, so that you don't have to be a physicist or electrical engineer to participate in a discussion.

To begin with, I like the fact that Intel kept the original name (mesh interconnect) from Xeons and didn't replace it with something more "sci-fi". Honestly, "Infinity Matrix" sounds like a weapon from Marvel Universe.
Ripper3 said:

If you're going to try something new, there's no better (worse) place to start than an expensive, targeted product used by the most exacting customers, on a highly complex core. If it works for a 72-core processor, it's probably okay on a 22-core or less processor.
Exactly. They used it in enterprise-grade systems and had a year to improve it further. It should be relatively problem-free compared to Zen.
#13
Vasrass
Most important thing is missing:

When will it be available in Xeon/Skylake-X mobos and actually ship?

We don't need more paper launches and I doubt many here will buy Knights Landing for gaming.

If it was ready, Intel would have announced it now.

Second most important question:

Will this obsolete X299 platform before it is even out?
#14
theeldest
So if AMD's Infinity Fabric caused headaches for OSes and scheduling, how will this fare? At least with rings and CCXs it's easy to predict the penalty for going from Core-X to Core-Y. Here it'll be much more complicated, as it will be dependent on how much other traffic is going on between cores.

Additionally, the nodes near the 'middle' of the mesh will have much more traffic going through them than any of the 'edge' nodes. When this scales large enough (100+) that effect will somewhat dissipate, but for <20 cores it will probably be substantial.
#15
Captain_Tom
TheGuruStud said:
Intel is really going with the full on Chinese approach to business, eh? :D

Their decisions sure are interesting. They could have done this years ago, but didn't. Now, it's all, "See, we have this stuff, too, look at us."

It must only be a matter of time (new architecture) until they direct copy and start slapping together MCMs, since yields are killing them on these large core counts.
There is no way this is just "ready". MCMs are so economical that Intel would have used them a DECADE ago if they could.


My guess is they started working on this tech a few years ago when they realized they would need it, and they still won't have it perfected to the level AMD has it for another 2 years at least.
#16
TheLaughingMan
theeldest said:
So if AMD's Infinity Fabric caused headaches for OSes and scheduling, how will this fare? At least with rings and CCXs it's easy to predict the penalty for going from Core-X to Core-Y. Here it'll be much more complicated, as it will be dependent on how much other traffic is going on between cores.

Additionally, the nodes near the 'middle' of the mesh will have much more traffic going through them than any of the 'edge' nodes. When this scales large enough (100+) that effect will somewhat dissipate, but for <20 cores it will probably be substantial.
It didn't cause any issues with the scheduler. That was 1 or 2 YouTubers who know nothing about OSes and schedulers blaming the first thing that came up in their "research." That is why there was never a patch from MS or AMD. AMD released a "patch" that was just a power management profile to adjust how Win10 handled lower power states.
#17
theeldest
TheLaughingMan said:
It didn't cause any issues with the scheduler. That was 1 or 2 YouTubers who know nothing about OSes and schedulers blaming the first thing that came up in their "research." That is why there was never a patch from MS or AMD. AMD released a "patch" that was just a power management profile to adjust how Win10 handled lower power states.
It required significant engineering effort to release the Windows and Linux patches that supported their NUMA topology. These patches were deployed long before Zen released (I think the Linux patches were in 2015?)

But support still needed to be engineered and added to the OS.
#18
notb
theeldest said:
So if AMD's Infinity Fabric caused headaches for OSes and scheduling, how will this fare?
Normally. It'll most likely "just work" - like Intel's stuff usually does, especially if it is first used in enterprise products.

Let's just repeat this once again, because it seems many of you didn't read the whole text (or previous comments).
You talk about this tech as if it were still being developed and a huge mystery. Mesh interconnect exists - Intel has already used it in their top CPUs. Now it's simply being implemented in cheaper models.
Captain_Tom said:

My guess is they started working on this tech a few years ago when they realized they would need it, and they still won't have it perfected to the level AMD has it for another 2 years at least.
Prepare to be surprised.
#19
TheLaughingMan
theeldest said:
It required significant engineering effort to release the Windows and Linux patches that supported their NUMA topology. These patches were deployed long before Zen released (I think the Linux patches were in 2015?)

But support still needed to be engineered and added to the OS.
Yeah, as far as I recall, NUMA support up to 64 cores was added to Windows 7/Server 2008 R2 long before Windows 10 was even a thing. <- that is why I said what I said. The "issue" Zen had was not related to the scheduler. That was never the problem.
#20
AlB80
Wow. An "ATI Radeon HD 2900 memory controller" picture used as Intel's new "Mesh Interconnect".
What's going on here?
#21
Prima.Vera
So basically it means that only 2 to 4 Cores can communicate at once?
#22
notb
Prima.Vera said:
So basically it means that only 2 to 4 Cores can communicate at once?
To be honest, it's always like that: only 2 cores can communicate at once. :-P
In a mesh topology each node (core) communicates only with its neighbours. This solution uses a rectangular grid model, so each core has 4 neighbours.
#23
close
TheGuruStud said:
Intel is really going with the full on Chinese approach to business, eh? :D

Their decisions sure are interesting. They could have done this years ago, but didn't. Now, it's all, "See, we have this stuff, too, look at us."

It must only be a matter of time (new architecture) until they direct copy and start slapping together MCMs, since yields are killing them on these large core counts.
To be fair, they did MCM before AMD, as a last-ditch attempt to beat AMD to market with a dual-core desktop CPU in the Pentium D days (Smithfield).
#24
djemergenc
There is an executive at Intel that has an undiagnosed condition somewhere...