As mentioned before, the research focus is far ahead, even of what the Intel product engineers are working on. The only way to be able to stuff that much performance into a small area (volume?) is by reducing the heat output considerably. One way is sure to reduce the voltage applied. Another way is of course to reduce the process size. Another possibility is to reduce the operating frequency and to use other power saving techniques.
A completely new area on which there has been not much research is the greater field of resilience of such processors.
If you have a processor made up of a huge number of very small identical units you can dynamically turn selected ones on and off to conserve power. Or you could produce one CPU with 100 cores and sell lower cost versions with a different number of enabled cores. This could go even further that in the future maybe you will buy a processor with 100 cores in it and pay only for the actual number of cores used.
During production some small parts of a wafer will yield defective CPUs because there are tiny impurities in the silicon. In a traditional CPU design this means that the core using that silicon will not work and has to be thrown away. Depending on the production process maturity this can be a two-digit percentage. With independent small cores, you could just disable the defective ones and still end up with a working processor. To be able to deliver the performance of the product you would need to plan in a number of spare CPUs that are usually unused. The spare CPUs could also be slowly used up during the life time of a processor. If even a single transistor in a CPU of today stops working, the whole CPU is useless.
As you can see there are lot of new possibilities imaginable with such new technologies and I am sure that we will see more once the first products are becoming ready.
Maybe you have already wondered "all that processing power is sure great, but how do we get all the input into the CPUs? It would be hundreds of Gbit/s+++".
Again we have the MHz problem. Just clocking any traditional external bus at THz does not work, so Intel is researching several methods for data transfer. The first is stacked memory, which means that right on top of the processor die you have another silicon die with memory. Since the distance between both endpoints is almost nothing you can run much higher clock speeds on such an interface. Also the shared area is large which allows a lot of parallel connections. The next approach is trying to increase the clock frequencies on a traditional copper interface, but going beyond 10 GHz seems to be extremely hard here. For huge bandwidth requirements fiber optics are one of the most promising solutions. Until recently the major problem was that creating laser light to feed those fibers was a complex and expensive process that does not ramp up with mass production very well. That's why you see fiber optic connections only in cases where it is cheaper than a traditional copper interlink. Intel is a pioneer in Silicon Laser development. A silicon laser is basically working with silicon instead of other expensive optical materials. Since processing silicon is an easy and well mastered process this should bring down the cost of laser components a lot and allow easy integration with CPUs, memory other any other silicon based device.