
Intel Raptor Lake Processor with 34 P-Cores Spotted

Games won't ever need more than 12 cores/threads, ever. Many game devs are still struggling with using more than 4 cores.

Even GTA V from 2014 can use 16 threads...
 
For me, the following scenario seems plausible for meaningful use of cores/threads in 99% of cases (1 tier down for some odd cases that make better use of multithreading), if in 2028 the Xbox Series X2/PS6 have double the cores/threads of the current gen and the PC client space sees a similar increase by then:

2034-2038: 16C/32T
2029-2033: 12C/24T
2025-2028: 8C/16T
2021-2024: 6C/12T
2017-2020: 4C/8T
2013-2016: 4C/4T
2009-2012: 2C/4T
2006-2008: 2C/2T
2003-2005: 1C/2T
2000-2002: 1C/1T
 
Even GTA V from 2014 can use 16 threads...
Yes, but 16 high performance threads? Not likely. Most games these days spawn tons of threads; typically 1-2 of those threads actually load your CPU noticeably, with 3-4 heavy ones becoming more common.
 
Yes, but 16 high performance threads? Not likely. Most games these days spawn tons of threads; typically 1-2 of those threads actually load your CPU noticeably, with 3-4 heavy ones becoming more common.

CPU usage while playing GTA V
Screenshot 2022-09-28 202617.png


From the bottom right to the left are the 4 E cores.
 
CPU usage while playing GTA V

From the bottom right to the left are the 4 E cores.
Look at the utilization, though. It's not actually making full use of those cores; one thread is switching back and forth across the hyper-threads. Windows 10 and 11 do this on Intel and AMD hardware and I don't know why, it's really strange. But anyway, I've tested it exhaustively, and GTA V will only actually benefit from 8 cores. Higher-core-count chips don't actually gain performance at all in GTA V.
 
Higher-core-count chips don't actually gain performance at all in GTA V.
Well, I noticed I no longer have a CPU bottleneck.... The i7 12700K keeps my 2070 Super at 97~100% load throughout the game.... At 1440p...
My i7 6700K @ 4.5 GHz was unable to do that...
 
Well, I noticed I no longer have a CPU bottleneck.... The i7 12700K keeps my 2070 Super at 97~100% load throughout the game.... At 1440p...
My i7 6700K @ 4.5 GHz was unable to do that...
Your 6700K has a 50% or more ST deficit even with the overclock.
 
Even GTA V from 2014 can use 16 threads...
Using and needing are two different things here. GTA V doesn't fully utilize 16 cores, or 8 cores/8 threads; it does not need that many. Besides, GTA V is one of very few games that use more than 6 cores.
 
CPU usage while playing GTA V

From the bottom right to the left are the 4 E cores.
So, I see two heavily loaded threads (core 2/thread 2, core 8/thread 14), one slightly lighter (core 6/thread 10), a bunch of variable medium loads, and a bunch of HT threads (which are much lower performance than both the main thread and an E core) loaded anywhere from 0% to ~50%. That seems to align pretty well with what I was saying.
 
There are some games that can already see a performance benefit from having more than 8 cores. To say that games will never need more than 12 cores is certain to be disproven.
see above, using and needing...
 
Well, I noticed I no longer have a CPU bottleneck.... The i7 12700K keeps my 2070 Super at 97~100% load throughout the game.... At 1440p...
My i7 6700K @ 4.5 GHz was unable to do that...
So you went from 4c8t to 12c20t, saw a performance increase, and see that as proof that games are using more than 3-4 high performance threads? Sorry, but that doesn't add up. That 6700K had 4 high performance threads, and it still had to run your OS and background applications. If a game was then also running 3-4 high performance threads, obviously there would be a bottleneck, especially if that same game spawned 12+ threads that needed shuffling around.
 
Whatever you guys are trying to say, I know now from my own experience that my current CPU is no longer a bottleneck....:)
 
Whatever you guys are trying to say, I know now from my own experience that my current CPU is no longer a bottleneck....:)
Oh, it absolutely isn't. But a fast 6c12t would likely not have been much of one either, nor an 8c16t.
 
If Intel were to put AVX512 on both their e & p cores we could see Xeon Phi levels of performance.
 
see above, using and needing...

At some point it's feasible that a 12 core will be needed to have a smooth gaming experience.

It really depends on your definition of need. One person can take the position that they are fine with 12 FPS and don't need more and that would be valid because what they need is entirely up to them. I'm going off more the general consensus of the market in what is needed to have an enjoyable gaming experience.

It would be beneficial for the gaming market in general if games started to lean on more cores. It's far easier to scale up performance by adding cores while also keeping costs and power consumption down.
 
So, I see two heavily loaded threads (core 2/thread 2, core 8/thread 14), one slightly lighter (core 6/thread 10), a bunch of variable medium loads, and a bunch of HT threads (which are much lower performance than both the main thread and an E core) loaded anywhere from 0% to ~50%. That seems to align pretty well with what I was saying.
What do you mean here? When two threads are running on the same core, they have the same performance. The OS cannot change that because a core has no concept of thread priority. Here is some discussion about that.
 
What do you mean here? When two threads are running on the same core, they have the same performance. The OS cannot change that because a core has no concept of thread priority. Here is some discussion about that.


Since late revisions of XP, Windows has been aware of shared core resources when dealing with HT-capable processors, and has scheduled threads to account for the available resource pool, using "physical" cores (checking the thread for stalls/hard pages) before assigning a thread to any other core.
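
To make the "physical cores first" idea concrete, here is a toy placement policy in Python. It is only a sketch of the general principle, not Windows' actual scheduler: the function name `pick_logical_cpu` and the list-of-siblings topology format are made up for illustration, and a real scheduler weighs far more than this (priorities, cache locality, parking, and so on).

```python
from typing import Optional


def pick_logical_cpu(cores: list[list[int]], busy: set[int]) -> Optional[int]:
    """Pick a logical CPU for a new thread, preferring fully idle physical cores.

    cores: one inner list per physical core, holding its logical CPU ids
           (two entries per core on an SMT/HT part). Hypothetical format.
    busy:  logical CPU ids that are already running a thread.
    """
    # First pass: a physical core where no sibling is busy, so the new
    # thread does not have to share execution resources with anything.
    for siblings in cores:
        if not any(cpu in busy for cpu in siblings):
            return siblings[0]
    # Second pass: every core has at least one busy sibling, so fall back
    # to any idle logical CPU and accept the shared-core slowdown.
    for siblings in cores:
        for cpu in siblings:
            if cpu not in busy:
                return cpu
    return None  # everything is busy; the thread has to wait


if __name__ == "__main__":
    topology = [[0, 1], [2, 3]]  # 2 cores, 2 hardware threads each
    print(pick_logical_cpu(topology, set()))
    print(pick_logical_cpu(topology, {0}))
    print(pick_logical_cpu(topology, {0, 2}))
```

With one thread already on logical CPU 0, the sketch places the next one on the second physical core rather than on CPU 0's sibling; the sibling only gets used once every physical core already has work.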

 
Intel can easily push many E-cores to gain back some of the HEDT segment, maybe for much cheaper by saving a lot of silicon space.
E-cores prove to do a good job in highly multitasking workloads.
I can see
8+32/48/64
10+24/32/40
16+16/24/32
Yep, Arrow Lake is reported as having up to 48 cores, 8P + 40E.
 
It would be beneficial for the gaming market in general if games started to lean on more cores. It's far easier to scale up performance by adding cores while also keeping costs and power consumption down.
I've been advocating this for years, even if there is a limit to what can be done with core counts within game development thus far.
 
What do you mean here? When two threads are running on the same core, they have the same performance. The OS cannot change that because a core has no concept of thread priority. Here is some discussion about that.
Sorry, but no. Only if they are being run sequentially, not through SMT. SMT uses "spare" resources not used by the main thread, with some interleaving, meaning that unless you have the perfect task for a HT thread, it has much lower performance potential as there is less execution hardware available to it.
 
Intel Outs First Xeon Scalable "Sapphire Rapids" Benchmarks, On-package Accelerators Help Catch Up with AMD EPYC

Wouldn't this be half of "Sapphire Rapids", or the chip after that?

Intel are stating 60 cores for Sapphire Rapids, so this would be 30 cores, with 4 possible defective cores?
Dual die for the 60c variant?

Or is this the 'stand-alone' up-to-34-core chip, with just the one compute die?
Given that the wafer is labeled Raptor Lake and not Sapphire Rapids, it would seem to be not quite the same - but it's likely very similar, yes. As I said above, they're likely test producing wafers of XCC dice of every core revision just to see how they perform for server/HPC purposes. AFAIK Sapphire Rapids uses a Golden Cove core with more cache, so it should be very similar to Raptor Lake, but it's not the same.
 
Sorry, but no. Only if they are being run sequentially, not through SMT. SMT uses "spare" resources not used by the main thread, with some interleaving, meaning that unless you have the perfect task for a HT thread, it has much lower performance potential as there is less execution hardware available to it.
Are you willing to do a benchmarking experiment on your 5800X? I can't; my i5-6600K has had HT disabled forever by its creator. You'd need to run two instances of a single-threaded benchmarking program such as Super Pi at the same time, one pinned (via affinity settings) to virtual CPU 0, the other to virtual CPU 1. Or any pair that belongs to the same core. What results do you get? Is one instance slower than the other?
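
If anyone wants to try this without Super Pi, something like the Python sketch below should do on Linux. Assumptions: it relies on `os.sched_setaffinity` and the sysfs file `thread_siblings_list`, both Linux-only; the busy loop is just a stand-in for a real benchmark; and the helper names (`parse_cpu_list`, `siblings_of`, `run_pair`) are mine, not from any tool. On Windows you'd set affinity through Task Manager or `start /affinity` instead.

```python
import os
import subprocess
import sys

# Worker script: pins itself to one logical CPU, runs a fixed integer loop
# (a stand-in for a single-threaded benchmark) and prints the elapsed seconds.
WORKER = """
import os, sys, time
cpu, n = int(sys.argv[1]), int(sys.argv[2])
os.sched_setaffinity(0, {cpu})  # Linux-only affinity call
start = time.perf_counter()
total = 0
for i in range(n):
    total += i * i % 7
print(time.perf_counter() - start)
"""


def parse_cpu_list(text: str) -> list[int]:
    """Parse a sysfs CPU list such as '0,8' or '0-1' into logical CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def siblings_of(cpu: int) -> list[int]:
    """Return the logical CPUs that share a physical core with `cpu` (Linux sysfs)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    with open(path) as f:
        return parse_cpu_list(f.read())


def run_pair(cpu_a: int, cpu_b: int, n: int = 5_000_000) -> list[float]:
    """Run two pinned workers at the same time; return their times in seconds."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER, str(cpu), str(n)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for cpu in (cpu_a, cpu_b)
    ]
    # Both workers are already running; collect each one's printed time.
    return [float(p.communicate()[0]) for p in procs]
```

On an SMT machine, `siblings_of(0)` should return two ids, and `run_pair(*siblings_of(0))` gives the two times to compare. Running the same pair on two different physical cores gives the baseline. Repeat each run a few times, since background load will add noise.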

I can't find any technical documents or discussion or benchmarking results that would confirm that there exists a "main thread" with higher priority and a "HT thread" with lower priority, so that the main thread would never be slowed down substantially, but the HT thread would "take whatever remains" of execution units and run very slowly. AnandTech has had some great articles on HT since 2002, with this one being the most recent, and there's no mention of the two threads being unequal.

The OS scheduler clearly knows the consequences of dispatching threads to fewer physical cores (preferring HT) vs. more physical cores (preferring no HT), but this is a different matter.
 
Since late revisions of XP, Windows has been aware of shared core resources when dealing with HT-capable processors, and has scheduled threads to account for the available resource pool, using "physical" cores (checking the thread for stalls/hard pages) before assigning a thread to any other core.

That's a great article, thanks. Also, what you said is true. (I just don't know how much Windows can check for stalls and hard page faults and such things. The scheduler has little time for a detailed analysis of execution, it must be fast.)

But here's what the article says: "When processes share a physical processor the sharing of resources, including the fetch and issue bandwidth, means that they both run slower than they would do if they had exclusive use of the processor." It would be relevant for the article if one ran much faster than the other, but there's no mention of that. The authors then observe the performance of pairs of processes, with pairs running on the same core, not individual processes.

So I still assume that two threads on the same core are executed with the same priority, at least on the x86 architecture. By contrast, IBM's POWER architecture supports thread priorities in hardware:
The processor allows priorities to be assigned to hardware threads. The difference in priority between sibling threads determines the ratio of physical processor decode slots allotted to each thread. More slots provide better thread performance.
 