
Intel Core i7-11700K "Rocket Lake" CPU Outperforms AMD Ryzen 9 5950X in Single-Core Tests

Joined
Nov 6, 2016
Messages
666 (0.42/day)
Location
NH, USA
System Name Lightbringer
Processor Ryzen 7 2700X
Motherboard Asus ROG Strix X470-F Gaming
Cooling Enermax Liqmax Iii 360mm AIO
Memory G.Skill Trident Z RGB 32GB (8GBx4) 3200Mhz CL 14
Video Card(s) Sapphire RX 5700XT Nitro+
Storage Hp EX950 2TB NVMe M.2, HP EX950 1TB NVMe M.2, Samsung 860 EVO 2TB
Display(s) LG 34BK95U-W 34" 5120 x 2160
Case Lian Li PC-O11 Dynamic (White)
Power Supply BeQuiet Straight Power 11 850w Gold Rated PSU
Mouse Glorious Model O (Matte White)
Keyboard Royal Kludge RK71
Software Windows 10
Here comes the Ryzen 5000 XT series on 7nm EUV (improved node)...willing to bet on it
 
Joined
Apr 24, 2020
Messages
739 (2.38/day)
I have no issue with the possibility of increasing the decoder width or even adding more execution ports. But I question how likely it is, if Cypress Cove is basically a backport of Sunny Cove, since these kinds of changes usually require a total overhaul of the cache, register files and everything on the front-end.

Wouldn't it be more likely to backport something from the execution side from e.g. Sapphire Rapids, or to simply add more execution units on existing execution ports? (like one extra MUL unit?)

I think ARM has an advantage on decoder width. That's the only weak point of the x86 ISA I can think of.

x86 requires a byte-by-byte decoder, because you have 2-byte, 3-byte, 4-byte... 15-byte instructions (some of which are macro-op fused and/or micro-op split). ARM standardized upon 4-byte instructions with an occasional 8-byte macro-op fused. That means if you want to perform 4-wide decoding (and assume an average of 4-bytes per instruction), you need 64-parallel decoders: one for every byte (byte0, byte1, byte2) of the cache line.

ARM, on the other hand, is always 4 bytes or 8 bytes at a time (in the case of macro-op fused operations). That means for a 64-byte decoder, ARM only needs 16 parallel decoders, knowing there are no 2-byte or 3-byte instructions that could be "in-between". Just hypothetically speaking of course, I dunno really how these things are organized.
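
To put rough numbers on the 64 vs. 16 comparison, here is a toy Python sketch. It is only an illustration under simple assumptions (an instruction may start at any byte for the variable-length case, and only at multiples of 4 for the fixed-width case); the function names are made up for this example and it does not describe how any real decoder is organized.

```python
# Toy model: where can an instruction start inside one 64-byte fetch window?
# Assumption (illustrative only): variable-length instructions may start at
# any byte, while a fixed-width 4-byte ISA only allows starts at multiples of 4.

LINE_BYTES = 64


def candidate_starts(line_bytes, alignment):
    """Return every offset at which an instruction could begin."""
    return list(range(0, line_bytes, alignment))


def find_boundaries(lengths, line_bytes=LINE_BYTES):
    """Walk a sequence of instruction lengths and return the start offsets.

    Each start depends on the length of the previous instruction, which is
    the serial dependency a variable-length decoder has to work around.
    """
    starts = []
    offset = 0
    for length in lengths:
        if offset + length > line_bytes:
            break  # instruction would spill past this fetch window
        starts.append(offset)
        offset += length
    return starts


variable_len = candidate_starts(LINE_BYTES, 1)  # any byte may be a start
fixed_len = candidate_starts(LINE_BYTES, 4)     # only 4-byte-aligned starts

print(len(variable_len), len(fixed_len))
print(find_boundaries([2, 3, 4, 15, 1, 7]))  # variable-length stream
print(find_boundaries([4] * 16))             # fixed-length stream
```

In the fixed-width case the start offsets are known before looking at a single byte, while in the variable-length case each start is only known once the previous instruction's length has been worked out.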

Anyway: Apple M1 shot a broadside at the x86 camp with their 8-wide decoder. I do think it's relevant to bring up. However, ARM Neoverse is still only 4-wide decoding. It hasn't really been proven yet that an ultra-wide decoder (like Apple's M1) is really the best path forward.
 
Joined
Sep 1, 2020
Messages
272 (1.52/day)
Location
Bulgaria
LoL, I see the claim is that x86 would be OK with a better decoder. But it isn't possible to make a better decoder, because of dependencies in how the ISA works with its information. That is the same as saying the x86 ISA itself is not OK.
 
Joined
Mar 17, 2011
Messages
113 (0.03/day)
Location
Christchurch, New Zealand
tbh the “next big thing” should be Alder Lake, not Rocket Lake.


Let’s be honest: it was almost a paper launch with skyrocketing prices for Zen 3.
I don’t know about New Zealand, but here in Europe it is very difficult to find one at a decent price.

Well, what's a decent price? This OK? Considering that you guys suffer more with higher taxes on stuff, it's probably a decent price.
 
Joined
Jul 26, 2019
Messages
411 (0.70/day)
Processor R5 5600X
Motherboard Asus TUF Gaming X570-Plus
Memory 32 GB 3600 MT/s CL16
Video Card(s) Sapphire Vega 64
Storage 2x 500 GB SSD, 2x 3 TB HDD
Case Phanteks P300A
Software Manjaro Linux, W10 if I have to
Well... they hit 5Ghz on 14nm++++++ which we know they can do quite easily. Let's see if Alder Lake hits 5Ghz at 10nm.
Oh, silly me for thinking Intel was finally on a new process lmao
 
Joined
Mar 7, 2010
Messages
774 (0.19/day)
Location
Michigan
System Name Daves
Processor AMD Ryzen 3900x
Motherboard AsRock X570 Taichi
Cooling Enermax LIQMAX III 360
Memory 32 GiG Team Group B Die 3600
Video Card(s) Powercolor 5700 xt Red Devil
Storage Crucial MX 500 SSD and Intel P660 NVME 2TB for games
Display(s) Acer 144htz 27in. 2560x1440
Case Phanteks P600S
Audio Device(s) N/A
Power Supply Corsair RM 750
Mouse EVGA
Keyboard Corsair Strafe
Software Windows 10 Pro
Got to love that headline..:shadedshu:
 
Joined
May 10, 2020
Messages
410 (1.39/day)
Processor Ryzen 7 5800X
Motherboard Asus ROG Strix B550-F Gaming
Cooling Noctua NH-D15
Memory 32 Gb G.Skill TridentZ RGB 3600CL16
Video Card(s) Zotac RTX 3080 AMP Holo
Storage 1 Tb Samsung 970 EVO NVMe + 1 Tb SSD WD Blue
Display(s) MSi Optix G27 + Samsung C24RG50
Case Corsair 5000D Airflow
Power Supply EVGA G3 750W
Mouse Razer Basilisk
Keyboard Razer Ornata Chroma
Benchmark Scores 3dMark TimeSpy - 16483(CPU 12076 - GPU 17618) Cinebench R20 - 621/6081
Well, what's a decent price? This OK? Considering that you guys suffer more with higher taxes on stuff, it's probably a decent price.
It is one of the best prices in Europe... if you consider the UK part of Europe (which it is not...).
By the way that website is one of the best, but for some items they don’t ship outside UK
 
Joined
Mar 17, 2011
Messages
113 (0.03/day)
Location
Christchurch, New Zealand
It is one of the best prices in Europe... if you consider the UK part of Europe (which it is not...).
By the way that website is one of the best, but for some items they don’t ship outside UK

The UK is not part of the European Union. It will always be a part of Europe by dint of its geographical location. In any case, the offer is not available to you if you're not a UK resident. But if you knew someone in the UK that might do you a favour....
 
Joined
May 10, 2020
Messages
410 (1.39/day)
Processor Ryzen 7 5800X
Motherboard Asus ROG Strix B550-F Gaming
Cooling Noctua NH-D15
Memory 32 Gb G.Skill TridentZ RGB 3600CL16
Video Card(s) Zotac RTX 3080 AMP Holo
Storage 1 Tb Samsung 970 EVO NVMe + 1 Tb SSD WD Blue
Display(s) MSi Optix G27 + Samsung C24RG50
Case Corsair 5000D Airflow
Power Supply EVGA G3 750W
Mouse Razer Basilisk
Keyboard Razer Ornata Chroma
Benchmark Scores 3dMark TimeSpy - 16483(CPU 12076 - GPU 17618) Cinebench R20 - 621/6081
The UK is not part of the European Union. It will always be a part of Europe by dint of its geographical location. In any case, the offer is not available to you if you're not a UK resident. But if you knew someone in the UK that might do you a favour....
yep... I was speaking about not being in Europe in a commercial way, not geographically :D
 
Joined
Dec 22, 2011
Messages
3,435 (1.02/day)
System Name I'm sorry Dave, I'm afraid I can't do that.
Processor AMD Ryzen 7 3700X
Motherboard MSI MAG B550 TOMAHAWK
Cooling AMD Wraith Prism
Memory Team Group Dark Pro 8Pack Edition 3600Mhz CL16
Video Card(s) Palit GTX 980 Ti Super JetStream
Storage Kingston A2000 1TB + Seagate HDD workhorse
Display(s) Crossover 27Q 27" 2560x1440 + Hisense 43" 4K
Case Antec 1200
Audio Device(s) Don't be silly
Power Supply XFX 650W Core
Mouse Razer Deathadder Chroma
Keyboard Logitech UltraX
Software Windows 10
Benchmark Scores Epic
Here comes the Ryzen 5000 XT series on 7nm EUV (improved node)...willing to bet on it

Oh God, let's hope not: 2% more performance for way more money, and they don't include a cooler as a bonus!
 
Joined
Jan 27, 2006
Messages
1,022 (0.19/day)
tbh the “next big thing” should be Alder Lake, not Rocket Lake.
You missed the "since 2015" part. Intel have been using the same Skylake design all these years, and Rocket Lake is the departure from that. Alder Lake may be better, but it's the next big thing after RL.
 
Joined
Mar 10, 2010
Messages
8,676 (2.17/day)
Location
Manchester uk
System Name RyzenGtEvo/ Asus strix scar II
Processor Amd R7 3800X@4.350/525/ Intel 8750H
Motherboard Crosshair hero7 @bios 2703/?
Cooling 360EK extreme rad+ 360$EK slim all push, cpu Monoblock Gpu full cover all EK
Memory Corsair Vengeance Rgb pro 3600cas14 16Gb in two sticks./16Gb
Video Card(s) Sapphire refference Rx vega 64 EK waterblocked/Rtx 2060
Storage Silicon power qlc nvmex3 in raid 0/8Tb external/1Tb samsung Evo nvme 2Tb sata ssd
Display(s) Samsung UAE28"850R 4k freesync, LG 49" 4K 60hz ,Oculus
Case Lianli p0-11 dynamic
Audio Device(s) Xfi creative 7.1 on board ,Yamaha dts av setup, corsair void pro headset
Power Supply corsair 1200Hxi
Mouse Roccat Kova/ Logitech G wireless
Keyboard Roccat Aimo 120
Software Win 10 Pro
Benchmark Scores 8726 vega 3dmark timespy/ laptop Timespy 6506
yep... I was speaking about not being in Europe in a commercial way, not geographically :D
Yep, well, 6800s, most GeForces, and in fact most GPUs are scarce in the UK; an RX 580 still sells for £220 new.
Plenty of Intel CPUs, but few others are in stock, and favourite ones like the 5900X aren't about anymore.
Oh to be rich.
 
Joined
Apr 24, 2020
Messages
739 (2.38/day)
Yep, well, 6800s, most GeForces, and in fact most GPUs are scarce in the UK; an RX 580 still sells for £220 new.
Plenty of Intel CPUs, but few others are in stock, and favourite ones like the 5900X aren't about anymore.
Oh to be rich.

AMD is supply constrained. They were only planning to reach 10% or 15% marketshare around now, and didn't expect that their chips would be such a hit. If AMD produced too many chips, they could risk bankruptcy as well as damage to their brand.

The Xilinx purchase probably helps, since it will give them a stable source of revenue, allowing them to play a bit more aggressively in the months and years ahead.
 
Joined
Jan 8, 2017
Messages
6,476 (4.29/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Deepcool Gammaxx L240 V2
Memory 16GB - Corsair Vengeance LPX - 3333 Mhz CL16
Video Card(s) OEM Dell GTX 1080 with Kraken G12 + Water 3.0 Performer C
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Deepcool Matrexx 70
Power Supply GPS-750C
I think ARM has an advantage on decoder width. That's the only weak point of the x86 ISA I can think of.

x86 requires a byte-by-byte decoder, because you have 2-byte, 3-byte, 4-byte... 15-byte instructions (some of which are macro-op fused and/or micro-op split). ARM standardized upon 4-byte instructions with an occasional 8-byte macro-op fused. That means if you want to perform 4-wide decoding (and assume an average of 4-bytes per instruction), you need 64-parallel decoders: one for every byte (byte0, byte1, byte2) of the cache line.

ARM, on the other hand, is always 4 bytes or 8 bytes at a time (in the case of macro-op fused operations). That means for a 64-byte decoder, ARM only needs 16 parallel decoders, knowing there are no 2-byte or 3-byte instructions that could be "in-between". Just hypothetically speaking of course, I dunno really how these things are organized.

Anyway: Apple M1 shot a broadside at the x86 camp with their 8-wide decoder. I do think it's relevant to bring up. However, ARM Neoverse is still only 4-wide decoding. It hasn't really been proven yet that an ultra-wide decoder (like Apple's M1) is really the best path forward.

Regardless, I do not think x86 designs are limited in any way by current decode width. Actually, they clearly can't be, since the back end on these CPUs keeps getting wider and wider and there doesn't seem to be any problem feeding all of those execution ports. And after all, x86 is still more compact when it comes down to how much decode is necessary to get the same amount of work done.

Apple's obsession with an ultra-wide front end (and ultra-wide everything, really) seems rather wasteful. There is no obvious reason why that's actually required at the moment; I bet you everything that with half the decode stage the performance regression would be marginal.

I mean, most of the performance that's worth extracting through ILP sits in loops, and those don't put pressure on the decode stage because you'll keep hitting the instruction cache anyway, which is colossal on something like M1. Actually, the more I think about it, the more absurd Apple's design choices appear to me.
 
Joined
Apr 24, 2020
Messages
739 (2.38/day)
I mean, most of the performance that's worth extracting through ILP sits in loops, and those don't put pressure on the decode stage because you'll keep hitting the instruction cache anyway, which is colossal on something like M1. Actually, the more I think about it, the more absurd Apple's design choices appear to me.

Well, the transistors spent on the uOp cache could be used for more registers, or L1 cache instead. So I'm not sure the existence of the uOp cache is favorable to your argument.
 
Joined
Jan 8, 2017
Messages
6,476 (4.29/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Deepcool Gammaxx L240 V2
Memory 16GB - Corsair Vengeance LPX - 3333 Mhz CL16
Video Card(s) OEM Dell GTX 1080 with Kraken G12 + Water 3.0 Performer C
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Deepcool Matrexx 70
Power Supply GPS-750C
Well, the transistors spent on the uOp cache could be used for more registers, or L1 cache instead. So I'm not sure the existence of the uOp cache is favorable to your argument.

It's really, really unlikely for a real-world single program to have its instructions pushed out of the micro-op cache, or for there to be some kind of weird uop count that makes the whole mechanism inefficient. I am not saying it doesn't happen, but the micro-op cache is typically the least of your worries.
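
A rough way to picture that point is a toy model, sketched below in Python. It assumes a loop body that fits in the micro-op cache is decoded once and then replayed, and a body that does not fit is decoded every iteration. The function name and the capacity figure are hypothetical, picked only for illustration; real replacement policies and cache geometry are far messier.

```python
def decoded_fraction(body_uops, iterations, cache_capacity_uops):
    """Fraction of executed uops that must go through the decoders.

    Toy assumption: a loop body that fits in the uop cache is decoded once
    and then replayed from the cache; a body that does not fit is decoded
    on every iteration.
    """
    if body_uops <= 0 or iterations <= 0:
        raise ValueError("body_uops and iterations must be positive")
    if body_uops <= cache_capacity_uops:
        return 1 / iterations
    return 1.0


# Hypothetical capacity, chosen only to make the example concrete.
CAPACITY = 1500

print(decoded_fraction(100, 1000, CAPACITY))   # small hot loop
print(decoded_fraction(5000, 1000, CAPACITY))  # body larger than the cache
```

Under those assumptions a small hot loop barely touches the decoders after its first pass, which is the case being argued here; only a body larger than the cache keeps them busy.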
 
Joined
Apr 24, 2020
Messages
739 (2.38/day)
It's really, really unlikely for a real-world single program to have its instructions pushed out of the micro-op cache. I am not saying it doesn't happen, but the micro-op cache is typically the least of your worries.

Oh, what I'm saying is that the M1 has an advantage, because its decoder is 8-wide, while Intel / AMD has a disadvantage, because their uOp caches are only 6-wide. Instead of having a uOp cache, the M1 spent its transistors on more L1 cache and a larger register-file.

The M1 can fit 192kB into its L1 i-cache, which performs a bit faster than the Intel/AMD uOp cache thanks to 8-way decoding. Intel / AMD only have 48kB i-L1 (for Rocket Lake) or 32kB i-L1 (AMD Zen 3), and smaller than that for its uOp cache.

-------

EDIT: I should say "Seems to have an advantage". It's not very clear if Apple's big decoder is a good strategy yet IMO. But it's interesting, and worth keeping an eye on. Especially because it seems like an area that may be harder to implement in x86.
 
Joined
Jan 8, 2017
Messages
6,476 (4.29/day)
System Name Good enough
Processor AMD Ryzen R7 1700X - 4.0 Ghz / 1.350V
Motherboard ASRock B450M Pro4
Cooling Deepcool Gammaxx L240 V2
Memory 16GB - Corsair Vengeance LPX - 3333 Mhz CL16
Video Card(s) OEM Dell GTX 1080 with Kraken G12 + Water 3.0 Performer C
Storage 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) 4K Samsung TV
Case Deepcool Matrexx 70
Power Supply GPS-750C
Oh, what I'm saying is that the M1 has an advantage, because its decoder is 8-wide, while Intel / AMD has a disadvantage, because their uOp caches are only 6-wide. Instead of having a uOp cache, the M1 spent its transistors on more L1 cache and a larger register-file.

I really doubt that a micro-op cache is that expensive a mechanism to add; it's definitely not comparable to the size and power of a larger L1 I-cache. They probably didn't add one because it just wasn't required. You're really gonna tell me that Apple is that conscious about their transistor budget? :) And after all, a lot of CPUs out there don't have one either; it's a pretty recent addition.

What I am also saying is that I haven't actually seen any evidence that such a wide decoder is actually worth it. Wide decode means a lot of delay in the circuitry, which means poor clock speed scaling.
 
Joined
Nov 4, 2005
Messages
10,722 (1.92/day)
System Name MoFo 2
Processor AMD PhenomII 1100T @ 4.2Ghz
Motherboard Asus Crosshair IV
Cooling Swiftec 655 pump, Apogee GT,, MCR360mm Rad, 1/2 loop.
Memory 8GB DDR3-2133 @ 1900 8.9.9.24 1T
Video Card(s) HD7970 1250/1750
Storage Agility 3 SSD 6TB RAID 0 on RAID Card
Display(s) 46" 1080P Toshiba LCD
Case Rosewill R6A34-BK modded (thanks to MKmods)
Audio Device(s) ATI HDMI
Power Supply 750W PC Power & Cooling modded (thanks to MKmods)
Software A lot.
Benchmark Scores Its fast. Enough.
Oh, what I'm saying is that the M1 has an advantage, because its decoder is 8-wide, while Intel / AMD has a disadvantage, because their uOp caches are only 6-wide. Instead of having a uOp cache, the M1 spent its transistors on more L1 cache and a larger register-file.

The M1 can fit 192kB into its L1 i-cache, which performs a bit faster than the Intel/AMD uOp cache thanks to 8-way decoding. Intel / AMD only have 48kB i-L1 (for Rocket Lake) or 32kB i-L1 (AMD Zen 3), and smaller than that for its uOp cache.

-------

EDIT: I should say "Seems to have an advantage". It's not very clear if Apple's big decoder is a good strategy yet IMO. But it's interesting, and worth keeping an eye on. Especially because it seems like an area that may be harder to implement in x86.


And there's the fact that ARM is essentially hardware-based accelerators taped together with good power gating on the most advanced nodes. x86-64 has the advantage that if you want to do X in the future, software and brute force will do it. With ARM designs... you need to buy a whole new device.

What happens when 8K or whatever is next becomes a thing? Apple products become obsolete and cheap, which is why used iPad Pros get sold dirt cheap. A 2-3 year old one is now $270 vs. the initial price of $1k. That's almost as bad as other ASIC hardware like GPUs, but $1k of x86 hardware will retain its value longer, and you can upgrade RAM and GPUs, increase storage, and it just works.
 
Joined
Apr 24, 2020
Messages
739 (2.38/day)
I really doubt that a micro-op cache is that expensive a mechanism to add; it's definitely not comparable to the size and power of a larger L1 I-cache. They probably didn't add one because it just wasn't required. You're really gonna tell me that Apple is that conscious about their transistor budget? :) And after all, a lot of CPUs out there don't have one either; it's a pretty recent addition.

What I am also saying is that I haven't actually seen any evidence that such a wide decoder is actually worth it. Wide decode means a lot of delay in the circuitry, which means poor clock speed scaling.

I guess I feel like the uop cache in x86 (both Skylake and Zen3) exists because of the decode width problem. In performance-critical sections, Skylake / Zen3 go from 4 uops/tick (from the decoder) to 6 uops/tick (from the uop cache). In effect, it's a way for x86 to reach higher uops/tick... but only in select areas of code (the areas that fit inside a uop cache).

Apple has a superior decoder: just 8 uops/tick no matter what. It's the "more expensive transistor budget" option compared to a uop cache. Apple can achieve 8 uops/tick across the entire 192kB L1 instruction cache, while Intel Skylake / AMD Zen3 can only achieve 4 uops/tick across a 48kB L1 (Skylake) / 32kB L1 (Zen3) cache, and 6 uops/tick across a smaller region inside of the uOp cache.
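
As back-of-the-envelope arithmetic on those rates, here is a small Python sketch. The 4 / 6 / 8 uops-per-tick figures are just the ones quoted in this post, not measurements, and the helper names are invented for this example. It ignores everything except peak front-end delivery, so treat it as a rough picture rather than a performance model.

```python
import math

# Peak front-end rates as quoted in the post (uops per tick); not measured.
RATES = {
    "x86 decoder": 4,
    "x86 uop cache": 6,
    "M1 decoder": 8,
}


def ticks_to_deliver(uops, rate):
    """Ticks needed to deliver `uops` at a sustained `rate` per tick."""
    return math.ceil(uops / rate)


def blended_rate(hit_fraction, cache_rate=6, decode_rate=4):
    """Average uops/tick when `hit_fraction` of uops come from the uop cache.

    The mix is time-weighted: each uop costs 1/rate ticks on its path.
    """
    ticks_per_uop = hit_fraction / cache_rate + (1 - hit_fraction) / decode_rate
    return 1 / ticks_per_uop


for name, rate in RATES.items():
    print(name, ticks_to_deliver(1200, rate))

print(blended_rate(0.5))  # half the uops served from the uop cache
```

Even with every uop coming out of the uop cache, the blended rate in this model tops out at 6 uops/tick, which is the gap to a flat 8 uops/tick that I'm pointing at.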
 
Joined
Nov 7, 2016
Messages
98 (0.06/day)
Processor i7 6800K
Motherboard X99 Pro Carbon
Cooling Silver Stone TD02-E, 3 Phanteks 140mm fans
Memory Trident Z 3200C16D-16G
Video Card(s) 1080 G1
Storage 960 Evo 500GB, 850 Evo 500GB, 3TB Western Digital Black
Display(s) Dell 2713H
Case in Win 509
Power Supply HX750i
Mouse Razer DeathAdder Chroma
Keyboard Logitech G710+
Software W10
There goes AMD's brief lead in gaming. :roll:

But it was never a real lead since the Ryzen 5000 launch was a paper launch.

The 5950X has been sitting on my desk for about a month now, as I have been awaiting the shipment of the Dark Hero. When I placed the order, I saw that a lot of high-end motherboards had sold out, and a few gaming monitors were sold out as well.
 
Joined
Jan 27, 2015
Messages
581 (0.26/day)
System Name Legion
Processor i5-10400
Motherboard Asus Prime Z490M Plus
Cooling Air
Memory G.Skill Ripjaws V 32GB (2 x 16GB) DDR4-3200 F4-3200C16D-32GVK
Video Card(s) EVGA GeForce RTX 2060 KO Ultra
Storage Inland Premium 256GB SSD 3D NAND M.2 2280 PCIe NVMe 3.0 x4 + WD Blue 1TB SATA SSD
Display(s) Acer K272HUL 1440p / 34" MSI MAG341CQ 3440x1440
Case Lian Li 205M
Power Supply PowerSpec 650W 80+ Bronze Semi-Modular PS 650BSM
Mouse Logitech MX Anywhere 25
Keyboard Logitech MX Keys
Software Lots
Joined
Oct 7, 2020
Messages
90 (0.63/day)
Location
N California
Processor 5930k @ 3.7 normally; can do 4.1 stable
Motherboard asus rog rampage 5
Cooling noctua server two fan air
Memory 16GB @2133
Video Card(s) 1670 ti
Storage several SSD adding nvme soon
Display(s) lg 77" cx 4k, lg 55 b8 4k
Case big tower lots of fans including side and top fans
Audio Device(s) on board
Power Supply seasonic 1200watt
Mouse adjustable vertical
Keyboard really cheap
Software windows 8.1, Linux -dual boot different drives
They should call them Turbo Rockets--they'd sell like hot cakes and maybe cook them too.
But seriously I'm starting to really feel, between this and AMD's new offerings, the upgrade itch. Just need a little patience for all these new goodies to become obtainable.
 
Joined
Dec 31, 2009
Messages
19,301 (4.74/day)
Benchmark Scores Faster than yours... I'd bet on it. :)
I started reading this thread and all I saw were people taking jabs at Intel... ridiculous. This forum man... I swear........ :(

Anyway, who knows how true this is... but it's a good sign so far. Wondering what the power draw will be (more than AMD I'd guess), but if IPC is back up there along with clocks and they keep the more reasonable pricing... it sounds like a solid option in the market to me...
 