Apple Introduces M1 Pro and M1 Max: the Most Powerful Chips Apple Has Ever Built

Joined
May 17, 2021
Messages
472 (2.29/day)
Processor Ryzen 5 3600
Motherboard B550 Elite
Cooling Hyperx 212
Memory Hyperx fury 16Gb DDR4 3333Mhz
Video Card(s) 3060 ti gaming oc pro
Storage Samsung 970 Evo 1Tb plus some HDDs
Case Lian Li Lancool II performance
Power Supply MSI 750w
Mouse G502
Good for them. It would be nicer if it didn't come from Apple, as I have absolutely no intention of buying anything from them.
 
Joined
Apr 24, 2020
Messages
1,396 (2.35/day)
The M1 Max, at least on paper, makes every other CPU seem like a decade out of date... How can this be?

The only crazy thing I'm seeing so far is the high LPDDR5 bandwidth of 400GBps.

I'm not really seeing anything else super-special about this actually. EDIT: 5nm is also cool, but that's largely TSMC + a function of Apple's money. TSMC is very advanced, and Apple can afford the best.
 
Joined
Nov 3, 2011
Messages
491 (0.13/day)
Location
Australia
System Name Eclipse P500A | Fractal Define R6
Processor AMD Ryzen 9 3900X | Intel Core i7-9900K @ 5 Ghz all cores
Motherboard ASUS ROG Strix X570 Gaming | MSI Z390 Gaming Pro Carbon AC
Cooling CORSAIR Hydro H115i, RGB | CORSAIR Hydro H150i RGB
Memory G.Skill Trident 32GB 3200 Mhz RGB| HyperX 32GB 3600 Mhz RGB
Video Card(s) MSI RTX 2080 Ti Gaming X TRIO 11 GB| MSI RTX 3080 Ti Gaming X TRIO 12 GB
Display(s) 2X LG 27UL600 27in 4K HDR FreeSync/G-Sync DP| LG 32UL950-W 32in 4K HDR FreeSync/G-Sync DP
Case Phanteks Eclipse P500A TG| Fractal R6 TG
Audio Device(s) Creative Sound Blaster Z | Creative Sound Blaster AE-7
Power Supply Seasonic 1000 watts| Seasonic 1000 watts
Mouse Bloody P95s
Keyboard Logitech G810s
Software MS Windows 11 Pro
That's pretty big. I'm curious how this memory system works.

It's big enough that I'm instinctively thinking that's a typo there. 400GB/s is huge for a CPU / iGPU. The only systems close to that are XBox / PS5 game consoles with GDDR graphics RAM.
The AMD 4700S has 256-bit GDDR6-14000, i.e. it is the PS5 APU recycled for the PC market.
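
For anyone who wants to sanity-check the 400GB/s figure, here is a rough back-of-the-envelope sketch. It assumes the M1 Pro and M1 Max use 256-bit and 512-bit interfaces with LPDDR5-6400 (the memory speed is an assumption, not something stated in the post above), and it only computes theoretical peak numbers, not what software will actually see. The function name is made up for illustration.

```python
def peak_bandwidth_gbps(bus_width_bits: int, transfer_rate_mtps: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits: total width of the memory interface in bits.
    transfer_rate_mtps: data rate per pin in mega-transfers per second.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_mtps * 1e6 / 1e9


# Bus widths and speeds below are assumptions for illustration,
# except the 4700S figures, which come from the post above.
configs = {
    "M1 Pro (256-bit, assumed LPDDR5-6400)": (256, 6400),
    "M1 Max (512-bit, assumed LPDDR5-6400)": (512, 6400),
    "AMD 4700S (256-bit GDDR6-14000)": (256, 14000),
}

for name, (width, rate) in configs.items():
    print(f"{name}: {peak_bandwidth_gbps(width, rate):.1f} GB/s")
```

Under those assumptions the M1 Max lands at roughly 410 GB/s, so the 400GB/s claim is plausible rather than a typo, and it sits in the same range as the console-derived 4700S.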
 

r9

Joined
Jul 28, 2008
Messages
2,918 (0.60/day)
System Name Primary | Secondary | Poweredge r410|Dell XPS
Processor i7 9700k| Ryzen 1600| 2 x E5620 |i5 5500U
Memory 16GB DDR4 |16GB DDR4 | 32GB ECC DDR3|8GB DDR4
Video Card(s) GTX 1070|2 x RX570 |On-Board|On-Board|
Storage 512GB SSD+1TB SSD|512GB SSD+1TB|2x256GBSSD 2x2TBGB
Display(s) 50" 4k TV | 27" + 2 x 24" LCD Setup
VR HMD Samsung Odyssey+
Software Windows 10 |Windows 10| Server 2012 r2
Oh shiiiiiiiiiiit ....
The M1 was running Witcher 3 x86 at 30fps, I really want to see what this monster can do with 4x gpu power on games that run natively.
 
Joined
Jun 5, 2021
Messages
180 (0.96/day)
We know how the M1 performs - in terms of IPC it trounces both Intel and AMD, matching or beating their peak single core performance at 2/3-3/5 the clock speed (3.1GHz vs 4.8/5.3-ish). These more than double the core counts, and double/quadruple the memory bandwidth to feed the cores. Also, Apple has absolutely insane amounts of cache (at equally insane latencies) with their recent chips. This will be a beast, it just needs software to make use of the power. Which it likely will have (Adobe CS etc. are already native).

The interfaces are 256-bit (M1 Pro) and 512-bit (M1 Max). Probably a bit power hungry, sure, but they are mounted extremely close to the SoC, on the same package, so they've likely optimized for that. Plus, these are 40-60W SoCs. The memory power isn't going to be an issue.
Raphael-H and Alder Lake-S will destroy the overrated M1.
 
Joined
Feb 8, 2021
Messages
76 (0.25/day)
Oh shiiiiiiiiiiit ....
The M1 was running Witcher 3 x86 at 30fps, I really want to see what this monster can do with 4x gpu power on games that run natively.
Easy: look at the Xbox Series S or PS5. The slide says 10.4 TFLOPS.
 
Joined
May 24, 2007
Messages
5,228 (0.98/day)
Location
Tennessee
System Name R9 / R7
Processor AMD Ryzen R9 5950X / AMD Ryzen R9 3950X
Motherboard MSI MEG X570 ACE AM4 / Gigabyte B550 Aorus Elite AM4
Cooling Noctua NH-D15 chromax.black
Memory Crucial Ballistix 64GB 3200 MHz DDR4 / Crucial Ballistix 32GB 3200 MHz DDR4
Video Card(s) AsRock RX 6900 XT(XH) OC Formula / PowerColor Red Devil RX 6900 XT(XH) Ultimate
Case Fractal Define R7 / BeQuiet 900
Power Supply Seasonic PRIME PX-1200, 1200W 80+ Platinum, Full Modular
Oh shiiiiiiiiiiit ....
The M1 was running Witcher 3 x86 at 30fps, I really want to see what this monster can do with 4x gpu power on games that run natively.

30 fps, 1080p, and lowest settings. Maybe you will get 30 fps, 1080p, and max settings.
 
Joined
Nov 3, 2011
Messages
491 (0.13/day)
Location
Australia
CPUs have to transfer data to the GPUs all the time (and sometimes rarely, maybe a GPU->CPU transfer). One of the key advantages of a SOC is that this "data transfer" takes place in L3 cache instead of over system memory.

I find it hard to believe that Microsoft would design a SOC like the XBox Series X and ignore this simple and useful optimization. I see that Microsoft is playing cute games with its 10+6 GB layout, but I'm pretty sure they're just saying that CPUs use less memory bandwidth, so 10GB of fast-RAM + 6GB of slow-RAM is intended for the CPU to use slow-RAM and GPU to use fast-RAM. But both CPU+GPU should have access to both halves.

If for no other reason than to optimize the "no copy" methodology between CPU -> GPU data transfers. (Why ever copy data when GPUs can simply just read the RAM themselves?). In dGPU world, you need to transfer the data over PCIe because the VRAM is physically a different chip. But in XBox Series X land, VRAM and RAM are literally the same chips, no copying needed.
1. For games, the shared memory usage is relatively minor. PCs have Resizable BAR (ReBAR), which enables the CPU to directly access the GPU's entire VRAM. The CPU wouldn't be able to keep up with a dGPU's large-scale scatter-gather capability.

2. Shared memory has its downsides, such as context switch overheads. CPU IO access can gimp the GPU's burst-mode IO access, e.g. frame buffer burst IO access shouldn't be disturbed.

Late 1980s Amiga's Chip Ram is shared memory between the CPU and iGPU (custom chips).
 

Attachments

  • Shared Memory allocaton.png (510.9 KB)
  • PS4-GPU-Bandwidth-140-not-176.png (70.3 KB)

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
12,440 (3.45/day)
Location
Concord, NH
System Name Apollo
Processor Intel Core i9 9880H
Memory 64GB DDR4-2667
Video Card(s) AMD Radeon Pro 5600M, 8GB HBM2
Storage 1TB Apple NVMe, 4TB External
Display(s) Laptop @ 3072x1920 + 2x LG 5k Ultrafine TB3 displays
Case MacBook Pro (16", 2019)
Audio Device(s) AirPods Pro, Sennheiser HD 380s w/ FIIO Alpen 2, or Logitech 2.1 Speakers
Power Supply 96w Power Adapter
Mouse Logitech MX Master 3
Keyboard Full Size Wireless Apple Magic Keyboard
Software MacOS 10.15.7
6K, with what sounds like a 3.5k minimum spend for 32 GB RAM, and in some cases it's said (look around, I'm not posting links to other tech sites) to be beaten by the outgoing Intel 9th gen chips, sooo, there's that.
Go on their site, I just priced it out. $4,200 for the Max, 64GB of memory, and a 2TB drive, which is about what I paid (sans discounts I can get) for mine in my specs. That's really not bad considering what you're getting if you're comparing it to the previous 16". In that respect, Apple has kept pricing consistent, but has theoretically given it an absolutely massive performance uplift within the same power constraints.

Edit: Mind you that these are US prices in USD.
 
Joined
Nov 3, 2011
Messages
491 (0.13/day)
Location
Australia
Unified is exactly like the PS5 and Xbox.
One pool of memory for any use.
So Apple clearly was not first and is doing something similar.
The GPU or CPU can make memory calls in those.
Though inevitably the MMU is going to be on the edge of the SoC on a bus.
The 1985-era Amiga 1000 has a shared memory design.
 
Joined
May 17, 2021
Messages
472 (2.29/day)
Go on their site, I just priced it out. $4,200 for the Max, 64GB of memory, and a 2TB drive, which is about what I paid (sans discounts I can get) for mine in my specs. That's really not bad considering what you're getting if you're comparing it to the previous 16". In that respect, Apple has kept pricing consistent, but has theoretically given it an absolutely massive performance uplift within the same power constraints.

Edit: Mind you that these are US prices in USD.

I wonder how people can seriously consider buying that crap when they know, or at least should know, what they do with anti-consumer BS?
 

Aquinus

Resident Wat-man
Joined
Jan 28, 2012
Messages
12,440 (3.45/day)
Location
Concord, NH
I wonder how people can seriously consider buying that crap when they know, or at least should know, what they do with anti-consumer BS?
For being so anti-consumer, they sure do make a good machine for work and play if you can afford it.
 
Joined
Mar 18, 2008
Messages
5,710 (1.14/day)
System Name Virtual Reality / Bioinformatics
Processor Undead CPU
Motherboard Undead TUF X99
Cooling Noctua NH-D15
Memory GSkill 128GB DDR4-3000
Video Card(s) EVGA RTX 3090 FTW3 Ultra
Storage Samsung 960 Pro 1TB + 860 EVO 2TB + WD Black 5TB
Display(s) 32'' 4K Dell
Case Fractal Design R5
Audio Device(s) BOSE 2.0
Power Supply Seasonic 850watt
Mouse Logitech Master MX
Keyboard Corsair K70 Cherry MX Blue
VR HMD HTC Vive + Oculus Quest 2
Software Windows 10 P
That M1 Max GPU might be amazing for mining cryptos
 

MxPhenom 216

ASIC Engineer
Joined
Aug 31, 2010
Messages
12,779 (3.10/day)
Location
Longmont, CO
System Name Ryzen Reflection
Processor AMD Ryzen 9 5900x
Motherboard Gigabyte X570S Aorus Master
Cooling 2x EK PE360 | TechN AM4 AMD Block Black | EK Quantum Vector Trinity GPU Nickel + Plexi
Memory Teamgroup T-Force Xtreem 2x16GB B-Die 3600 @ 14-14-14-28-42-288-2T 1.45v
Video Card(s) Zotac AMP HoloBlack RTX 3080Ti 12G | 950mV 1950Mhz
Storage WD SN850 500GB (OS) | Samsung 980 Pro 1TB (Games1) | Samsung 970 Evo 1TB ( Games2)
Display(s) Dell 32" 1440p 165Hz Curved | Asus VP249QGR 144Hz IPS
Case Lian Li PC-011D XL | Custom cables by Cablemodz
Audio Device(s) Beyerdynamic TYGR 300R + Beyerdynamic FOX Mic
Power Supply Seasonic Prime Ultra Platinum 850
Mouse Razer Viper
Keyboard Razer Huntsman Tournament Edition
Software Windows 11 Pro 64-Bit
Joined
Apr 16, 2013
Messages
360 (0.11/day)
Location
Bulgaria
System Name Black Knight | White Queen
Processor Intel Core i9-10940X | Intel Core i7-5775C
Motherboard ASUS ROG Rampage VI Extreme Encore X299G | ASUS Sabertooth Z97 Mark S (White)
Cooling Noctua NH-D15 chromax.black | Xigmatek Dark Knight SD-1283 Night Hawk (White)
Memory G.SKILL Trident Z RGB 4x8GB DDR4 3600MHz CL16 | Corsair Vengeance LP 4x4GB DDR3L 1600MHz CL9 (White)
Video Card(s) KFA2/Galax GeForce GTX 1080 Ti Hall of Fame Edition | Intel Iris Pro 6200
Storage Samsung 980 Pro 1TB, 850 Pro 256GB, 840 Pro 256GB, WD Red 2/3/6TB, WD VelociRaptor 300GB/600GB/1TB
Display(s) ASUS ROG Strix XG279Q 27'', PA246Q 24'' (10 bit, CCFL, 12 bit LUT, P-IPS) | Samsung JU7500 48'' TV
Audio Device(s) ASUS Xonar Essence STX | Realtek ALC1150
Power Supply Enermax Revolution 1250W 85+ | Super Flower Leadex Gold 650W (White)
Mouse Razer Basilisk Ultimate, Razer Naga Trinity | Razer Mamba 16000
Keyboard Razer Blackwidow Chroma V2 (Orange switch) | Razer Ornata Chroma
Software Windows 10 Pro 64bit
Interesting they are still based on the A14 platform and not A15.
 
Joined
Dec 16, 2017
Messages
2,200 (1.51/day)
Location
Buenos Aires, Argentina
System Name System V
Processor AMD Ryzen 5 3600
Motherboard Asus Prime X570-P
Cooling AMD Wraith Stealth // a bunch of 120 mm Xigmatek 1500 RPM fans (2 ins, 3 outs)
Memory 2x8GB Ballistix Sport LT 3200 MHz (BLS8G4D32AESCK.M8FE) (CL16-18-18-36)
Video Card(s) Gigabyte AORUS Radeon RX 580 8 GB
Storage SHFS37A240G / DT01ACA200 / WD20EZRX / MKNSSDTR256GB-3DL / LG BH16NS40 / ST10000VN0008
Display(s) LG 22MP55 IPS Display
Case NZXT Source 210
Audio Device(s) Logitech G430 Headset
Power Supply Corsair CX650M
Mouse Microsoft Trackball Optical 1.0
Keyboard HP Vectra VE keyboard (Part # D4950-63004)
Software Whatever build of Windows 11 is being served in Dev channel at the time.
Benchmark Scores Corona 1.3: 3120620 r/s Cinebench R20: 3355 FireStrike: 12490 TimeSpy: 4624
Man, does Apple make me laugh with their closed, limited OSes and their potato mobile processors.

Oh yeah, it's just a mobile SoC built using the most advanced node available in the world that can probably beat every other mobile SoC around singlehandedly. Nothing worthy a second of actual interest. /s

:rolleyes:
 
Joined
Apr 8, 2010
Messages
877 (0.21/day)
Processor Intel Core i5 8400
Motherboard Gigabyte Z370N-Wifi
Cooling Silverstone AR05
Memory Micron Crucial 16GB DDR4-2400
Video Card(s) Gigabyte GTX1080 G1 Gaming 8G
Storage Micron Crucial MX300 275GB
Display(s) Dell U2415
Case Silverstone RVZ02B
Power Supply Silverstone SSR-SX550
Keyboard Ducky One Red Switch
Software Windows 10 Pro 1909
Interesting they are still based on the A14 platform and not A15.
An A15-based chip will probably be called M2 or something. I think Apple is still trying to find out what happens when these chips are scaled up.
 
Joined
Dec 28, 2013
Messages
61 (0.02/day)
Oh yeah, it's just a mobile SoC built using the most advanced node available in the world that can probably beat every other mobile SoC around singlehandedly. Nothing worthy a second of actual interest. /s

:rolleyes:
Proof or it didn't happen.
 

Fourstaff

Moderator
Staff member
Joined
Nov 29, 2009
Messages
9,830 (2.24/day)
Location
Home
System Name Orange! // ItchyHands
Processor 3570K // 10400F
Motherboard ASRock z77 Extreme4 // TUF Gaming B460M-Plus
Cooling Stock // Stock
Memory 2x4Gb 1600Mhz CL9 Corsair XMS3 // 2x8Gb 3200 Mhz XPG D41
Video Card(s) Sapphire Nitro+ RX 570 // Asus TUF RTX 2070
Storage Samsung 840 250Gb // SX8200 480GB
Display(s) LG 22EA53VQ // Philips 275M QHD
Case NZXT Phantom 410 Black/Orange // Tecware Forge M
Power Supply Corsair CXM500w // CM MWE 600w
Interesting they are still based on the A14 platform and not A15.
They probably designed this based on the A14 while another team was working on the A15.
 
Joined
Apr 12, 2013
Messages
4,532 (1.43/day)
It's a bit strange for you to bring up the Epyc/TR comparison just to then say it's not a valid comparison once people get into why this is likely to be more efficient.
That's because off hand I can't think of any other chip(s) that move such vast sums of data between massive cores, in case of Apple it's also the GPU cores now, & pay a heavy (energy) price for that. Moving (lots of) data quickly is the next big hurdle in computing & the SoC approach for now seems to be more efficient ~ the reason why it isn't directly comparable because even now the top end server chips should beat Apple in most tasks they're actually designed for but they're also generally less efficient. The SoC approach isn't really scalable beyond low double digit CPU cores especially if you're putting such a massive GPU in there!
 
Joined
Dec 28, 2013
Messages
61 (0.02/day)
Seems everybody should drop EPYC processors for their servers. They are obsolete :roll:
Long time since I had such a good laugh.
No wonder Apple makes millions. You guys would believe about anything they say.
 
Joined
Apr 12, 2013
Messages
4,532 (1.43/day)
Right, no one's saying that unless you meant some other poster?

The EPYC/TR way is meant for massive amounts of CPU cores, which Apple doesn't seem to need right now. That's also in part due to the dedicated accelerators they're using for a lot of tasks. IIRC Zen 4 (5?) will introduce similar accelerators on die, probably courtesy of their Xilinx acquisition. My biggest curiosity then would be how efficient their monolithic (APU) dies would be wrt the M1 & now M1 Pro & Max.
 
Joined
Jun 5, 2021
Messages
180 (0.96/day)
Oh yeah, it's just a mobile SoC built using the most advanced node available in the world that can probably beat every other mobile SoC around singlehandedly. Nothing worthy a second of actual interest. /s

:rolleyes:
They wouldn't beat AMD on the same node though. Zen 4 on 5nm will crush this expensive chip.
 
Joined
May 2, 2017
Messages
5,497 (3.27/day)
Location
Norway, currently in Lund, Sweden
System Name Hotbox
Processor AMD Ryzen 7 5800X
Motherboard ASRock Phantom Gaming B550 ITX/ax
Cooling Aquanaut + Laing DDC 1T Plus PWM + Corsair XR5 280mm + 2x Arctic P14
Memory 32GB G.Skill FlareX 3200c14
Video Card(s) PowerColor Radeon 6900XT Liquid Devil Ultimate, UV@950mV/2050MHz/180W
Storage 2TB Adata SX8200 Pro
Display(s) Dell U2711 main, AOC 24P2C secondary
Case SSUPD Meshlicious
Audio Device(s) Optoma Nuforce μDAC 3
Power Supply Corsair SF750 Platinum
Mouse Logitech G602
Keyboard Cooler Master MasterKeys Pro M w/DSA profile caps
Software Windows 10 Pro
Or AMD is going to be doing that with Zen 4/RDNA3. The console APUs are custom designs, not a straight up Zen 2 design. They have features not in the desktop APUs.
And? Unified memory isn't just a hardware feature, it's a hardware+OS feature. And there's no indication that either XSX or PS5 have truly unified memory.
What's the difference? Is the memory "truly unified" only if memory access is governed by a single MMU for both CPU and GPU?
No, it must also be accessible to the entire system without the need for copying.
I mean... it's called the PS5 / XBox Series X.

I'm pretty sure they have unified memory. Hell, CUDA + CPU / OpenCL + CPU has unified memory. It's just emulated over PCIe. PS5 / XBox Series X actually have the same, literal RAM work for the iGPU side and CPU side.
It's still walled off, and needs copying, thus it isn't actually unified.
Unified is exactly like the PS5 and Xbox.
One pool of memory for any use.
So Apple clearly was not first and is doing something similar.
The GPU or CPU can make memory calls in those.
Though inevitably the MMU is going to be on the edge of the SoC on a bus.
See above. It is only truly unified if every component has full access to RAM, which is what Apple is claiming here. No PC or current x86-based platform has that.
CPUs have to transfer data to the GPUs all the time (and sometimes rarely, maybe a GPU->CPU transfer). One of the key advantages of a SOC is that this "data transfer" takes place in L3 cache instead of over system memory.

I find it hard to believe that Microsoft would design a SOC like the XBox Series X and ignore this simple and useful optimization. I see that Microsoft is playing cute games with its 10+6 GB layout, but I'm pretty sure they're just saying that CPUs use less memory bandwidth, so 10GB of fast-RAM + 6GB of slow-RAM is intended for the CPU to use slow-RAM and GPU to use fast-RAM. But both CPU+GPU should have access to both halves.

If for no other reason than to optimize the "no copy" methodology between CPU -> GPU data transfers. (Why ever copy data when GPUs can simply just read the RAM themselves?). In dGPU world, you need to transfer the data over PCIe because the VRAM is physically a different chip. But in XBox Series X land, VRAM and RAM are literally the same chips, no copying needed.
But copying is needed for those, as the CPU and GPU have discrete areas of memory set aside for them.
Isn't that the case with every Intel and AMD processor with integrated graphics? At least since Haswell for Intel (AnandTech) and since Kaveri for AMD (Wikipedia).
No, iGPUs have system memory set aside for them - some static, some dynamic. This memory is not accessible to the CPU, and regular system memory is not accessible to the iGPU, necessitating copying data between the two.
Anandtech is speculating it’s probably 64MB on the Max, 32MB on the Pro. They are looking at the actual die shots (provided in the presentation, interestingly), not the illustrative diagram Apple used in the presentation.
That's lower than I would have expected, but then diagrams are always misleading. I wonder if that judgement is correct though, as the new SLC blocks look much bigger than on the M1, which had 16MB. On the M1 the SLC block is slightly larger than two GPU "cores", on the M1P/M it's larger than four. Of course, not all of this is actually cache, and a lot of it is likely interconnects and other stuff, but 2x16MB still seems low to me.
Yeah, it's not a new feature at all.

But as Wirko has pointed out: this isn't new at all. Intel / AMD chips have been doing zero-copy transfers on Windows for nearly a decade now on their iGPUs.

Yes, that is even on Windows 10, which is HyperV virtualized for security purposes. (The most secure parts of Windows start up in a separate VM these days, so that not even a kernel-level hack can reach those secrets... unless it also includes a VM-break of some kind)

Now don't get me wrong: XBox Series X has a weird / complicated memory scheme going on. But I'd still expect that this extremely strange memory scheme was unified, much akin to the AMD Kaveri or Intel iGPU stuff that you'd find on any typical iGPU for the past decade.
It clearly isn't, when they wall off sections of RAM for the OS, CPU software and GPU software. Discrete memory regions implies that copying is needed between them, which means it isn't unified.
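
To make the "copy vs. no copy" distinction concrete, here is a loose analogy in plain Python. It is only an illustration inside a single process, not how console or GPU drivers actually manage memory: a `memoryview` stands in for a second consumer that can read the same bytes directly, while `bytes(...)` stands in for a consumer that has to be handed its own copy of a walled-off region.

```python
# A region of RAM filled by the "producer" (think: the CPU preparing data).
buffer = bytearray(b"vertex data")

# "Unified" style: the consumer gets a view of the very same bytes.
# Nothing is copied, so later updates are visible immediately.
shared_view = memoryview(buffer)

# "Walled-off" style: the consumer gets a private copy of the data.
# It must be copied again whenever the producer changes something.
private_copy = bytes(buffer)

# The producer updates the data in place (same length, so no resize).
buffer[0:6] = b"VERTEX"

print(bytes(shared_view))  # reflects the update without any copy
print(private_copy)        # still holds the old contents
```

The point of the sketch is just that with discrete regions every update costs a copy (time and bandwidth), while with a shared region the consumer only needs to know where to look.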
The M1 Max, at least on paper, makes every other CPU seem like a decade out of date... How can this be?
Money, mainly. Apple can afford to outspend everyone on R&D, by a huge margin.
1. For games, the shared memory usage is relatively minor. PCs have Resizable BAR (ReBAR), which enables the CPU to directly access the GPU's entire VRAM. The CPU wouldn't be able to keep up with a dGPU's large-scale scatter-gather capability.

2. Shared memory has its downsides, such as context switch overheads. CPU IO access can gimp the GPU's burst-mode IO access, e.g. frame buffer burst IO access shouldn't be disturbed.

Late 1980s Amiga's Chip Ram is shared memory between the CPU and iGPU (custom chips).
ReBAR doesn't have anything to do with this - it allows the CPU to write to the entire VRAM rather than smaller chunks, but the CPU still can't work off of VRAM - it needs copying to system RAM for the CPU to work on it. You're right that shared memory has its downsides, but with many times the bandwidth of any x86 CPU (and equal to many dGPUs) I doubt that will be a problem, especially considering Apple's penchant for massive caches.
That's because off hand I can't think of any other chip(s) that move such vast sums of data between massive cores, in case of Apple it's also the GPU cores now, & pay a heavy (energy) price for that. Moving (lots of) data quickly is the next big hurdle in computing & the SoC approach for now seems to be more efficient ~ the reason why it isn't directly comparable because even now the top end server chips should beat Apple in most tasks they're actually designed for but they're also generally less efficient. The SoC approach isn't really scalable beyond low double digit CPU cores especially if you're putting such a massive GPU in there!
Yes, but that's precisely the point: the M1P/M being monolithic allows for huge power savings, as they don't need off-die interfaces for most of this. Keeping data on silicon is a massive power saving. Of course they're working with 10 (8+2) CPU cores and a GPU of up to 32 "cores", not a 32-64-core CPU, so the interfaces can also be much, much simpler.
They wouldn't beat AMD on the same node though. Zen 4 on 5nm will crush this expensive chip.
That's debatable. Apple's architecture team has been doing some incredible work these past years. Their cache architecture (which is something that doesn't gain that much from node changes) is far superior to anything else (look at the cache access benchmarks in the AnandTech article I linked above), and their huge CPU cores have a >50% IPC lead over both Intel and AMD, matching their performance at much lower clocks (in part thanks to those huge, low-latency caches, but not only that). A higher core count chip from AMD will still likely win in a 100% MT workload, but the power difference is likely to be significant.
Proof or it didn't happen.
Here's Anandtech's SPEC2006 and SPEC2017 testing of the M1. Those are industry standard benchmarks for ST performance, and the M1 rivals the 5950X at a fraction of the power, and much lower clocks. These chips use the same architecture but with more cache, more RAM, and a much higher power budget.
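
As a quick sanity check on the ">50% IPC lead" figure: if two cores deliver the same single-thread performance, the ratio of their IPC is simply the inverse of the ratio of their clocks. The sketch below uses the rough clock speeds quoted earlier in the thread (3.1GHz vs 4.8/5.3-ish) and assumes exactly equal performance, which is a simplification. The function name is made up for illustration.

```python
def implied_ipc_ratio(clock_a_ghz: float, clock_b_ghz: float) -> float:
    """IPC of chip A relative to chip B, assuming both reach the same
    single-thread performance (performance = IPC * clock)."""
    return clock_b_ghz / clock_a_ghz


m1_clock = 3.1                    # GHz, figure quoted in the thread
for x86_clock in (4.8, 5.3):      # GHz, rough x86 boost clocks from the thread
    ratio = implied_ipc_ratio(m1_clock, x86_clock)
    print(f"vs {x86_clock} GHz: {ratio:.2f}x IPC ({(ratio - 1) * 100:.0f}% lead)")
```

With those inputs the implied lead works out to somewhere between roughly 55% and 71%, which is consistent with the ">50%" claim as long as the "matching performance" premise holds for the workload in question.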
 