
Samsung Reveals GDDR7 Memory Uses PAM3 Signalling to Achieve 36 Gbps Data-Rate

btarunr

Editor & Senior Moderator
The next-generation GDDR7 memory standard is shaping up nicely to double bandwidth and density over the current GDDR6. In a company presentation detailing upcoming memory technologies, Samsung revealed that GDDR7 uses PAM3 signalling. While ones and zeroes are stored in DRAM memory cells, they are transmitted between devices (such as the DRAM chip and the GPU) as electrical waveforms known as "signals." Ones and zeroes are interpreted from patterns in the signal waveform.

Conventional GDDR6 memory uses NRZ (non-return-to-zero), or PAM2, signalling to achieve data-rates starting from 14 Gbps, with 24 Gbps expected to be the fastest production GDDR6 memory speed on offer. However, some of the faster GDDR6 speeds, such as 18 Gbps, 20 Gbps, and 22 Gbps, couldn't hit production soon enough for the development phase of the GeForce RTX 30-series "Ampere" GPUs, so NVIDIA and Micron Technology co-developed the GDDR6X standard leveraging PAM4 signalling, offering speeds ranging from 18 Gbps to 23 Gbps (or higher) several quarters ahead of the faster JEDEC-standard GDDR6.



Conventional NRZ signalling provides a 1 bit per cycle transmission rate, while PAM4 does 2 bits per cycle. PAM3 sits in between, transmitting 3 bits over every two cycles (roughly 1.5 bits per cycle) using a more advanced waveform with more "eyes" (gaps created by intersections of waves that are interpreted as bits). Samsung states that PAM3 is 25% more efficient than NRZ signalling, and that GDDR7 will be 25% more energy efficient. PAM3 signalling is also used by the upcoming 80 Gbps per-direction USB4 Version 2.0 standard.
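
To put those rates side by side, here is a minimal back-of-the-envelope sketch (Python, illustrative only and not from Samsung's presentation) of the bits-per-symbol arithmetic:

```python
# Back-of-the-envelope comparison of the signalling schemes mentioned above.
# The "as used" figure assumes 3 bits packed into every 2 PAM3 symbols.
import math

levels = {"NRZ/PAM2": 2, "PAM3": 3, "PAM4": 4}

for name, n in levels.items():
    print(f"{name}: log2({n}) = {math.log2(n):.3f} bits per symbol (theoretical)")

print(f"PAM3 as used (3 bits per 2 symbols): {3 / 2} bits per symbol")
```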

As for performance, the Samsung slide references a 36 Gbps data-rate, which confirms that GDDR7 will bring a generational doubling in data-rates over GDDR6, much like GDDR6 did over GDDR5. A typical GPU with a 256-bit memory bus, when using 36 Gbps-rated GDDR7 memory, will enjoy 1152 GB/s of memory bandwidth. High-end GPUs with 384-bit memory interfaces will do 1728 GB/s. Mainstream GPUs with 128-bit interfaces get 576 GB/s on tap.
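
Those figures follow directly from bus width x per-pin data-rate / 8 bits per byte; a quick sketch of the arithmetic (the bus widths are just the usual GPU configurations, not product announcements):

```python
# Peak memory bandwidth in GB/s = bus width (bits) x per-pin data rate (Gbps) / 8.
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    return bus_width_bits * data_rate_gbps / 8

for bus in (128, 256, 384):
    print(f"{bus}-bit @ 36 Gbps: {peak_bandwidth_gbs(bus, 36):.0f} GB/s")
# 128-bit -> 576 GB/s, 256-bit -> 1152 GB/s, 384-bit -> 1728 GB/s
```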

View at TechPowerUp Main Site | Source
 
Low quality post by HisDivineOrder
Can't wait till Nvidia uses this memory to help justify launching a x050 card as a $1k x080 card in name only.
 
So PAM3>PAM4
 
The forward march of technology soldiers on. Soon, even the mighty 4090 will be obsolete and nothing more than a once-expensive relic, a power-hungry dinosaur.
 
I am not sure which existing GDDR6 solution is being used to compare with the GDDR7 36Gbps data rate to derive that 25% efficiency, but I am expecting a big jump in power requirements, just not proportionate to the increase in data rate.
 
So PAM3>PAM4
Technically speaking, no, but it's what they went with.

I am not sure which existing GDDR6 solution is being used to compare with the GDDR7 36Gbps data rate to derive that 25% efficiency, but I am expecting a big jump in power requirements, just not proportionate to the increase in data rate.
That's the fate of all technology.
 
mmmmmmmmmmmmmmmmmmmmm.............................

Bandwidths

errrrr

Donuts......
 
The forward march of technology soldiers on. Soon, even the mighty 4090 will be obsolete and nothing more than a once-expensive relic, a power-hungry dinosaur.
I feel that flagship GPUs have always been very power hungry, given that they exist to push performance boundaries. In any case, I think we are already hitting a point where transistors are not shrinking fast enough to make more complex chips, so it will likely become increasingly common to see bigger chips with higher power draw. The RTX 4090 is an exception this time because the jump from Samsung's 8nm (basically a refined 10nm) to TSMC's 4nm (a refined 5nm) is a very significant improvement. Even if Nvidia moves to TSMC's 3nm for their next-gen GPUs, I don't think we will see such a big jump in performance. With VRAM getting faster, I suspect that other than the halo products, every other product range will start using narrower memory buses and let the memory pick up the slack.
 
Why didn't they go straight for GDDR7W, given the improvement they showed with the just-announced GDDR6W? This stacked method seems like it should be the default configuration now.
The answer, I believe, is that R&D costs money and they need to recover those costs.

It's not like any of these companies do these things just to bring the fastest technology available without anyone footing the bill. Profits need to be made.
 
Ready from your favorite high-end GPU brand at overpriced $$$ starting late 2024.
24-32GB of those will cost like a complete midrange GPU.
Progress is magic.
 
The answer, I believe, is that R&D costs money and they need to recover those costs.

It's not like any of these companies do these things just to bring the fastest technology available without anyone footing the bill. Profits need to be made.
Yes, but the hard work and R&D have been done; it shouldn't be as big of a jump for GDDR7, one would presume.
 
You guys are overthinking this. GDDR6W requires a wider memory bus for the performance uptick to apply. It's why you won't see its wide adoption.
 
Last edited:
Why didn't they go straight for GDDR7W, given the improvement they showed with the just-announced GDDR6W? This stacked method seems like it should be the default configuration now.
Stacking is routinely employed to make system RAM, so the technology has long been available. But power density is low there. Heat might be the reason stacking isn't more common in VRAM.
 
PAM3 is not 3-bit, but a 3-level signal, which can be used for encoding in a base-3 system, i.e. sending values of 0-15 as 3 signals instead of 4 in binary.
Look at: https://en.wikipedia.org/wiki/Ternary_numeral_system
Yes, that's what the third picture shows. PAM3 makes it possible to send 3 bits in 2 signals or 11 bits in 7 signals, so roughly a 1.5x or 1.57x speedup (theoretically just over 1.58x), matching the claim of 36 Gbps vs. a max of 24 Gbps in current GDDR6.

Edit: @btarunr could you please update/correct the second sentence of the third paragraph?
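
For anyone who wants to verify those groupings, a quick Python check (the helper function is made up purely for illustration):

```python
# n PAM3 symbols can carry k bits whenever 3**n >= 2**k.
def pam3_fits(symbols: int, bits: int) -> bool:
    return 3 ** symbols >= 2 ** bits

print(pam3_fits(2, 3))    # True: 9 >= 8, i.e. 1.5 bits per symbol
print(pam3_fits(7, 11))   # True: 2187 >= 2048, i.e. ~1.571 bits per symbol
print(36 / 24)            # 1.5 -- the quoted GDDR7 vs. GDDR6 ratio
```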
 
You guys are overthinking this. GDDR5W requires a wider memory bus for the performance uptick to apply. It's why you won't see its wide adoption.
You mean GDDR6W, right?
 
Aren't we getting to the point where it would be better to just move to HBM? Signal integrity and power will be an issue; at some point it must become simpler and more cost-effective to pay for the more advanced packaging solutions. AMD is already using chiplets, after all.
 
Aren't we getting to the point where it would be better to just move to HBM? Signal integrity and power will be an issue; at some point it must become simpler and more cost-effective to pay for the more advanced packaging solutions. AMD is already using chiplets, after all.
If it was that easy, don't you think it would already have happened? Maybe because... it's not that easy. Unless you think that the power and heat requirements of extremely high-bandwidth memory can just be handwaved away?
 
PAM3 is exactly log2(3) bits per cycle, which is about 1.585 bits. Definitely not 3 bits...

Interestingly, PAM3 misses by just a hair on the usual 128b/130b line code: 82 symbols fall just short of 130 bits, so it needs 83 cycles.
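
A quick check of that 128b/130b remark (illustrative only; raw symbol capacity, ignoring any framing overhead):

```python
# Raw capacity of n PAM3 symbols is n * log2(3) bits.
import math

for n in (82, 83):
    capacity = n * math.log2(3)
    verdict = "fits" if capacity >= 130 else "falls just short of"
    print(f"{n} PAM3 symbols: {capacity:.2f} bits ({verdict} 130)")
```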
 
If it was that easy, don't you think it would already have happened? Maybe because... it's not that easy. Unless you think that the power and heat requirements of extremely high-bandwidth memory can just be handwaved away?

It uses less power than GDDR, and thus produces less heat, though heat is still a problem because of the proximity to the main compute die. Cost is also a problem, but here we have a chicken-and-egg problem: the price won't decrease if no one uses it.

Packaging is also much more expensive, but power consumption with GDDR7 will certainly increase again to meet the increased difficulty of transmitting high-bandwidth signals to the compute die. At some point (if not already; more and more applications are starting to use it as well) we'll cross the line where it becomes the better option.
 
Packaging is also much more expensive, but power consumption with GDDR7 will certainly increase again to meet the increased difficulty of transmitting high-bandwidth signals to the compute die. At some point (if not already; more and more applications are starting to use it as well) we'll cross the line where it becomes the better option.
You're also ignoring the fact that die stacking/interconnection requires every single one of the dies involved to be without defects. If even one of them doesn't work, you throw all of them away. You don't have that problem with discrete memory chips, which makes production a lot less wasteful, which is a major concern at a time when every die from a leading-edge node is becoming more and more expensive. This is exactly why AMD discontinued the use of HBM in consumer GPUs after only one generation.
 
You're also ignoring the fact that die stacking/interconnection requires every single one of the dies involved to be without defects. If even one of them doesn't work, you throw all of them away. You don't have that problem with discrete memory chips, which makes production a lot less wasteful, which is a major concern at a time when every die from a leading-edge node is becoming more and more expensive. This is exactly why AMD discontinued the use of HBM in consumer GPUs after only one generation.

It wasn't just one generation, it was two (splitting hairs, I know). Memory doesn't generally use leading-edge nodes, but cost was, and probably still is, a problem. I'm not saying this is something that will happen tomorrow, and it's not like GDDR6X and GDDR7 are cheap either. All I'm saying is we're getting to that point.
 