In a joint briefing this week, KAIST's Memory Systems Laboratory and TERA's Interconnection and Packaging group presented a forward-looking roadmap for High Bandwidth Memory (HBM) standards and the accelerator platforms that will employ them. Shared via Wccftech and VideoCardz, the outline covers five successive generations, from HBM4 to HBM8, each promising substantial gains in capacity, bandwidth, and packaging sophistication. First up is HBM4, targeted for a 2026 rollout in AI GPUs and data center accelerators. It will deliver approximately 2 TB/s per stack at an 8 Gbps pin rate over a 2,048-bit interface. Die stacks will reach 12 to 16 layers, yielding 36-48 GB per package within a 75 W power envelope. NVIDIA's upcoming Rubin series and AMD's Instinct MI500 cards are slated to use HBM4, with Rubin Ultra doubling the number of memory stacks from eight to sixteen and AMD targeting up to 432 GB per device.
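Those per-stack throughput figures follow directly from pin rate times interface width. Here is a back-of-the-envelope sketch (the helper name is ours, not the roadmap's; the quoted numbers line up exactly if 1 TB is read as 1,024 GB):

```python
def stack_bandwidth_tbps(pin_gbps: float, bus_bits: int) -> float:
    """Per-stack bandwidth from pin rate (Gb/s) and bus width (bits)."""
    gbytes_per_s = pin_gbps * bus_bits / 8  # Gb/s across the bus -> GB/s
    return gbytes_per_s / 1024              # GB/s -> TB/s (1 TB = 1,024 GB)

print(stack_bandwidth_tbps(8, 2048))  # HBM4: 2.0 TB/s, matching the ~2 TB/s claim
```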
Looking to 2029, HBM5 keeps the 8 Gbps pin speed but doubles the I/O lanes to 4,096 bits, boosting throughput to 4 TB/s per stack. Power rises to 100 W, and capacity scales to 80 GB using 16-high stacks of 40 Gb dies. NVIDIA's tentatively named Feynman accelerator is expected to be the first HBM5 adopter, packing 400-500 GB of memory into a multi-die package and drawing more than 4,400 W of total power. By 2032, HBM6 will double pin speeds to 16 Gbps and lift bandwidth to 8 TB/s over the same 4,096 lanes. Stack heights can grow to 20 layers, supporting up to 120 GB per stack at 120 W, while immersion cooling and bumpless copper-copper bonding become the norm. The roadmap then pencils in HBM7 for 2035, with 24 Gbps pin speeds, 8,192-bit interfaces, 24 TB/s of throughput, and up to 192 GB per stack at 160 W. NVIDIA is reportedly preparing a 15,360 W accelerator to accommodate this monstrous memory.
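The same arithmetic holds for the later generations, and the capacity figures fall out of stack height times die density. A minimal check using the roadmap's own numbers:

```python
# Bandwidth: pin rate (Gb/s) x bus width (bits) -> TB/s (1 TB = 1,024 GB)
for name, pin_gbps, bus_bits in [("HBM5", 8, 4096),
                                 ("HBM6", 16, 4096),
                                 ("HBM7", 24, 8192)]:
    print(name, pin_gbps * bus_bits / 8 / 1024, "TB/s")  # 4.0, 8.0, 24.0

# Capacity: a 16-high stack of 40 Gb dies -> GB per stack
print(16 * 40 / 8, "GB")  # 80.0 GB, matching the HBM5 figure
```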
Last but not least, HBM8 arrives by 2038 with 32 Gbps pin speeds, 16,384 I/O lanes, 64 TB/s of bandwidth, 200-240 GB stacks, and embedded cooling at 180 W. Beyond pure DRAM, the briefing also teased hybrid HBM-HBF (High-Bandwidth Flash) architectures to meet the demands of large-scale AI inference, pairing NAND-based flash with DRAM stacks through ultra-fast through-silicon vias. With accelerators evolving toward petabyte-class bandwidth and multi-chiplet designs, these emerging HBM and HBF standards are the only way forward.
View at TechPowerUp Main Site | Source