Hardware

GPU Memory (VRAM)

Definition

GPU memory (VRAM, specifically HBM — High Bandwidth Memory) is the on-package memory that stores model parameters, activations, and gradients during computation. VRAM capacity determines the maximum model size that can fit on a single GPU, while HBM bandwidth determines how fast the GPU can read and write that data. The NVIDIA B200 provides 192 GB of HBM3e memory at 8 TB/s of bandwidth; the AMD MI300X offers 192 GB of HBM3 at 5.3 TB/s. HBM manufacturing capacity is currently a binding constraint on GPU production, with Samsung and SK Hynix projecting shortages extending into 2027.
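The capacity side of this can be made concrete with back-of-envelope arithmetic. The sketch below estimates whether a model fits in a single GPU's HBM; the byte-per-parameter figures are common rules of thumb (2 bytes for bf16 inference, roughly 16 bytes per parameter for mixed-precision Adam training before activations), and the model sizes are illustrative assumptions, not claims about any specific deployment.

```python
# Rough VRAM sizing sketch (illustrative rules of thumb, not vendor guidance).

def inference_vram_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Weights-only footprint for inference (bf16/fp16 = 2 bytes/param)."""
    return params_billions * bytes_per_param

def training_vram_gb(params_billions: float) -> float:
    """Mixed-precision Adam: ~2 (bf16 weights) + 2 (grads) + 12 (fp32 master
    weights plus two optimizer moments) = ~16 bytes/param, before activations."""
    return params_billions * 16

B200_HBM_GB = 192  # per-GPU capacity from the entry above

for n in (70, 405):  # hypothetical 70B- and 405B-parameter models
    print(f"{n}B: inference {inference_vram_gb(n):.0f} GB "
          f"(fits: {inference_vram_gb(n) <= B200_HBM_GB}), "
          f"training {training_vram_gb(n):.0f} GB "
          f"(fits: {training_vram_gb(n) <= B200_HBM_GB})")
```

Under these assumptions a 70B model fits on one B200 for inference (140 GB of weights) but not for training, which is why training footprints, not just parameter counts, drive multi-GPU procurement.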

Technical Context

HBM is manufactured by stacking multiple DRAM dies vertically and connecting them with through-silicon vias (TSVs). Only three companies produce HBM: SK Hynix, Samsung, and Micron, with SK Hynix holding approximately 50% market share. HBM3e, the current generation, provides up to 36 GB per 12-high stack at up to 1.2 TB/s; the B200's 192 GB comes from eight 24 GB (8-high) stacks. The transition to HBM4 in 2026-2027 will increase per-stack capacity and bandwidth but requires new packaging technology.
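Why aggregate bandwidth matters so much follows from a simple roofline argument: during autoregressive decoding, every model weight is read from HBM once per generated token, so bandwidth sets a ceiling on tokens per second. The sketch below computes that ceiling; the 140 GB weight footprint (a 70B model in bf16) and batch size of 1 are illustrative assumptions.

```python
# Bandwidth-roofline sketch: at batch size 1, decoding reads all weights
# once per token, so HBM bandwidth bounds the achievable tokens/sec.

def max_tokens_per_sec(weight_gb: float, hbm_tbps: float) -> float:
    """Upper bound on decode throughput: bandwidth / bytes read per token."""
    return hbm_tbps * 1000 / weight_gb  # (GB/s) / (GB per token)

# Assumed example: 70B params in bf16 = 140 GB of weights, at 8 TB/s.
print(f"{max_tokens_per_sec(140, 8):.0f} tokens/s upper bound")
```

The bound ignores KV-cache reads and kernel overheads, so real throughput is lower; the point is that it scales with bandwidth, not compute, which is why per-stack bandwidth gains in HBM4 matter for inference economics.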

Advisory Relevance

HBM supply constraints directly affect GPU procurement timelines and pricing. We track the HBM supply chain as part of our deployment advisory — understanding which GPU configurations are available and when is essential for realistic capacity planning.

This glossary is maintained by Disintermediate as a reference for GPU infrastructure professionals, investors, and operators. Each entry reflects terminology as used in active advisory engagements and market intelligence work.
