Where It Starts: TSMC
Every NVIDIA GPU, from the Volta generation through Blackwell, is manufactured by Taiwan Semiconductor Manufacturing Company (TSMC). NVIDIA designs the chip but does not manufacture it. This fabless model separates design from production, allowing NVIDIA to focus on architecture while TSMC operates the most advanced semiconductor fabrication facilities in the world.
The H100 uses TSMC's 4N process (a customised variant of 4nm). The B200 uses TSMC's 4NP process. Manufacturing at these nodes requires equipment that only three companies produce: ASML (extreme ultraviolet lithography machines), Tokyo Electron (deposition and etch equipment), and Applied Materials (thin-film deposition).
TSMC's dependence on ASML's EUV machines (each costing £150M-£200M and requiring 40 shipping containers to deliver) means the advanced GPU supply chain traces back to a single Dutch company's production capacity. TSMC manufactures GPUs at its facilities in Hsinchu and Tainan, Taiwan. This geographic concentration (Taiwan is approximately 160km from mainland China) is the single largest geopolitical risk in the AI supply chain.
Memory: SK Hynix, Samsung, and Micron
A GPU without memory is useless. High Bandwidth Memory (HBM), the stacked DRAM that sits on current-generation GPU packages, is manufactured by three companies: SK Hynix, Samsung, and Micron. SK Hynix supplied the HBM3 for the H100 and supplies HBM3e for the B200.
This near-monopoly on advanced memory supply has created its own bottleneck. During peak H100 demand in 2023, HBM supply constrained NVIDIA's shipments more than wafer capacity. NVIDIA could produce GPU dies faster than SK Hynix could supply HBM stacks.
HBM production requires extreme precision: stacking multiple DRAM dies vertically using through-silicon vias, then connecting them to the GPU die via a silicon interposer. Capacity expansion requires 18-24 months of lead time. The B200 uses 192GB of HBM3e across eight stacks. Memory availability, not compute die availability, has historically been the binding constraint in GPU supply.
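The per-stack arithmetic implied by those figures can be sketched briefly. The 8-high stack height and the 3GB (24Gb) DRAM die capacity below are common HBM3e configurations, stated here as assumptions rather than confirmed B200 specifications:

```python
# Illustrative arithmetic for the B200 HBM figures quoted above.
TOTAL_HBM_GB = 192   # total HBM3e capacity per B200 (from the text)
NUM_STACKS = 8       # HBM stacks per package (from the text)
DIES_PER_STACK = 8   # assumed 8-high stack (common HBM3e configuration)

per_stack_gb = TOTAL_HBM_GB / NUM_STACKS
per_die_gb = per_stack_gb / DIES_PER_STACK
print(f"{per_stack_gb:.0f} GB per stack, {per_die_gb:.0f} GB per DRAM die")
# -> 24 GB per stack, 3 GB per DRAM die
```

Each package therefore depends on 64 individual DRAM dies being stacked, bonded, and tested without defect, which is why HBM yield and capacity expand so slowly.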
Advanced Packaging: CoWoS and the Bottleneck
NVIDIA GPUs are not a single die. They are a package containing the GPU die, HBM memory stacks, and an interposer, a layer of silicon that connects them.
TSMC's CoWoS (Chip on Wafer on Substrate) packaging process is the dominant approach for this integration. CoWoS capacity is distinct from wafer fabrication capacity.
Even if TSMC has ample 4nm wafer capacity, CoWoS packaging can independently constrain output. During 2023, CoWoS capacity was the primary bottleneck limiting H100 shipments: TSMC had dies and HBM was available, but could not package them fast enough. This illustrates a structural feature of the GPU supply chain: there are multiple independent chokepoints, each of which can independently constrain production. As of Q1 2026, CoWoS capacity is more balanced with demand, but any demand acceleration could recreate the bottleneck quickly.
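The chokepoint structure can be modelled as a simple min-of-capacities calculation. All capacity figures below are invented for illustration, not real TSMC or SK Hynix numbers:

```python
# Sketch: monthly GPU output is bounded by the tightest of several
# independent chokepoints. Every number here is hypothetical.
capacities_per_month = {
    "wafer_fab_dies": 120_000,      # GPU dies from 4nm wafers (assumed)
    "hbm_sets": 90_000,             # complete 8-stack HBM sets (assumed)
    "cowos_packages": 75_000,       # CoWoS packaging slots (assumed)
}

bottleneck = min(capacities_per_month, key=capacities_per_month.get)
output = capacities_per_month[bottleneck]
print(f"Binding constraint: {bottleneck} -> {output:,} GPUs/month")
# -> Binding constraint: cowos_packages -> 75,000 GPUs/month
```

The point of the model is that expanding any single stage does nothing unless it is the binding one, and relieving one bottleneck simply exposes the next tightest.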
System Integrators: Dell, Supermicro, HPE
Once NVIDIA ships GPUs to system integrators, the next supply chain layer produces complete servers. Supermicro, Dell, and Hewlett Packard Enterprise build HGX-based servers by combining NVIDIA's GPU baseboards with their own server chassis, cooling systems, power supplies, networking, and storage.
Supermicro, a San Jose-based company with manufacturing in the Netherlands and Taiwan, is the largest integrator by volume. Lead times for complete servers, from GPU availability through system integration and testing, typically run 8-16 weeks. During supply-constrained periods, lead times for complete HGX systems have extended to 6-9 months.
The Last Mile: Logistics and Installation
A complete HGX B200 server weighs approximately 120kg and requires specialised handling: vibration-sensitive components, electrostatic discharge precautions, and careful thermal management during transport. A 64-server cluster installation takes an experienced team 2-3 weeks. Any component failure during installation requires the failed component to be returned to the vendor, adding days or weeks to the schedule.
End-to-end, from procurement decision to running workloads, a production GPU cluster deployment typically requires 3-6 months. Planning this timeline is fundamental to capital deployment decisions. For procurement support and vendor introductions across the GPU supply chain, contact Disintermediate at disintermediate.global/contact.
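A rough roll-up of the stage durations quoted above shows how the 3-6 month end-to-end figure accumulates. The allocation-wait and shipping figures are assumptions added for illustration; the integration and installation ranges come from the text:

```python
# Back-of-envelope deployment timeline in weeks (low, high).
stages_weeks = {
    "gpu_allocation_wait": (2, 6),       # hypothetical queue time
    "system_integration": (8, 16),       # per the text: 8-16 weeks
    "shipping_and_logistics": (1, 2),    # assumed
    "installation_and_burn_in": (2, 3),  # per the text: 2-3 weeks
}

low = sum(lo for lo, _ in stages_weeks.values())
high = sum(hi for _, hi in stages_weeks.values())
print(f"End-to-end: {low}-{high} weeks "
      f"(~{low / 4.33:.0f}-{high / 4.33:.0f} months)")
# -> End-to-end: 13-27 weeks (~3-6 months)
```

Because the stages are sequential, any single delayed stage pushes the whole deployment date, which is why the text treats timeline planning as a capital-deployment question rather than a logistics detail.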
- TSMC manufactures all advanced NVIDIA GPUs; its geographic concentration in Taiwan is the dominant geopolitical risk in the AI supply chain
- HBM memory (SK Hynix dominant) has historically been the binding supply constraint, not GPU die production
- CoWoS advanced packaging is a third independent bottleneck, separate from wafer fabrication and memory supply
- System integration (Supermicro, Dell, HPE) adds 8-16 weeks to lead times; HGX B200 servers weigh 120kg and require specialised handling
- End-to-end from procurement decision to running workloads: 3-6 months for a production cluster deployment