Operations

Capacity Planning

Definition

Capacity planning is the discipline of forecasting GPU compute demand and aligning infrastructure procurement, deployment, and expansion accordingly. In GPU infrastructure, capacity planning is complicated by long hardware lead times (3-12 months for GPU servers), rapid technology transitions (18-24 month generational cycles), and volatile demand patterns. Over-provisioning ties up capital in depreciating assets; under-provisioning loses revenue and customers. Effective capacity planning requires real-time demand visibility, procurement pipeline management, and scenario modelling across multiple GPU generations.
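The over- versus under-provisioning trade-off can be illustrated with a minimal scenario sketch. The function name, the demand scenarios, and the per-GPU cost and revenue figures below are hypothetical placeholders chosen for illustration, not market data.

```python
def provisioning_outcome(installed_gpus, demanded_gpus,
                         monthly_cost_per_gpu, monthly_revenue_per_gpu):
    """Return the monthly cost of idle capacity and the revenue lost to unmet demand."""
    served = min(installed_gpus, demanded_gpus)
    idle = installed_gpus - served      # over-provisioned GPUs
    unmet = demanded_gpus - served      # under-provisioned GPUs
    return {
        "idle_cost": idle * monthly_cost_per_gpu,
        "lost_revenue": unmet * monthly_revenue_per_gpu,
    }


# Hypothetical demand scenarios (GPUs requested) against a fixed 1,000-GPU build.
scenarios = {"low": 600, "base": 1000, "high": 1400}
for name, demand in scenarios.items():
    outcome = provisioning_outcome(
        installed_gpus=1000,
        demanded_gpus=demand,
        monthly_cost_per_gpu=1000.0,     # assumed all-in cost, illustrative only
        monthly_revenue_per_gpu=1500.0,  # assumed revenue, illustrative only
    )
    print(name, outcome)
```

Running the same comparison for several candidate build sizes gives a rough view of which deployment size is least exposed across the scenarios.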

Technical Context

Key variables in GPU capacity planning include GPU procurement lead times, data centre power availability, cooling infrastructure readiness, network equipment availability (InfiniBand switches are often the binding constraint), and customer commitment pipelines. The planning horizon must also account for GPU generational transitions: operators must decide when to invest in current-generation hardware versus waiting for the next generation. Invest too early and assets depreciate faster; invest too late and competitors capture demand.
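One way to reason about the supply-side variables above is to treat deployable capacity as the minimum across all constraints and to report which one binds. The sketch below assumes a simplified model: the class, the field names, and the per-GPU power and port ratios are hypothetical, and demand-side inputs such as customer commitments are left out.

```python
from dataclasses import dataclass


@dataclass
class SiteConstraints:
    gpus_delivered: int            # GPUs procurement can land in the period
    power_kw_available: float      # usable data centre power
    cooling_kw_available: float    # heat rejection capacity
    network_ports_available: int   # e.g. InfiniBand switch ports on hand
    kw_per_gpu: float = 1.25       # assumed all-in draw per GPU (hypothetical)
    ports_per_gpu: float = 1.0     # assumed fabric ports per GPU (hypothetical)


def deployable_gpus(site):
    """Return (GPU count, name of the binding constraint) for a site."""
    limits = {
        "gpu_supply": site.gpus_delivered,
        "power": int(site.power_kw_available // site.kw_per_gpu),
        "cooling": int(site.cooling_kw_available // site.kw_per_gpu),
        "network": int(site.network_ports_available // site.ports_per_gpu),
    }
    binding = min(limits, key=limits.get)  # the smallest limit caps the deployment
    return limits[binding], binding


site = SiteConstraints(
    gpus_delivered=1024,
    power_kw_available=1500.0,
    cooling_kw_available=1400.0,
    network_ports_available=896,
)
print(deployable_gpus(site))
```

In this illustrative site, switch ports cap the deployment below what GPU supply, power, and cooling would each allow, which mirrors the point that network equipment is often the binding constraint.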

Advisory Relevance

Capacity planning assumptions are a critical evaluation point in due diligence. Management teams that assume 100% utilisation from day one or instantaneous capacity expansion are presenting unrealistic projections. We benchmark capacity ramp assumptions against observed operator performance: new deployments typically reach 70-90% utilisation over a 12-18 month ramp.
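The gap between a day-one 100% utilisation assumption and a ramped benchmark can be estimated with a small sketch. It assumes a linear ramp from zero to the target utilisation, which is a simplification chosen for illustration; real ramp curves vary by operator, and the function names are hypothetical.

```python
def linear_ramp(target_utilisation, ramp_months, horizon_months):
    """Monthly utilisation rising linearly to the target, then holding flat."""
    return [
        target_utilisation * min(month / ramp_months, 1.0)
        for month in range(1, horizon_months + 1)
    ]


def average_utilisation(curve):
    return sum(curve) / len(curve)


def projection_overstatement(assumed_utilisation, target_utilisation,
                             ramp_months, horizon_months):
    """Ratio of revenue implied by a flat assumed utilisation to the ramped benchmark."""
    benchmark = average_utilisation(
        linear_ramp(target_utilisation, ramp_months, horizon_months)
    )
    return assumed_utilisation / benchmark


# A plan assuming 100% utilisation, compared with an 80% target reached over 12 months.
ratio = projection_overstatement(
    assumed_utilisation=1.0,
    target_utilisation=0.8,
    ramp_months=12,
    horizon_months=12,
)
print(f"First-year revenue overstated by a factor of {ratio:.2f}")
```

Under these assumptions, the day-one projection overstates first-year utilisation-driven revenue by more than a factor of two, because average utilisation over the ramp sits well below the eventual target.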

This glossary is maintained by Disintermediate as a reference for GPU infrastructure professionals, investors, and operators. Each entry reflects terminology as used in active advisory engagements and market intelligence work.
