Networking

Network Bandwidth

Definition

Network bandwidth is the maximum data transfer rate of a network connection, measured in gigabits per second (Gb/s) or terabits per second (Tb/s). In GPU infrastructure, bandwidth is often the binding constraint on distributed training performance. A single NVIDIA ConnectX-7 adapter provides 400 Gb/s (NDR InfiniBand), and each GPU node typically has 8 such adapters (one per GPU), providing 3.2 Tb/s of aggregate node bandwidth. The gap between theoretical and achieved bandwidth determines the practical scaling limits of a cluster.
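A back-of-envelope sketch using the figures above shows how these numbers combine; the 70B-parameter model is an illustrative assumption, not a figure from this entry:

```python
# Aggregate node bandwidth: 8 ConnectX-7 adapters at 400 Gb/s each.
adapters_per_node = 8
link_gbps = 400  # NDR InfiniBand, gigabits per second

node_gbps = adapters_per_node * link_gbps
print(node_gbps)  # 3200 Gb/s, i.e. 3.2 Tb/s

# Illustrative: time to move the gradients of a hypothetical
# 70B-parameter model in fp16 (2 bytes/param) over a single
# 400 Gb/s link at line rate.
params = 70e9
payload_bits = params * 2 * 8
seconds = payload_bits / (link_gbps * 1e9)
print(round(seconds, 2))  # 2.8 seconds per full gradient transfer at peak
```

Real exchanges overlap communication with computation and spread traffic across all eight links, so this is a ceiling calculation, not a prediction of step time.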

Technical Context

Effective bandwidth depends on message size, protocol overhead, network congestion, and software stack efficiency. Small messages (< 1 MB) achieve a fraction of peak bandwidth because per-message latency and protocol overhead dominate transfer time; large messages (> 100 MB) can approach line rate. Training workloads generate gradient messages whose size depends on model architecture: transformer models with billions of parameters produce large gradient messages that achieve high bandwidth utilisation. Achieved bandwidth is typically reported as a percentage of theoretical maximum, with 85-95% considered excellent.
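The achieved-vs-theoretical comparison is usually made via bus bandwidth derived from an all-reduce timing. A minimal sketch, using the ring all-reduce convention from NVIDIA's nccl-tests (busBW = algBW × 2(n−1)/n); the 1 GiB / 50 ms measurement is a hypothetical example:

```python
def allreduce_bus_bw(size_bytes: int, time_s: float, n_ranks: int) -> float:
    """Bus bandwidth (bytes/s) for a ring all-reduce, per the
    nccl-tests convention: busBW = algBW * 2*(n-1)/n."""
    alg_bw = size_bytes / time_s  # bytes/s of payload per rank
    return alg_bw * 2 * (n_ranks - 1) / n_ranks

# Hypothetical measurement: 1 GiB all-reduce across 8 ranks in 50 ms.
bus_bw = allreduce_bus_bw(1 << 30, 0.050, 8)

# Compare against one 400 Gb/s link = 50 GB/s of theoretical peak.
link_peak = 400e9 / 8
print(f"{100 * bus_bw / link_peak:.0f}% of peak")  # 75% of peak
```

Here the hypothetical cluster achieves about 75% of line rate, below the 85-95% range described above, which would flag it for closer inspection.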

Advisory Relevance

Bandwidth specifications directly affect cluster pricing and performance. We benchmark operator claims about network performance against measured all-reduce throughput data to validate infrastructure quality.

This glossary is maintained by Disintermediate as a reference for GPU infrastructure professionals, investors, and operators. Each entry reflects terminology as used in active advisory engagements and market intelligence work.
