Research

Storage as a Revenue Driver in GPU Infrastructure

Egress fees and attached storage often outpace compute margins on inference fleets

[01]

Storage Economics & Margin Structure

GPU cloud revenue comes from compute (per-GPU-hour), storage (per-GB-month persistent, per-GB ingress/egress), and networking (per-GB transfer). Compute margin is 30-40% (compressed by power cost); storage margin is 60-85% because marginal cost is minimal and pricing power is high. Storage pricing: ingress free or $0.01-$0.03/GB (minimal cost, often waived for lock-in); persistent storage $0.10-$0.30/GB-month (cost $0.02-$0.08, margin 60-80%); egress $0.10-$0.50/GB (cost $0.02-$0.08, margin 80-85%).
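As a sketch, the margin structure above can be checked directly; the prices and costs below are illustrative midpoints of the quoted ranges, not vendor data:

```python
# Gross margin per storage revenue line, using midpoints of the quoted
# ranges (persistent $0.10-0.30/GB-month vs cost $0.02-0.08; egress
# $0.10-0.50/GB vs cost $0.02-0.08). Illustrative, not vendor pricing.
def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of price."""
    return (price - cost) / price

lines = {
    "persistent ($/GB-month)": (0.20, 0.05),
    "egress ($/GB)": (0.30, 0.05),
}
for name, (price, cost) in lines.items():
    print(f"{name}: {gross_margin(price, cost):.0%} gross margin")
```

The midpoints land at 75% (persistent) and 83% (egress), inside the quoted 60-85% band.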

AWS S3 egress is $0.09/GB and Azure Blob $0.087/GB; CoreWeave and Lambda custom-quote $0.10-$0.30/GB based on volume. Inference dominates storage economics: serving Llama 70B at 1M requests/day (2,048 input + 512 output tokens, ~1.1MB egress per request) requires 33TB/month of egress = $3.3-6.6K/month at $0.10-$0.20/GB.
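The serving arithmetic above can be reproduced directly (a sketch assuming decimal units, 1 GB = 1,000 MB, and the per-request payload quoted above):

```python
# Monthly egress volume and cost for the Llama 70B serving example:
# 1M requests/day at ~1.1 MB egress per request, $0.10-$0.20/GB.
requests_per_day = 1_000_000
mb_per_request = 1.1
days_per_month = 30

egress_gb = requests_per_day * mb_per_request * days_per_month / 1000
cost_low, cost_high = egress_gb * 0.10, egress_gb * 0.20
print(f"{egress_gb / 1000:.0f} TB/month -> ${cost_low:,.0f}-${cost_high:,.0f}/month")
```

This reproduces the ~33 TB/month and $3.3-6.6K/month figures in the text.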

GPU compute for this workload costs ~$56K/month at $4.00/GPU-hour (sized to hold ~50ms latency), so storage is 6-12% of total API cost and scales linearly with output tokens. Data gravity locks customers in. A customer with a 100TB training dataset in CoreWeave faces $10-20K in egress fees to migrate (100TB × $0.10-$0.20/GB). Competitors must offer >$50K in compute discounts to offset, creating a switching-cost moat.
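The switching-cost math is a one-liner (a sketch; decimal TB, 1 TB = 1,000 GB, assumed):

```python
# Egress bill to migrate a dataset out, at the quoted $/GB egress rates.
def migration_cost(dataset_tb: float, egress_per_gb: float) -> float:
    """Dollars to move dataset_tb out of the cloud at egress_per_gb."""
    return dataset_tb * 1_000 * egress_per_gb

# 100 TB dataset at $0.10-$0.20/GB:
print(f"${migration_cost(100, 0.10):,.0f}-${migration_cost(100, 0.20):,.0f}")
```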

[02]

Storage Tiers & Inference Workload Architecture

Tiered architecture optimises cost and access latency: hot tier (NVMe, $0.50-$1.00/GB-month, 1-10TB, local to GPUs), warm tier (network SSD, $0.20-$0.40/GB-month, 10-100TB, cached datasets), cold tier (HDD object store, $0.05-$0.15/GB-month, 100TB-1PB, archives and unloaded data). Inference keeps model weights in hot (Llama 70B = 140GB), context in warm (staged from object), outputs in cold before egress.

NVMe cost is $0.05-$0.15/GB-month; pricing $0.50-$1.00 yields 70-95% margin. Network SSD: cost $0.08-$0.20, pricing $0.20-$0.40 yields 50-75%.

Object HDD: cost $0.02-$0.08, pricing $0.05-$0.15 yields 60-75%. Every storage tier beats compute margin (50-95% vs. 30-40%). Operators offer low persistent storage pricing ($0.15-$0.20/GB-month, below AWS, to lock in via data gravity) whilst maintaining high egress margins ($0.20-$0.30/GB, above AWS list but below custom quotes).
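The tier comparison can be sketched with one illustrative price/cost point per tier drawn from the ranges above (assumptions, not vendor quotes):

```python
# Gross margin per storage tier; every tier clears the 30-40% compute margin.
tiers = [
    # (tier, price $/GB-month, marginal cost $/GB-month) -- illustrative
    ("hot NVMe", 0.75, 0.10),
    ("warm network SSD", 0.30, 0.14),
    ("cold object HDD", 0.10, 0.04),
]
for tier, price, cost in tiers:
    print(f"{tier}: {(price - cost) / price:.0%}")
```

These points land at roughly 87%, 53%, and 60% respectively, within the quoted per-tier ranges.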

[03]

Egress & Data Gravity as Lock-in Mechanism

Egress fees are the primary lock-in mechanism. A customer storing a 10TB training dataset + 5TB inference cache in CoreWeave faces $1.5-3K in egress to migrate (15TB × $0.10-$0.20/GB). At $100K annual compute spend, egress is 1.5-3% of cost: material enough to deter migration.

At $500K-$1M spend, $1.5-3K is negligible, but 10-20 datasets aggregate to a $15-60K switching cost. Operational lock-in is more potent: local inference cache storage costs $0.20/GB-month ($20/month for 100GB), whereas repeatedly retrieving from an external bucket costs $0.10-$0.30/GB per retrieval.

For 1M monthly API requests each requiring 100MB of context retrieval, that is 100TB/month of egress = $10-30K/month. Local caching is 500-1,500x cheaper. Operators execute a deliberate strategy: low compute pricing to acquire customers, then lock-in via high-margin storage (60-80%) and egress. Lambda uses this to own developers; CoreWeave bundles storage and egress into enterprise contracts.
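A sketch of that cache-versus-retrieval comparison (assumed inputs mirroring the example: 1M monthly requests, 100 MB of context each, a 100 GB deduplicated local cache, low-end $0.10/GB egress):

```python
# Monthly cost of re-fetching context from an external bucket vs keeping
# a local cache priced at $0.20/GB-month.
requests_per_month = 1_000_000
context_gb_per_request = 0.1          # 100 MB
cache_gb = 100

external = requests_per_month * context_gb_per_request * 0.10  # $/month
local = cache_gb * 0.20                                        # $/month
print(f"external ${external:,.0f}/month vs local ${local:,.0f}/month "
      f"({external / local:,.0f}x cheaper locally)")
```

At the $0.30/GB high end of egress pricing, the same volumes push the ratio to 1,500x.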

[04]

Storage Revenue Model & Margin Analysis

Storage revenue is 15-30% of total but captures 60-80% gross margin. Example: a 10MW cluster with $35M compute revenue at 80% utilisation and $5.00/hour. Persistent storage from 200 customers averaging 10TB each = 2PB = $2.4-4.8M/year at $0.10-$0.20/GB-month, yielding $1.8-3.6M EBITDA at 75% margin. Egress from 200 customers averaging 5TB/month = 12PB/year = $1.2-3.6M at $0.10-$0.30/GB, yielding $0.96-2.88M EBITDA at 80% margin.

Intra-cluster transfer (APIs, collective comms) is an uncharged operational cost of ~$100-300K annually. Total storage EBITDA: $2.8-6.5M on $35M compute revenue. Storage EBITDA equates to 8-19% of compute revenue, driving overall EBITDA margin from 40-45% (compute alone) to 48-54%.
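A worked sketch of the cluster example above (decimal units; the 10 TB persistent and 5 TB/month egress per customer are assumptions chosen to reproduce the low end of the quoted revenue at low-end prices):

```python
# Annual storage revenue and EBITDA for the 10 MW cluster example.
customers = 200
persist_tb_each = 10        # persistent storage per customer (assumption)
egress_tb_each = 5          # monthly egress per customer (assumption)

persist_rev = customers * persist_tb_each * 1_000 * 0.10 * 12   # $/yr
egress_rev = customers * egress_tb_each * 1_000 * 12 * 0.10     # $/yr
ebitda = persist_rev * 0.75 + egress_rev * 0.80

print(f"persistent ${persist_rev / 1e6:.1f}M + egress ${egress_rev / 1e6:.1f}M "
      f"-> storage EBITDA ${ebitda / 1e6:.2f}M")
```

The high-end prices ($0.20/GB-month persistent, $0.30/GB egress) push the same volumes to ~$6.5M of storage EBITDA, matching the top of the quoted range.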

Margin sensitivity is high. Growing to 300 customers at double the average footprint triples storage revenue with minimal marginal cost, and because incremental cost is near zero, a 10% storage revenue increase flows almost entirely through to EBITDA. This justifies capex in NVMe arrays, object stores, and replication, even if short-term EBITDA dips.
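The sensitivity claim follows directly from near-zero marginal cost (a sketch with illustrative base figures):

```python
# Incremental storage revenue arrives at ~100% incremental margin, so
# EBITDA grows faster than revenue. Base figures are illustrative.
base_revenue = 3_600_000                      # $/yr storage revenue
base_ebitda = base_revenue * 0.75             # 75% base margin

extra_revenue = base_revenue * 0.10           # +10% revenue
new_ebitda = base_ebitda + extra_revenue      # ~zero extra cost
print(f"EBITDA growth: {new_ebitda / base_ebitda - 1:.1%}")
```

Under these assumptions a 10% revenue increase lifts storage EBITDA by ~13%, rather than 10%.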

CoreWeave emphasises deduplicated object storage and aggressive egress pricing. Lambda bundles storage in reserved discounts. Together AI treats storage as COGS to incentivise high-volume inference.

[05]

Storage Architecture for Training vs. Inference

Training and inference have distinct storage economics. Training requires high-throughput access to large datasets (ImageNet ~150GB, LLM corpora 1-10TB), with repeated batch shuffling and frequent checkpointing. It favours local NVMe or networked SSD for latency and bandwidth; egress is infrequent (final models and checkpoints only).

Training storage revenue is low: most data is ephemeral, persistent storage is modest checkpointing space, and there is no egress revenue. Example: a 100-GPU training cluster holds 50GB of temporary local NVMe per GPU + 500GB of checkpoint space = minimal persistent revenue. Inference requires low-latency weight access but a small working set (weights + context + KV cache), request-driven access patterns, and heavy egress.

Object storage (S3-compatible) is optimal for staging, caching, and outputs; egress is the primary revenue stream (80-85% margin). Example: a 100-GPU inference cluster with 1-2TB of persistent model cache per GPU (100-200TB total, $100-400K/year at $0.10-$0.20/GB-month) + 400-800TB monthly egress ($0.5-3M/year at $0.10-$0.30/GB) = $600K-3.4M annual storage revenue. Workload segmentation optimises margin.

Training clusters minimise storage capex and pricing (cheap bulk datasets). Inference clusters maximise storage pricing and egress (data-gravity lock-in). Mixed clusters compromise both: training subsidises cheap storage, inference forfeits the egress markup. Industry trend: CoreWeave operates training-optimised (high-bandwidth networking, bulk storage discounts) and inference-optimised (high-margin storage, egress pricing) clusters; AWS offers EC2 for training (no storage upcharge) and SageMaker for inference (bundled storage, egress fees).
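The training-versus-inference contrast can be made concrete at low-end prices (a sketch; the checkpoint, cache, and egress volumes are assumptions, not quoted figures):

```python
# Annual storage revenue: 100-GPU training cluster vs inference cluster.
# Training: ~500 GB of persistent checkpoint space, negligible egress.
training_rev = 500 * 0.20 * 12                 # $/yr at $0.20/GB-month

# Inference: 100 TB persistent model cache + 400 TB/month of egress.
cache_rev = 100_000 * 0.10 * 12                # $/yr at $0.10/GB-month
egress_rev = 400_000 * 12 * 0.10               # $/yr at $0.10/GB
print(f"training ${training_rev:,.0f}/yr vs inference "
      f"${cache_rev + egress_rev:,.0f}/yr")
```

A gap of several hundredfold per cluster at identical GPU counts, which is why operators segment workloads rather than mix them.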

Key Takeaways
01

Storage margin: 60-80% vs. compute 30-40%; storage revenue can represent 15-30% of total revenue while contributing 60-80% gross margin

02

Data gravity lock-in: egress fees ($0.10-$0.30/GB) create 1.5-20% switching costs for large multi-dataset customers; local caching is 500-1,500x cheaper than repeated external data retrieval

03

Egress pricing strategy: offer competitive persistent storage ($0.15-$0.20/GB-month) to acquire customers, then capture 80-85% margins on egress ($0.20-$0.30/GB) after lock-in

04

Workload segmentation: training clusters minimise storage margin (focus on cheap bulk datasets), inference clusters maximise egress revenue (80-85% margin); mixed clusters compromise margin on both

Next Steps

This analysis is produced by Disintermediate, drawing on data from the GPU intelligence platform tracking 2,800+ companies across 72 categories, real-time GPU pricing from 70+ providers, and advisory engagement experience across the GPU infrastructure value chain.