GPU Pricing: Spot vs Reserved vs Credit Marketplace

By CompuX Team

GPU pricing determines how AI companies access compute and how providers monetize idle capacity. As of April 2026, an H100 SXM GPU costs between $2.99 and $6.15 per hour depending on provider, commitment level, and availability. This guide compares three pricing models -- spot instances, reserved capacity, and credit marketplace -- from both sides of the transaction: providers selling GPU time and startups buying it. Credit marketplaces offer a hybrid approach that combines the cost efficiency of spot pricing with the reliability of reserved capacity, typically delivering rates around 85% of spot without long-term lock-in.

Key Takeaways:

  • Spot pricing offers the lowest hourly rates but introduces capacity risk -- instances can be reclaimed mid-job with as little as 30 seconds notice, and prices swing +/-30% week over week.
  • Reserved capacity guarantees availability at a 20-40% premium over spot, but requires 1-3 year commitments that lock startups into hardware that may depreciate as newer GPUs enter the market.
  • Credit marketplaces deliver predictable pricing at roughly 85% of spot rates by aggregating demand across buyers and supply across providers, with no multi-year contracts.
  • For providers, credit marketplaces guarantee revenue on otherwise idle GPUs without the operational burden of managing thousands of individual spot customers.
  • For startups, compute credits eliminate the false choice between cheap-but-unreliable spot and expensive-but-guaranteed reserved capacity.

What Is Spot GPU Pricing?

Spot pricing lets cloud providers sell unused GPU capacity at variable, market-driven rates. When demand is low, spot prices drop -- sometimes to 60-70% below on-demand rates. When demand spikes, prices rise sharply and providers can reclaim instances from spot customers to serve higher-priority workloads.

The mechanism is straightforward: providers have fixed GPU infrastructure. Not every GPU is occupied at every moment. Rather than leave capacity idle, they auction it to price-sensitive buyers. AWS, GCP, Azure, Lambda Labs, CoreWeave, and dozens of smaller providers all offer some form of spot or preemptible GPU access.

Spot Pricing Volatility in Practice

Current spot market data (April 2026) shows wide price variation:

  • H100 SXM 80GB: $2.99-6.15/hr across providers, with intra-week swings of +/-30%
  • A100 80GB: $1.29-2.21/hr, somewhat more stable due to higher supply
  • H200 141GB: approximately $4.19/hr, limited availability driving tighter price ranges

The volatility is not random. GPU spot prices follow predictable patterns tied to model training cycles at large labs, quarterly budget flushes at enterprises, and new GPU shipment schedules. A startup running a 72-hour training job that starts at $3.10/hr for an H100 might see that rate climb to $4.50/hr midway through -- or lose the instance entirely if a reserved customer needs the capacity.
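To make the 72-hour example concrete, here is a minimal sketch of the arithmetic for a single GPU. It assumes the price change lands exactly at the midpoint of the run, which is one reading of "midway through"; the function name `job_cost` is illustrative only.

```python
def job_cost(segments):
    """Total cost of a job given a list of (hours, rate_per_hour) segments."""
    return sum(hours * rate for hours, rate in segments)


# One H100 for 72 hours at the starting rate from the example.
flat = job_cost([(72, 3.10)])

# Assumption: the rate moves from $3.10/hr to $4.50/hr at the 36-hour mark.
repriced = job_cost([(36, 3.10), (36, 4.50)])

increase = repriced / flat - 1
print(f"flat: ${flat:.2f}, repriced: ${repriced:.2f}, increase: {increase:.1%}")
```

Under that assumption the run costs about $273.60 instead of $223.20 per GPU, roughly 23% more than budgeted, before counting any work lost to an interruption.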

Who Spot Pricing Works For

Spot instances work well for fault-tolerant, checkpoint-friendly workloads: batch inference, hyperparameter sweeps, data preprocessing, and training runs with frequent checkpointing. They do not work well for production inference serving, real-time applications, or any workload where interruption means starting over from scratch.

The fundamental limitation is economic, not technical. Startups that architect their systems for spot interruption spend engineering time on checkpointing infrastructure, instance migration logic, and monitoring -- time that could go toward their core product. The effective cost of spot includes this engineering overhead, which rarely appears in simple price-per-hour comparisons.
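The checkpointing infrastructure mentioned above can be sketched in a few dozen lines. This is a simplified illustration, not any provider's recommended pattern: it assumes the reclaim notice arrives as a SIGTERM (providers differ, and some expose the notice through a metadata endpoint instead), and the class name `CheckpointedJob`, the `work` callback, and the JSON checkpoint format are all hypothetical.

```python
import json
import os
import signal
import tempfile


class CheckpointedJob:
    """Resumable loop that persists progress so a reclaimed instance can resume."""

    def __init__(self, checkpoint_path, total_steps, work):
        self.checkpoint_path = checkpoint_path
        self.total_steps = total_steps
        self.work = work  # callable taking the step index; stands in for real training
        self.step = 0
        self.stop_requested = False

    def request_stop(self, signum=None, frame=None):
        # Signature matches what signal.signal() expects from a handler.
        self.stop_requested = True

    def load(self):
        if os.path.exists(self.checkpoint_path):
            with open(self.checkpoint_path) as f:
                self.step = json.load(f)["step"]

    def save(self):
        # Write to a temp file and rename, so an interruption mid-write
        # cannot leave a truncated checkpoint behind.
        directory = os.path.dirname(os.path.abspath(self.checkpoint_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as f:
            json.dump({"step": self.step}, f)
        os.replace(tmp_path, self.checkpoint_path)

    def run(self, checkpoint_every=100):
        self.load()
        while self.step < self.total_steps and not self.stop_requested:
            self.work(self.step)
            self.step += 1
            if self.step % checkpoint_every == 0:
                self.save()
        self.save()  # final save on completion or on a stop request
        return self.step


def install_sigterm_handler(job):
    """Assumption: the platform signals reclamation with SIGTERM."""
    signal.signal(signal.SIGTERM, job.request_stop)
```

A real training job would save model and optimizer state rather than a step counter, and the final save has to fit inside the notice window, which is part of why this work consumes engineering time.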

What Is Reserved GPU Capacity?

Reserved capacity is the opposite end of the spectrum. Providers guarantee specific GPU resources for a fixed term -- typically 1 to 3 years -- at a set price. The buyer pays whether they use the capacity or not, but they never lose access and the price never changes.

The Reserved Pricing Premium

Reserved pricing typically runs 20-40% above average spot rates but below on-demand peak pricing. For an H100 SXM, a 1-year reserved commitment might lock in at $4.50/hr compared to a spot average of $3.80/hr, a premium of roughly 18% that sits just under the low end of that range. The premium buys two things: guaranteed availability and price predictability.

For large enterprises with steady-state GPU needs, reserved capacity makes financial sense. A company running production inference 24/7 across 64 H100s knows exactly what it will spend each month. The CFO can budget it. The engineering team never worries about capacity.

Why Reserved Capacity Hurts Startups

For startups, reserved capacity creates three problems:

Capital lock-in. A 1-year reservation for 8 H100s at $4.50/hr costs approximately $315,000 (8 GPUs x $4.50 x 8,760 hours). That capital is committed regardless of whether the startup pivots, loses a customer, or discovers its model runs fine on A100s. For a seed-stage company with $2M in the bank, that is roughly 16% of runway locked into one line item.

Hardware depreciation risk. GPU generations turn over every 12-18 months. A startup that signs a 3-year H100 reservation in April 2026 may find that B200 GPUs offer 2-3x the inference throughput at comparable prices by 2028. The reservation becomes an anchor, not an asset.

Utilization mismatch. Startups have variable compute needs. Development sprints demand heavy GPU usage; product launches require inference scaling; quiet periods need almost nothing. Reserved capacity charges the same rate during all three phases. Industry data suggests that startups with reserved capacity achieve only 40-60% average utilization, meaning they effectively pay 1.7-2.5x the hourly rate for the compute they actually consume.
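The arithmetic behind the capital lock-in and utilization figures above can be checked with a short sketch. The function names are illustrative, and the model is deliberately simple: it assumes a flat hourly rate billed for every hour of the term and treats utilization as the fraction of paid hours that do useful work.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760


def reservation_cost(gpus, rate_per_hour, years=1):
    """Total committed spend for a reservation, used or not."""
    return gpus * rate_per_hour * HOURS_PER_YEAR * years


def effective_rate(contract_rate, utilization):
    """Cost per GPU-hour actually consumed at a given utilization (0-1]."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return contract_rate / utilization


cost = reservation_cost(gpus=8, rate_per_hour=4.50)
print(f"1-year, 8x H100: ${cost:,.0f} ({cost / 2_000_000:.1%} of $2M)")

for utilization in (0.40, 0.60):
    rate = effective_rate(4.50, utilization)
    print(f"{utilization:.0%} utilization: ${rate:.2f}/hr ({rate / 4.50:.2f}x)")
```

At 60% utilization the multiplier is about 1.67x and at 40% it is 2.5x, which is where the 1.7-2.5x range comes from.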

The Credit Marketplace Model

A credit marketplace sits between spot and reserved, solving the core problems of both. Providers sell compute credits representing GPU time at wholesale rates. Startups buy credits at a predictable price -- typically 85% of the prevailing spot rate -- without committing to multi-year terms or specific hardware.

The GPU marketplace aggregates supply from multiple providers, which means a startup's credits can be fulfilled by whichever provider has available capacity. This pooled supply model dramatically reduces the interruption risk that plagues individual spot instances while maintaining cost advantages through bulk purchasing power.

How Credit Pricing Works

CompuX's credit marketplace operates on a financing-backed model. When a startup needs $100,000 in compute, the credit multiplier mechanism turns that into $125,000-$150,000 in actual compute credits. The additional value comes from bulk purchasing discounts that CompuX negotiates across its provider network and passes through to buyers.

The pricing structure is simple: credits are denominated in GPU-hours at a fixed rate per GPU type. An H100 credit might cost $3.20/hr equivalent -- below the $3.80/hr spot average and far below the $4.50/hr reserved rate. The rate stays fixed for the credit's validity period (typically 90-180 days), giving startups cost predictability without multi-year lock-in.
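As a rough illustration of these numbers, the sketch below converts a financing amount into credits and then into H100-hours. How the multiplier and the per-hour credit rate compose is not spelled out above, so this assumes credits are dollar-denominated and drawn down at the fixed per-hour rate; the function names are hypothetical and not part of any CompuX API.

```python
def credits_from_financing(financing, multiplier):
    """Dollar value of credits obtained from a financing amount."""
    return financing * multiplier


def gpu_hours(credit_value, rate_per_hour):
    """GPU-hours a credit balance buys at a fixed per-hour rate."""
    return credit_value / rate_per_hour


H100_CREDIT_RATE = 3.20  # $/hr, the example rate from the text
SPOT_AVERAGE = 3.80      # $/hr, the example spot average from the text

print(f"85% of spot average: ${0.85 * SPOT_AVERAGE:.2f}/hr")

for multiplier in (1.25, 1.50):
    credits = credits_from_financing(100_000, multiplier)
    hours = gpu_hours(credits, H100_CREDIT_RATE)
    print(f"{multiplier:.2f}x: ${credits:,.0f} in credits, {hours:,.0f} H100-hours")
```

Note that 85% of the $3.80 spot average is $3.23/hr, so the $3.20 example rate is a rounded figure.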

Startups can use the CompuX pricing calculator to model their expected costs across spot, reserved, and credit marketplace scenarios based on their specific workload profile.

Why Credits Are Not Just Discounted Spot

The distinction matters. Discounted spot still carries interruption risk and price volatility. Credits are pre-purchased capacity commitments that providers honor because the revenue is already guaranteed. When a provider sells 10,000 H100-hours as credits through CompuX, that revenue is locked in. The provider allocates capacity accordingly, treating credit-backed workloads with the same priority as reserved customers.

This is why credit marketplace pricing can be lower than spot while offering higher reliability. The provider gets guaranteed revenue (solving their utilization problem), the startup gets predictable pricing and reliable capacity (solving their cost and risk problems), and the marketplace captures margin on the spread.

Provider Economics: Why Each Model Matters

GPU providers -- whether hyperscalers like AWS and GCP or independent operators like CoreWeave and Lambda -- face a fundamental business challenge: GPUs are expensive fixed assets that depreciate whether they are running workloads or sitting idle. An H100 server node costs $300,000-$400,000. Every hour it sits unused, the provider loses money.

Spot: Good for Utilization, Bad for Revenue Predictability

Spot pricing helps providers fill idle capacity, but it creates revenue volatility. A provider cannot forecast quarterly revenue when 30% of their GPU fleet is sold on a spot basis with prices swinging 30% weekly. This makes it difficult to finance new GPU purchases, hire operations staff, or plan data center expansion.

Reserved: Predictable but Limits Flexibility

Reserved contracts give providers predictable multi-year revenue -- the kind that banks will lend against for data center expansion. But reserved contracts limit pricing flexibility. If GPU costs fall (as they do with each new generation), the provider is locked into delivering capacity at rates that may not reflect current market conditions.

Credits: Guaranteed Revenue Without Long Lock-In

Credit marketplace sales give providers the revenue predictability of reserved contracts on shorter time horizons. A provider selling 50,000 GPU-hours of credits through CompuX knows that revenue is committed. They can plan capacity allocation accordingly. If market prices shift, they can adjust credit pricing for the next batch without being locked into multi-year terms.

For providers considering onboarding to the CompuX marketplace, the credit model also reduces customer acquisition cost. Instead of marketing to thousands of individual startups, the provider sells capacity in bulk to the marketplace. CompuX handles customer relationships, billing, and support.

Startup Economics: Cost Per Token Analysis

For AI startups, the relevant metric is not cost per GPU-hour but cost per useful output -- whether that is tokens generated, images rendered, or models trained. The pricing model affects this metric in ways that go beyond the hourly rate.

The True Cost of Spot

A startup running inference on spot H100s at $3.10/hr might seem to save 31% versus reserved at $4.50/hr. But factor in:

  • Interruption recovery: 5-15% of compute wasted on checkpoint restores and cold starts
  • Engineering overhead: 1-2 engineers spending 20% of their time on spot infrastructure
  • Latency spikes: P99 latency increases during instance migration, affecting user experience
  • Over-provisioning: Running 20% more instances than needed to absorb interruptions

Fully loaded, the effective cost of spot is typically 90-95% of on-demand -- a 5-10% discount, not the 60-70% discount the raw hourly rate suggests.
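One way to combine the factors listed above into a single number is sketched below. The model is an assumption rather than a formula from this guide: over-provisioning scales the hours paid for, wasted compute shrinks the hours that do useful work, and engineering time is spread across useful GPU-hours. The engineering cost input is left to the caller because no salary figure is given here.

```python
def loaded_spot_rate(raw_rate, wasted_fraction, overprovision_fraction,
                     monthly_engineering_cost=0.0,
                     useful_gpu_hours_per_month=1.0):
    """Estimated cost per useful GPU-hour on spot.

    raw_rate: advertised spot $/hr
    wasted_fraction: share of paid compute lost to restores and cold starts
    overprovision_fraction: extra instances run to absorb interruptions
    monthly_engineering_cost: caller-supplied estimate (hypothetical input)
    """
    if not 0 <= wasted_fraction < 1:
        raise ValueError("wasted_fraction must be in [0, 1)")
    paid_per_useful_hour = (
        raw_rate * (1 + overprovision_fraction) / (1 - wasted_fraction)
    )
    engineering_per_hour = monthly_engineering_cost / useful_gpu_hours_per_month
    return paid_per_useful_hour + engineering_per_hour


for wasted in (0.05, 0.15):
    rate = loaded_spot_rate(3.10, wasted, 0.20)
    print(f"{wasted:.0%} wasted, 20% over-provisioned: ${rate:.2f}/useful hr")
```

With the $3.10/hr example rate, waste and over-provisioning alone lift the cost per useful hour to roughly $3.92-$4.38 before any engineering time is added.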

The True Cost of Credits

Credit marketplace pricing at 85% of spot already accounts for the marketplace's margin. But the startup also gains:

  • Zero interruption overhead: No checkpointing infrastructure needed for marketplace-backed capacity
  • No engineering tax: Standard API integration via OpenAI-compatible SDK, no spot-specific code
  • Predictable budgeting: Fixed credit rates for 90-180 days
  • Multiplier effect: $1 of financing yields $1.25-$1.50 in credits through bulk discount pass-through

The effective cost per token on credit marketplace capacity runs 15-25% below spot when all factors are included. For a detailed breakdown of how credit pricing compares to buying directly from cloud providers, see the direct providers comparison. Startups running high-volume API workloads can also review the inference-heavy startups use case.
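Converting an hourly rate into a cost per token requires a throughput figure, which depends on the model, batch size, and serving stack and is not given in this guide. The sketch below therefore uses a purely hypothetical 1,000 tokens per second per GPU to show the shape of the calculation, not a benchmark.

```python
def cost_per_million_tokens(rate_per_hour, tokens_per_second):
    """Dollar cost to generate one million tokens on one GPU."""
    tokens_per_hour = tokens_per_second * 3600
    return rate_per_hour / tokens_per_hour * 1_000_000


HYPOTHETICAL_TOKENS_PER_SECOND = 1_000  # placeholder, not a measured value

for label, rate in (("spot average", 3.80), ("credit", 3.20)):
    cost = cost_per_million_tokens(rate, HYPOTHETICAL_TOKENS_PER_SECOND)
    print(f"{label}: ${cost:.3f} per million tokens")
```

This captures only the hourly rate. Interruption waste and over-provisioning reduce the useful tokens per paid hour, which widens the gap further than the raw rates suggest.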

Comparison Table: Spot vs Reserved vs Credit Marketplace

| Factor | Spot | Reserved | Credit Marketplace |
|---|---|---|---|
| Hourly rate (H100) | $2.99-6.15/hr | $4.00-5.50/hr | ~$3.20/hr (85% of spot avg) |
| Price predictability | Low (+/-30% weekly) | High (fixed 1-3 years) | Medium-high (fixed 90-180 days) |
| Availability guarantee | None (preemptible) | Full (contractual SLA) | High (pooled multi-provider supply) |
| Minimum commitment | None | 1-3 years | None (pay-as-you-go credits) |
| Interruption risk | High (30s-2min notice) | None | Low (provider-pooled fulfillment) |
| Capital efficiency | High (pay only when used) | Low (pay whether used or not) | High (credits consumed on use) |
| Hardware lock-in | None | Yes (specific GPU type) | None (credits portable across GPUs) |
| Best for | Fault-tolerant batch jobs | Steady-state production | Variable workloads, growing startups |
| Provider revenue type | Variable, unpredictable | Fixed, multi-year | Guaranteed, short-term |
| Effective cost (fully loaded) | 90-95% of on-demand | 100-120% of on-demand | 75-85% of on-demand |

The GPU market is entering a period of simultaneous supply expansion and demand acceleration. Both trends affect pricing model economics.

Supply Side: More GPUs, More Options

NVIDIA's B200 and GB200 NVL72 systems began shipping in volume in Q1 2026. AMD's MI325X is gaining traction in inference workloads. Intel's Gaudi 3 offers a lower-cost alternative for specific model architectures. This supply expansion is pushing older GPU generations (A100, H100) toward lower price points. H100 spot prices have declined approximately 25% from their 2024 peaks.

At the same time, new data center capacity from providers like CoreWeave, Lambda, Crusoe, and Voltage Park is adding millions of GPU-hours to the market. The total addressable supply of cloud GPU compute has roughly doubled since January 2025.

Demand Side: Growing Faster Than Supply

Despite supply growth, demand continues to outpace it. Enterprise AI adoption is accelerating, with inference workloads growing at 4-5x annually. Agentic AI systems that chain multiple model calls per user interaction are multiplying token consumption. Fine-tuning and continuous training cycles are becoming standard practice rather than one-time events.

The net effect: GPU prices are declining in absolute terms but the total market is expanding. A startup that needed 1,000 H100-hours per month in 2025 may need 5,000 hours per month in 2026 as their product scales. The pricing model they choose determines whether that 5x growth in compute is financially sustainable.
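The effect of that 5x growth on a monthly bill is easy to sketch using the example H100 rates quoted earlier in this guide. The figures below are raw hourly rates only and assume every hour is used; they ignore the overheads and utilization effects discussed above.

```python
# Example $/hr rates for an H100 taken from earlier sections of this guide.
RATES = {"spot average": 3.80, "reserved": 4.50, "credit": 3.20}


def monthly_spend(gpu_hours, rate_per_hour):
    """Raw monthly spend for a given number of GPU-hours."""
    return gpu_hours * rate_per_hour


for label, rate in RATES.items():
    before = monthly_spend(1_000, rate)
    after = monthly_spend(5_000, rate)
    print(f"{label}: ${before:,.0f}/month -> ${after:,.0f}/month")
```

At 5,000 hours per month, the spread between the reserved and credit example rates comes to $6,500 per month.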

What This Means for Pricing Models

Falling per-unit GPU prices make reserved commitments riskier -- the hardware you locked in at 2025 prices may be 30% cheaper by 2027. Spot remains volatile but trends downward overall. Credit marketplaces benefit from both trends: they can renegotiate provider rates as supply increases while passing savings through to buyers, and their shorter commitment windows (90-180 days) reduce the depreciation risk that makes long-term reservations dangerous.

Frequently Asked Questions

What happens if a spot GPU instance is interrupted during a training run?

When a spot instance is reclaimed, the workload is terminated -- typically with 30 seconds to 2 minutes of warning depending on the provider. Any work since the last checkpoint is lost. For a training run without checkpointing, this can mean losing hours or days of compute. Credit marketplace capacity avoids this risk because providers treat credit-backed workloads as committed capacity rather than interruptible spot.

How much do reserved GPU instances actually save compared to on-demand?

Reserved instances typically save 30-50% versus on-demand list prices, but the savings are misleading for startups with variable workloads. If a startup with a 1-year H100 reservation averages only 50% utilization, the effective rate per GPU-hour actually used doubles -- wiping out the discount and leaving the startup paying as much as on-demand, or up to 40% more. Reserved pricing only delivers real savings at sustained utilization above 70-80%, which most early-stage companies cannot guarantee.
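The break-even point follows directly from the discount. A minimal sketch, assuming the reservation bills every hour of the term at a flat discounted rate and on-demand bills only the hours used (function names are illustrative):

```python
def break_even_utilization(reserved_discount):
    """Utilization at which reserved cost per used hour equals on-demand."""
    return 1 - reserved_discount


def reserved_vs_on_demand(reserved_discount, utilization):
    """Reserved cost per used hour as a multiple of the on-demand rate."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return (1 - reserved_discount) / utilization


for discount in (0.30, 0.50):
    ratio = reserved_vs_on_demand(discount, 0.50)
    threshold = break_even_utilization(discount)
    print(f"{discount:.0%} discount: {ratio:.2f}x on-demand at 50% utilization, "
          f"break-even at {threshold:.0%}")
```

A 30% discount breaks even at 70% utilization and a 50% discount at 50%, so a shallow discount combined with half-idle hardware costs 1.4x the on-demand rate.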

Can compute credits be used across different GPU types and providers?

Yes. In a credit marketplace like CompuX, compute credits are denominated in standardized GPU-hours and can be redeemed across multiple GPU types (H100, A100, H200) and multiple providers in the network. This portability means a startup can start development on A100s, run benchmarks on H100s, and deploy inference on H200s -- all from the same credit balance through a single API.
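A single balance drawn down across GPU types can be modeled as a small ledger. This is a conceptual sketch only: the class and method names are hypothetical and do not describe the CompuX API, the balance is assumed to be dollar-denominated, and apart from the $3.20/hr H100 example used earlier in this guide the rates are placeholders.

```python
class CreditLedger:
    """One credit balance redeemable against several GPU types."""

    def __init__(self, balance, rates_per_hour):
        self.balance = balance
        self.rates_per_hour = dict(rates_per_hour)

    def redeem(self, gpu_type, hours):
        """Deduct the cost of `hours` on `gpu_type` and return that cost."""
        if gpu_type not in self.rates_per_hour:
            raise KeyError(f"no credit rate for {gpu_type}")
        cost = hours * self.rates_per_hour[gpu_type]
        if cost > self.balance:
            raise ValueError("insufficient credit balance")
        self.balance -= cost
        return cost


# The H100 rate is the example from this guide; the A100 and H200 rates
# are placeholders chosen only to make the sketch run.
ledger = CreditLedger(
    balance=10_000,
    rates_per_hour={"A100": 1.50, "H100": 3.20, "H200": 4.00},
)
ledger.redeem("A100", 200)  # development
ledger.redeem("H100", 100)  # benchmarks
ledger.redeem("H200", 50)   # inference
print(f"remaining balance: ${ledger.balance:,.2f}")
```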

Why would a GPU provider sell capacity through a credit marketplace instead of directly?

Providers benefit from guaranteed revenue and reduced customer acquisition costs. Selling 50,000 GPU-hours in bulk to a marketplace is operationally simpler than managing hundreds of individual customer relationships. The marketplace handles billing, support, and demand aggregation. For smaller providers especially, marketplace distribution can increase utilization from 60% to 85%+ without hiring a sales team. See the provider onboarding guide for details on joining the CompuX network.

Are GPU prices expected to keep falling in 2026-2027?

Per-unit GPU prices are declining as new hardware (B200, MI325X) enters the market and older generations (A100, H100) become more abundant. H100 spot prices have dropped roughly 25% from their 2024 peaks. However, total compute spending is rising because demand is growing faster than supply. Startups should plan for lower per-hour costs but higher total compute budgets as their workloads scale. Credit marketplaces help manage this tension by offering bulk discounts and financing that convert lower per-unit prices into even greater savings through the credit multiplier mechanism.